Site Reliability Engineer

Richmond Hill, ON, Canada

Job Description

Some of what you will do:



The Site Reliability Engineer (SRE) is responsible for ensuring the reliability, availability, and operational excellence of Staples Canada's digital platforms. This role supports production systems, develops automation for operations, enhances observability, and partners with engineering teams to improve performance and stability. The SRE also serves as a resource to all Engineering teams on deployment and reliability best practices. The ability to work with both legacy and modern processes, and with both on-premises and cloud technologies, is key to this role.


Specifically, You Will:



  • Improve system reliability through monitoring and alerting, including during high-traffic seasons.
  • Collaborate with DevOps and engineering teams to optimize CI/CD pipelines.
  • Support incident response, root-cause analysis, and post-mortem processes.
  • Maintain high-availability Kubernetes and cloud-based infrastructure.
  • Enhance logging, metrics, and distributed tracing for production systems.
  • Support capacity planning, scaling, and performance tuning.
  • Build automation to reduce manual operational tasks.
  • Ensure operational readiness for new services and features.

Some of what you need:



  • Post-secondary education in Computer Science or equivalent working experience in software development
  • 2 years' experience in IT development and operations is required, with retail sector experience preferred; equivalent experience in other sectors within a software development context will also be considered
  • Microsoft Azure certification in either Solution Architecture or DevOps
  • Experience with build and infrastructure tools (Docker, Kubernetes, Microsoft Azure, Terraform, Ansible)
  • Experience with web server administration (IIS, NGINX, Tomcat)
  • Experience with Azure DevOps, GitHub Actions, or Jenkins build automation systems
  • Familiarity with software and infrastructure monitoring tools
  • 5+ years of site reliability engineering related experience
  • 2+ years with APM tools such as Datadog, New Relic, Grafana, etc.
  • 2+ years of Kubernetes administration experience with both Linux and Windows containers
  • 2+ years of working with a public cloud (Microsoft Azure preferred, but AWS or GCP are acceptable)
  • 3-5 years of software development life cycle experience
  • Effective process and procedure documentation management
  • Business communication and analysis
  • Effective written and verbal communication skills
  • English/French bilingualism desirable, but not mandatory

Some of what you will get:



  • Associate discount
  • Health and Dental benefits
  • RRSP/DPSP
  • Performance bonuses
  • Learning & Development programs
  • And more...

#Bringyourpassion


#LI-Hybrid



Staples Canada is an equal opportunity employer committed to diversity and inclusion and we encourage applications from all qualified candidates, including those with disabilities.



Job Detail

  • Job Id
    JD3322076
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Richmond Hill, ON, Canada
  • Education
    Not mentioned