Systems Site Reliability Engineer

Toronto, ON, Canada

Job Description


Job Summary



RBC Digital is a new kind of business, one that marries the strength of one of the world's most trusted and successful financial institutions with a mission to reimagine the role we play in people's lives: to move RBC 'beyond banking'. We're building a world-class organization focused on designing exceptional experiences, exploring new business models and creating exponential value. In 2021, we were named as one of the 100 Best Workplaces for Innovators and we're looking for team members with the curiosity to explore, the capabilities to build and capacity to help us deliver our mission.

What is the opportunity?

We are looking to expand our Direct Investing Support team at RBC. If you are looking for an exciting, high-growth opportunity with a leading financial institution that is accelerating cloud-native development, this could be the job for you. Are you looking for a chance to make a difference? Are you someone who embraces change?

We are looking for a Site Reliability Engineer who exemplifies the attributes of a leader, mentor and decision maker. We are currently building out our SRE team, with the goal of providing expertise and tooling to the Direct Investing application teams to manage the health, security, and availability of our application in production. We work with other teams to provide guidance throughout the lifecycle of building, deploying, and operating the application.

What will you do?

Run the production environment by monitoring availability and taking a holistic view of system health

Build tools to manage platform infrastructure and applications

Debug production issues across services and levels of the stack and provide primary operational support and engineering for multiple large distributed software applications

Help drive the adoption and creation of tools for application health monitoring and alerting.

Improve reliability, quality, and time-to-market of our suite of software solutions

Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of application team needs, and innovating to continually improve

Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.

Partner with development teams to improve services through rigorous testing and release procedures.

Participate in system design consulting, platform management, and capacity planning.

Create sustainable systems and services through automation and uplifts.

Balance feature development speed and reliability with well-defined service level objectives.

Technical Leadership

Provide SRE thought leadership at the squad level

Perform code and non-functional (performance, security, maintainability) reviews of all production-bound SRE solutions

Help drive transformation by continuously looking for ways to automate existing processes

Run engineering mindset meetups accelerating breadth and depth of knowledge in the community

Manage application assets and users (virtual machines, cloud instances, source code repositories, etc.)

Publish technical design for SRE solutions

Publish and/or review implementation plans for SRE solutions bound to production

Explore new capabilities and technologies to drive innovation (including coding and publishing how-to documentation)

Production Support + Development

Perform production support role, including off-hours support

Assist in incident management and problem management for applications in scope

Evaluate continuously: what went well, what went wrong, and what can be done to improve and to prevent recurrence in future

Maintain technology currency (perform server patching, certificate renewal, etc.) with a keen eye on automation opportunities

Ensure availability and uptime of applications in scope, as per service level objectives

Ensure compliance across all systems and applications in scope, including maintaining segregation of duties

What Do You Need To Succeed?
Must have:

4-6 years of overall support experience with OpenShift, Azure, and Kubernetes

2-3 years of experience as an SRE supporting multiple applications

Very strong programming skills in .NET/JavaScript/Python

SQL database operational experience in the cloud or on-premises, and experience writing and understanding database queries (SQL and/or NoSQL)

Object Oriented design and development

Exposure to UCD, PCF (Pivotal Cloud Foundry), and GitHub is desirable

Good overall understanding of networking-related areas such as certificates and load balancers

Monitoring using Splunk, Dynatrace, RUM, Grafana & other related tools

Experience with the operational aspects of software systems such as monitoring, centralized logging, and alerting.

Experience with microservices, public cloud (Azure preferred), and container technologies

Working knowledge of Mainframes & JCL is nice to have

Nice-to-have:

Experience with Airflow, Jupyter Notebooks, and other MLOps technologies

Knowledge of public cloud (Microsoft Azure and AWS) and private cloud (OpenShift) platforms and development of applications in multi-cloud, hybrid environments

Knowledge of containers and orchestration (e.g., Docker, Kubernetes)

What's in it for you?
We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.

A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable

Leaders who support your development through coaching and managing opportunities

Ability to make a difference and lasting impact

Work in a dynamic, collaborative, progressive, and high-performing team

A world-class training program in financial services

Flexible work/life balance options

Opportunities to do challenging work

Job Skills: Agile Methodology, IT Systems Integration, Organizational Leadership, Problem Solving, Software Development Life Cycle (SDLC), Software Engineering, System Applications, System Integration Testing (SIT), Systems Software

Additional Job Details

Address: RBC WATERPARK PLACE, 88 QUEENS QUAY W, TORONTO

City: TORONTO

Country: Canada

Work hours/week: 37.5

Employment Type: Full time

Platform: Technology and Operations

Job Type: Regular

Pay Type: Salaried

Posted Date: 2023-12-22

Application Deadline: 2023-12-29

Inclusion and Equal Opportunity Employment

At RBC, we embrace diversity and inclusion for innovation and growth. We are committed to building inclusive teams and an equitable workplace for our employees to bring their true selves to work. We are taking actions to tackle issues of inequity and systemic bias to support our diverse talent, clients and communities.

We also strive to provide an accessible candidate experience for our prospective employees with different abilities. Please let us know if you need any accommodations during the recruitment process.

Join our Talent Community

Stay in the know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips and recruitment events that matter to you.

Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities at .

Royal Bank of Canada


Job Detail

  • Job Id
    JD2269365
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Toronto, ON, Canada
  • Education
    Not mentioned