This candidate will be a key player in implementing SRE practices focused on observability, event correlation, AIOps, chaos engineering, and automation. The candidate will work at the intersection of development and operations, ensuring high availability, scalability, and performance of the systems in scope.
Required Qualifications:
Candidate must be located within commuting distance of Calgary or be willing to relocate to the area. This position may require travel.
Bachelor's degree or foreign equivalent required from an accredited institution. Will also consider three years of progressive experience in the specialty in lieu of every year of education.
At least 7 years of Information Technology experience.
Candidates authorized to work for any employer in Canada without employer-based visa sponsorship are welcome to apply. Infosys is unable to provide immigration sponsorship for this role at this time.
Skills required:
At least 6 years of site reliability engineering (SRE) experience in large programs, with a focus on architecting and implementing observability and automation across the entire operations lifecycle.
Observability & Monitoring: Implement logging, monitoring, and alerting using any one of Dynatrace, Datadog, Splunk, Nagios, Prometheus, Grafana, ELK stack, or New Relic.
Analyze monitoring data/golden signals to identify trends and patterns and proactively address potential problems.
Debug and optimize code, and automate routine operational tasks.
Improve automation and increase the system's self-healing capability
Incident Management: Participate in production incidents, perform root cause analysis (RCA), and drive post-mortem improvements.
Develop and maintain dashboards and reports to visualize system health and performance.
Use technologies such as Ansible, Python, Terraform, PowerShell/Shell, and JSON to create automation that reduces operational toil.
Develop automation solutions for repeated incidents and service tasks (provisioning, scaling, backup, performance management, security, capacity management, etc.) in infrastructure operations, or develop automation/optimization solutions for repeated tickets and signals in application operations.
All applicants authorized to work in the United States are encouraged to apply
Preferred Qualifications:
Working Knowledge of:
Troubleshooting database failures and providing speedy solutions.
SLI, SLO, error budgets.
Event correlation, AIOps with deep understanding of ITSM tools
Microservices architecture with APIs and REST APIs
CI/CD tooling and best practices
Cloud platforms such as AWS, Azure, and Google Cloud
Container orchestration and practices, including Kubernetes and Docker Swarm
Infrastructure automation tools like Terraform, CloudFormation, Ansible, and Puppet (any one)
Scripting languages (any of the following): Python, JSON, Java, Node.js, PHP, PowerShell(M), or Bash/Shell/Perl
ITSM tools such as ServiceNow
Excellent communication and client interaction skills, along with exceptional written, verbal, and technical documentation skills
Extraordinary planning, project management, coordination, and analytical skills
Hands-on experience working in a Global Delivery Model with onsite/offshore resources
Exceptional organizational skills
Ability to manage and prioritize tasks efficiently
Readiness to demonstrate a proactive attitude
Solid attention to detail and excellent written and verbal communication skills are required
Ability to work in a team in a diverse, multi-stakeholder environment
The job entails sitting as well as working at a computer for extended periods of time. The candidate should be able to communicate by telephone, email, or face to face. Travel may be required as per the job requirements.
About Us
Infosys is a global leader in next-generation digital services and consulting. We enable clients in more than 50 countries to navigate their digital transformation. With over four decades of experience in managing the systems and workings of global enterprises, we expertly steer our clients through their digital journey. We do it by enabling the enterprise with an AI-powered core that helps prioritize the execution of change. We also empower the business with agile digital at scale to deliver unprecedented levels of performance and customer delight. Our always-on learning agenda drives their continuous improvement through building and transferring digital skills, expertise, and ideas from our innovation ecosystem.