We're a small team that does big things, and we're looking for a junior to intermediate Site Reliability Developer to join us. This is a rare opportunity to learn how it all connects and master your craft: from colocated servers and network hardware to virtualization and cloud orchestration.
The successful candidate will join our Site Reliability team and work closely with senior management to build and maintain commercial blockchain infrastructure while supporting our talented software development teams building cutting-edge products and open-source tooling.
We value curiosity, collaboration, initiative, and results.
Your Responsibilities
-------------------------
Operate and improve our Linux-based production systems that support cryptographic signing infrastructure. This includes monitoring system health, responding to alerts, diagnosing issues, and helping ensure services remain reliable.
Manage infrastructure and system configuration through code using GitOps workflows, where changes are made via pull requests, reviews, and automation rather than manual server changes.
Automate the provisioning and configuration of infrastructure using Infrastructure as Code tools such as Terraform and Ansible, helping ensure systems are consistent, repeatable, and recoverable.
Build, maintain, and improve CI/CD pipelines (for example GitHub Actions or GitLab CI) to safely apply infrastructure and configuration changes from version control.
Deploy and manage containerized services using Docker, focusing on reliable, automated operation rather than ad-hoc processes.
Monitor services and infrastructure using metrics, logs, and dashboards, primarily with the Grafana stack. Help set up alerts, investigate incidents, and use data to improve system reliability and performance.
Participate in on-call rotations and incident response, working alongside experienced engineers to troubleshoot issues, restore service, and communicate clearly during incidents.
Assist with incident coordination using alerting and incident management platforms, and help document incidents and follow-up actions.
Contribute to post-incident reviews by identifying root causes and suggesting improvements to prevent future outages.
Assist with capacity planning and performance analysis to help systems scale safely as usage grows.
Work closely with software engineers, product teams, and clients to understand requirements and support reliable software releases by validating infrastructure changes in staging environments.
Take part in design reviews and code reviews, advocating for automation, reliability, and secure-by-design practices.
Work hands-on with bare metal servers and physical infrastructure, including installing hardware, racking equipment, troubleshooting failed components, and maintaining systems in a colocation environment.
Gain practical experience managing real-world networking infrastructure, including ISP-level connectivity and network devices such as Cisco and MikroTik routers and switches.
Skills & Requirements
--------------------------
Comfortable working with Linux systems as a primary environment, including system administration tasks, service troubleshooting, reading logs, and working confidently from the command line.
Experience using Git as part of a daily workflow, with an understanding that infrastructure and configuration changes are made through version control and code review.
Hands-on experience with Ansible and/or Terraform, or strong practical exposure through work, labs, or personal projects.
Ability to read, understand, and modify infrastructure or configuration code written by others.
Familiarity with Docker and containerized applications.
Basic understanding of networking fundamentals such as IP addressing, DNS, routing, and firewalls.
Some experience writing scripts or small programs (for example Bash, Python, Go, or similar) to automate operational tasks.
Exposure to monitoring and alerting concepts, and some familiarity with being on-call or supporting production systems.
A methodical approach to debugging problems, including gathering evidence, reproducing issues, and working through solutions with teammates.
Curiosity, ownership, and a strong desire to learn new systems and technologies.
Education
-------------
A post-secondary degree or diploma in Computer Science, Engineering, or a related field is helpful but not required.
Equivalent practical experience demonstrated through real systems, labs, projects, or open-source contributions is valued.
Additional Experience We Value
----------------------------------
Experience participating in on-call rotations and responding to production incidents using tools such as PagerDuty, incident.io, or similar incident management platforms.
Familiarity with the Grafana stack for metrics, dashboards, and alerting.
Hands-on experience with bare metal servers, physical hardware, or home lab environments, including assembling systems or troubleshooting hardware failures.
Experience working with cloud platforms such as AWS or GCP.
Interest in or exposure to applied cryptography, including concepts such as key pairs, signing, hashing, and certificates.
Experience integrating or experimenting with Trusted Execution Environments.
Background or interest in security assessment, threat modeling, or penetration testing.
Contributions to open-source projects, especially in infrastructure, security, or blockchain-related areas.
Exposure to advanced cryptographic or privacy-preserving systems such as multi-party computation or zero-knowledge proofs.
Employee Benefits
---------------------
Extended health and dental benefits
Maternity/parental leave top-up benefits
Health spending account
Hybrid work environment (mix of home and the ECAD Labs office in Vancouver)
Opportunities for professional development, including conferences, seminars, and educational courses
Location
------------
The successful candidate will reside in Vancouver and be available to attend our data centre and office on a regular basis. Hybrid work structures blending at-home and in-office work are available provided team members regularly attend in-person planning, collaboration, social, and other meetings and events in Vancouver.
This is a full-time employment opportunity for residents of Canada, including permanent residents and those with an open Canadian work permit.
ECAD Labs may place additional location restrictions based on the nature of the role and teams the candidate may work with.
Professional Titles & SR&ED Consideration
-----------------------------------------------
Use of the 'Engineer' title in British Columbia is limited to those who are members in good standing with the Engineers & Geoscientists of British Columbia. We think this is pretty silly, but that appears to be the law. Successful candidates who are not members in good standing with the Engineers & Geoscientists of British Columbia will be hired as Developers or Administrators instead of Engineers. Job duties and compensation packages are based on experience and results, not titles. Your day-to-day work (designing, testing, and validating cryptographic systems) will be fully documented and may support future SR&ED claims.
To apply
------------
Preferred: Submit a resume and cover letter to the current job board.
Alternatively, send your application to yourcareer@ecadlabs.com:
* Put your cover letter text in the body of the email
* Attach your resume as a file