As a Senior ML Engineer on our Data Science Enablement team, you will drive the design, development, and maintenance of the platforms and tools used for building, training, and deploying artificial intelligence (AI) and machine learning (ML) models at scale. Your responsibilities will include setting technical direction, mentoring team members, and ensuring the robustness, scalability, and performance of our AI/ML systems and infrastructure. You will work closely with cross-functional teams to align infrastructure capabilities with organizational goals, enabling data scientists and machine learning engineers to deliver impactful solutions efficiently and effectively.
Key Responsibilities:
ML System Development: Design and implement scalable and reliable AI/ML infrastructure, including data and ML pipelines, model development environments, and CI/CD.
Tool Integration: Integrate various AI/ML tools and frameworks into our existing systems, ensuring seamless operation and compatibility.
Cloud Management: Manage cloud-based resources and services for AI/ML workloads, including configuring and optimizing cloud environments to ensure cost-efficiency and performance.
Performance Optimization: Monitor and optimize model performance, identifying and addressing deterioration to ensure high-quality predictions.
Automation: Develop and maintain automation scripts and workflows to streamline model deployment, monitoring, and maintenance processes.
Collaboration: Partner with data scientists and other stakeholders to deeply understand their needs and proactively provide innovative solutions to enhance productivity and model performance.
Mentorship: Act as a mentor to team members and data scientists, guiding best practices for ML workflows at Lightspeed.
Security and Compliance: Implement and enforce best practices for data security, model governance, and compliance with relevant regulations and standards.
Documentation: Create and maintain comprehensive documentation for AI/ML systems, processes, and workflows.
Research: Research and implement industry best practices for AI/ML systems where appropriate for the use case.
Troubleshooting: Identify and resolve technical issues related to the AI/ML systems, providing support and solutions to ensure minimal disruption.
Qualifications:
Education: Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related field.
Experience: 5-7 years of experience in machine learning engineering, software engineering, or a related role with a focus on AI/ML systems.
Technical Skills:
+ Extensive experience with AI/ML frameworks and tools such as TensorFlow, PyTorch, Scikit-Learn, etc.
+ Advanced proficiency in programming languages such as Python, Java, or Scala, with a focus on applying them in AI/ML contexts.
+ Hands-on experience deploying and optimizing ML models on cloud platforms, leveraging their AI/ML services.
+ Experience with Kubernetes (K8s) or similar container orchestration platforms, with an understanding of managing or interacting with cloud solutions.
+ Understanding of MLOps practices and CI/CD workflows for ML systems.
+ Solid knowledge of data engineering concepts and tools, including data pipelines and ETL processes.
Soft Skills:
+ Exceptional problem-solving abilities and analytical thinking.
+ Strong communication skills, including the ability to work collaboratively within the team, mentor team members, and partner effectively with cross-functional teams.
+ Demonstrated ability to prioritize and manage multiple high-impact projects, balancing short-term objectives with long-term goals.
Preferred Qualifications:
Experience with MLOps practices and tools.
Knowledge of AI/ML model lifecycle management and monitoring.
Familiarity with infrastructure-as-code tools (e.g., Terraform, Ansible).