About the Role
We are seeking a skilled Machine Learning Platform Engineer (MLOps) to join our agile platform team within our ML & AI Agile Release Train (ART).
In this role, you will bridge the gap between experimental data science and production-grade systems, contributing across the entire lifecycle, from concept to deployment. You will work closely with cross-functional teams to deliver scalable, reliable, and high-quality AI-driven solutions, while enabling advanced agentic workflows and autonomous AI systems.
Key Responsibilities
- Design, develop, and deploy machine learning solutions and services
- Build end-to-end ML pipelines (data ingestion, training, validation, deployment, serving)
- Operationalize Large Language Models (LLMs), embeddings, and multi-agent systems
- Manage the ML lifecycle: experimentation, model registry, versioning, and deployment
- Oversee model promotion workflows with validation gates and approvals
- Containerize applications using Docker and orchestrate via Kubernetes
- Develop and maintain CI/CD pipelines for ML and AI applications
- Collaborate with data scientists to turn research code into robust, production-ready Python services
- Monitor model performance, data drift, and system reliability in production
- Design and implement production-grade RAG (Retrieval-Augmented Generation) systems
- Integrate AI solutions into existing infrastructure and enterprise systems
- Participate in code reviews, testing, and debugging to ensure quality and reliability
Requirements
Education
- Bachelor's or Master's degree in Computer Science, Data Science, Mathematics, Statistics, or a related field
Technical Skills
- Strong proficiency in Python (clean, efficient, testable code)
- Experience with ML lifecycle tools (e.g., MLflow or similar)
- Hands-on experience with LLMs, embeddings, and AI agent frameworks
- Solid understanding of ML concepts (feature engineering, model evaluation, optimization)
- Experience with Docker and Kubernetes (K8s)
- Familiarity with CI/CD tools (e.g., GitLab, Jenkins)
- Knowledge of GPU architecture and cloud compute optimization
- Experience designing scalable ML pipelines and production systems
EA Number: 11C4879