Senior AI Infrastructure Engineer (Video AI)

dada consultants

Singapore

3-5 Years

Posted 7 hours ago

Job Description

Our client is a fast-growing technology company at the forefront of AI innovation, building large-scale multimodal and video intelligence products used by millions of users globally. They are seeking a Senior AI Infrastructure Engineer to design and scale the infrastructure powering next-generation AI models across video understanding, recommendation systems, generative AI, content moderation, and multimodal foundation models.

This is an exciting opportunity to work alongside world-class AI researchers and engineers, solving complex infrastructure challenges at scale while driving the future of video AI technologies.

Job Responsibilities

Design, build, and maintain scalable infrastructure supporting distributed AI model training and inference.
Optimize GPU clusters, compute resources, and system performance for large-scale AI workloads.
Develop and enhance multimodal and video AI training pipelines to improve efficiency and scalability.
Improve platform reliability, observability, fault tolerance, and deployment processes.
Partner closely with AI researchers and machine learning teams to accelerate experimentation and model development cycles.
Build infrastructure tooling for model serving, evaluation, scheduling, orchestration, and resource management.
Design and optimize data storage, processing, and throughput systems for large-scale video datasets.
Support the deployment and scaling of real-time AI inference services with high availability and low latency.
Implement CI/CD pipelines and automation frameworks to streamline the AI model lifecycle.
Evaluate and introduce emerging technologies related to distributed systems, GPU acceleration, and AI platform engineering.
Contribute to technical architecture decisions and engineering best practices, and mentor junior team members.

Job Requirements

Bachelor's or Master's degree in Computer Science, Software Engineering, Artificial Intelligence, or a related discipline.
3–5 or more years of experience in infrastructure engineering, distributed systems, machine learning platforms, or related areas.
Strong understanding of distributed computing principles and large-scale system architecture.
Hands-on experience with Kubernetes, Docker, and cloud-native infrastructure environments.
Experience managing GPU clusters and optimizing AI compute resources.
Familiarity with distributed machine learning frameworks such as PyTorch Distributed, DeepSpeed, Ray, Horovod, or Megatron-LM.
Experience working with major cloud platforms such as AWS, GCP, or Azure.
Solid understanding of high-performance networking, storage systems, and large-scale data pipelines.
Experience building and scaling model serving platforms and online inference systems.
Strong troubleshooting, performance tuning, and systems optimization capabilities.

www.dadaconsultants.com

Licence Number: 18S9037EA

Registration Number: R23112003

Business Registration Number: 201735941W