dada consultants

Senior AI Infrastructure Engineer (Video AI)

3-5 Years
  • Posted 7 hours ago

Job Description

Our client is a fast-growing technology company at the forefront of AI innovation, building large-scale multimodal and video intelligence products used by millions of people globally. They are seeking a Senior AI Infrastructure Engineer to design and scale the infrastructure that powers next-generation AI models for video understanding, recommendation systems, generative AI, content moderation, and multimodal foundation models.

This is an exciting opportunity to work alongside world-class AI researchers and engineers, solving complex infrastructure challenges at scale while driving the future of video AI technologies.

Job Responsibilities

  • Design, build, and maintain scalable infrastructure supporting distributed AI model training and inference.
  • Optimize GPU clusters, compute resources, and system performance for large-scale AI workloads.
  • Develop and enhance multimodal and video AI training pipelines to improve efficiency and scalability.
  • Improve platform reliability, observability, fault tolerance, and deployment processes.
  • Partner closely with AI researchers and machine learning teams to accelerate experimentation and model development cycles.
  • Build infrastructure tooling for model serving, evaluation, scheduling, orchestration, and resource management.
  • Design and optimize data storage, processing, and throughput systems for large-scale video datasets.
  • Support the deployment and scaling of real-time AI inference services with high availability and low latency.
  • Implement CI/CD pipelines and automation frameworks to streamline the AI model lifecycle.
  • Evaluate and introduce emerging technologies related to distributed systems, GPU acceleration, and AI platform engineering.
  • Contribute to technical architecture decisions and engineering best practices, and mentor junior team members.

Job Requirements

  • Bachelor's or Master's degree in Computer Science, Software Engineering, Artificial Intelligence, or a related discipline.
  • 3–5+ years of experience in infrastructure engineering, distributed systems, machine learning platforms, or related areas.
  • Strong understanding of distributed computing principles and large-scale system architecture.
  • Hands-on experience with Kubernetes, Docker, and cloud-native infrastructure environments.
  • Experience managing GPU clusters and optimizing AI compute resources.
  • Familiarity with distributed machine learning frameworks such as PyTorch Distributed, DeepSpeed, Ray, Horovod, or Megatron-LM.
  • Experience working with major cloud platforms such as AWS, GCP, or Azure.
  • Solid understanding of high-performance networking, storage systems, and large-scale data pipelines.
  • Experience building and scaling model serving platforms and online inference systems.
  • Strong troubleshooting, performance tuning, and systems optimization capabilities.

www.dadaconsultants.com

Licence Number: 18S9037EA

Registration Number: R23112003

Business Registration Number: 201735941W

Job ID: 148946223