Our client is a global technology consultancy known for its emphasis on agile software development, lean thinking, and promoting social and economic justice through tech.
What the client is looking for
- Expertise in GPU-based infrastructure for AI (H100, GB200, or similar), including scaling across clusters.
- Strong knowledge of orchestration frameworks: Kubernetes, Ray, Slurm.
- Experience with inference-serving frameworks (vLLM, NVIDIA Triton, DeepSpeed).
- Proficiency in infrastructure automation (Terraform, Helm, CI/CD pipelines).
- Experience building resilient, high-throughput, low-latency systems for AI inference.
- Strong background in observability and monitoring: Prometheus, Grafana, OpenTelemetry.
- Familiarity with security, compliance, and governance concerns in AI infrastructure (data sovereignty, air-gapped deployments, audit logging).
- Solid understanding of DevOps, cloud-native architectures, and Infrastructure as Code.
- Exposure to multi-cloud and hybrid deployments (AWS, GCP, Azure, sovereign/private cloud).
- Experience with benchmarking and cost/performance tuning for AI systems.
- Background in MLOps or collaboration with ML teams on large-scale AI production systems.
Please refer to U3's Privacy Notice for Job Applicants/Seekers. When you apply, you voluntarily consent to the collection, use, and disclosure of your personal data for recruitment/employment and related purposes.