Company Overview
Our client is a leading global technology organization investing heavily in next-generation AI capabilities across large language models, search and recommendation systems, AI agents, and high-performance computing infrastructure. As part of its continued expansion in Singapore, the company is looking to appoint multiple Principal / Staff AI Systems Architects to join its growing team.
Job Description
Reporting to senior technical leadership, you will play a critical role in designing and optimizing end-to-end AI systems across the full stack, from model architecture and inference optimization to distributed systems and large-scale production deployment.
You will focus on building high-efficiency AI computing systems and improving the performance, scalability, and cost efficiency of large-scale AI workloads in real-world environments.
Key Responsibilities
1. AI Systems Architecture & Optimization
- Design and optimize large-scale AI systems across model, system, and infrastructure layers
- Improve model inference efficiency through techniques such as quantization, sparsity, KV cache optimization, and distributed parallelism
- Ensure low latency, high throughput, and cost-efficient deployment in production environments
2. High-Efficiency AI Computing
- Develop and optimize large-scale inference systems for LLMs and multimodal models
- Work on GPU acceleration, CUDA optimization, and distributed computing frameworks
- Drive performance improvements across compute, memory, and networking layers
3. Search, Recommendation & RAG Systems
- Build scalable AI-powered search and recommendation platforms
- Develop retrieval-augmented generation systems with strong grounding and hallucination control
- Enable multilingual and cross-regional search experiences
- Support personalization and monetization use cases such as recommendation and advertising
4. AI Agent Systems & Security
- Contribute to AI agent system design and orchestration
- Implement safeguards such as prompt injection prevention, access control, and system robustness measures
- Ensure secure and reliable deployment of AI systems
5. End-to-End System Ownership
- Drive technical initiatives from design to production deployment
- Collaborate with cross-functional teams across global engineering and product organizations
- Deliver measurable impact on large-scale production systems
Requirements
Qualifications & Experience
- Bachelor's, Master's, or PhD in Computer Science, Engineering, or a related field
- 5 to 12+ years of experience in AI systems, machine learning infrastructure, or distributed computing
- Strong experience working with large-scale production systems
Technical Skills
- Strong expertise in LLMs or multimodal models
- Hands-on experience in inference optimization or high-efficiency computing
- Solid programming skills in Python and/or C++ (CUDA experience preferred)
- Experience with modern AI frameworks and distributed systems
Preferred Experience
- Experience in search, recommendation systems, or RAG architectures
- Familiarity with AI agent systems and orchestration frameworks
- Exposure to large-scale infrastructure such as GPU clusters or cloud-based AI platforms
Key Competencies
- Strong systems thinking across both low-level optimization and high-level architecture
- Ability to work independently as a senior individual contributor
- Comfort operating in a fast-paced, global environment
- Strong problem-solving and technical leadership capabilities
Additional Information
- Opportunity to work on cutting-edge AI systems with global impact
- Exposure to large-scale real-world AI applications
- Collaborative and innovation-driven environment