Principal / Staff AI Systems Architect (LLM, Search, RAG & High-Efficiency Computing)

gk consulting pte. ltd.

Orchard Road, Singapore

5-12 Years

SGD 12,000 - 18,000 per month

Posted 19 hours ago

Job Description

Company Overview

Our client is a leading global technology organization investing heavily in next-generation AI capabilities across large language models, search and recommendation systems, AI agents, and high-performance computing infrastructure. As part of its continued expansion in Singapore, the company is looking to appoint multiple Principal / Staff AI Systems Architects to join its growing team.

Role Overview

Reporting to senior technical leadership, you will play a critical role in designing and optimizing end-to-end AI systems across the full stack, from model architecture and inference optimization to distributed systems and large-scale production deployment.

You will focus on building high-efficiency AI computing systems and improving the performance, scalability, and cost efficiency of large-scale AI workloads in real-world environments.

Key Responsibilities

1. AI Systems Architecture & Optimization

Design and optimize large-scale AI systems across model, system, and infrastructure layers
Improve model inference efficiency through techniques such as quantization, sparsity, KV cache optimization, and distributed parallelism
Ensure low latency, high throughput, and cost-efficient deployment in production environments

2. High-Efficiency AI Computing

Develop and optimize large-scale inference systems for LLMs and multimodal models
Work on GPU acceleration, CUDA optimization, and distributed computing frameworks
Drive performance improvements across compute, memory, and networking layers

3. Search, Recommendation & RAG Systems

Build scalable AI-powered search and recommendation platforms
Develop retrieval-augmented generation systems with strong grounding and hallucination control
Enable multilingual and cross-regional search experiences
Support personalization and monetization use cases such as recommendation and advertising

4. AI Agent Systems & Security

Contribute to AI agent system design and orchestration
Implement safeguards such as prompt injection prevention and access control, and harden systems for robustness
Ensure secure and reliable deployment of AI systems

5. End-to-End System Ownership

Drive technical initiatives from design to production deployment
Collaborate with cross-functional teams across global engineering and product organizations
Deliver measurable impact on large-scale production systems

Requirements

Qualifications & Experience

Bachelor's, Master's, or PhD in Computer Science, Engineering, or a related field
5-12+ years of experience in AI systems, machine learning infrastructure, or distributed computing
Strong experience working with large-scale production systems

Technical Skills

Strong expertise in LLMs or multimodal models
Hands-on experience in inference optimization or high-efficiency computing
Solid programming skills in Python and/or C++ (CUDA experience preferred)
Experience with modern AI frameworks and distributed systems

Preferred Experience

Experience in search, recommendation systems, or RAG architectures
Familiarity with AI agent systems and orchestration frameworks
Exposure to large-scale infrastructure such as GPU clusters or cloud-based AI platforms

Key Competencies

Strong systems thinking across both low-level optimization and high-level architecture
Ability to work independently as a senior individual contributor
Comfortable operating in a fast-paced, global environment
Strong problem-solving and technical leadership capabilities