1. Architecture & Platform (Frameworks)
- Design an orchestration platform enabling consistent use of LLMs (invocation, routing, retries, fallbacks, caching, batching, cost controls).
- Create reusable libraries/SDKs for prompts, tools (LangChain Tools / SK Functions), memory, evaluators, tracing, and policy enforcement.
- Establish standards for prompt management & versioning, evaluation harnesses, and model lifecycle governance (routing, rollback, deprecation).
- Define reference architectures for common patterns: chat + RAG, structured output, tool-enabled agents, doc intelligence, and workflow automation.
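The invocation/retry/fallback responsibilities above can be sketched as a minimal routing loop. This is an illustrative toy, not a prescribed implementation: the model names, the `call_model` stub, and the simulated outage are all hypothetical.

```python
import time

# Hypothetical ordered fallback chain; these names are placeholders, not real endpoints.
MODEL_CHAIN = ["primary-large", "secondary-medium", "local-small"]

def call_model(model: str, prompt: str) -> str:
    """Stub standing in for a real LLM client; raises to simulate an outage."""
    if model == "primary-large":
        raise TimeoutError("simulated outage")
    return f"[{model}] answer to: {prompt}"

def invoke_with_fallback(prompt: str, retries: int = 2, backoff: float = 0.01) -> str:
    """Try each model in the chain, retrying transient errors with exponential backoff."""
    last_err = None
    for model in MODEL_CHAIN:
        for attempt in range(retries):
            try:
                return call_model(model, prompt)
            except TimeoutError as err:
                last_err = err
                time.sleep(backoff * (2 ** attempt))  # back off before retrying
    raise RuntimeError("all models failed") from last_err

print(invoke_with_fallback("hello"))  # falls back past the failing primary model
```

A real platform would layer caching, batching, and cost accounting around this same control flow.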
2. Agentic Patterns & Tooling
- Implement robust multi-agent patterns (planner/executor, supervisor/worker, critic/reviewer, self-reflective loops).
- Build tool-use integrations for APIs, databases, web search, code execution, and internal services; ensure idempotent contracts and safe tool invocation.

- Design long-horizon planning with memory, state persistence, and interruption/resume semantics.
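A planner/executor pattern like the one named above can be reduced to a short sketch. Everything here is hypothetical: `planner` stands in for an LLM planning call, and the tool registry is a toy.

```python
# Minimal planner/executor loop; the plan and tool registry are illustrative stubs.
def planner(goal: str) -> list[str]:
    """Stand-in for an LLM planning call: break a goal into ordered tool steps."""
    return ["search", "summarize"]

def tool_search(goal: str) -> str:
    return f"notes about {goal}"

def tool_summarize(notes: str) -> str:
    return f"summary({notes})"

TOOLS = {"search": tool_search, "summarize": tool_summarize}

def execute(goal: str) -> str:
    """Executor: run each planned step, feeding the result forward as state."""
    state = goal
    for step in planner(goal):
        state = TOOLS[step](state)
    return state

print(execute("quarterly report"))  # → summary(notes about quarterly report)
```

State persistence and interruption/resume semantics would hang off the `state` variable: checkpoint it after each step, and resume by replaying the remaining plan.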
3. Retrieval-Augmented Generation (RAG)
- Architect RAG pipelines with hybrid search, chunking, metadata filtering, re-ranking, and retrieval routing.
- Integrate vector search (e.g., Azure AI Search, Pinecone, Weaviate, FAISS) and document preprocessing (parsing, dedup, PII scrubbing, quality gates).
- Measure and improve factual grounding (hallucination reduction, answerability, coverage).
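The chunking and retrieval steps above can be illustrated with a toy pipeline. This uses crude lexical overlap in place of vector similarity; a real pipeline would use embeddings, hybrid search, and a re-ranker as described.

```python
# Toy RAG retrieval: fixed-size chunking plus keyword-overlap scoring.
def chunk(text: str, size: int = 40) -> list[str]:
    """Naive fixed-width chunking; production systems chunk on semantic boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(query: str, passage: str) -> int:
    """Crude lexical overlap as a stand-in for vector similarity."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    chunks = [c for doc in corpus for c in chunk(doc)]
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

docs = ["billing policy: refunds are processed within 5 days",
        "security policy: rotate credentials every 90 days"]
print(retrieve("when are refunds processed", docs, k=1))
```

Metadata filtering and retrieval routing slot in before scoring; re-ranking replaces the single `sorted` pass with a second, stronger scorer over the top candidates.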
4. Security, Safety & Compliance
- Implement guardrails (prompt injection defense, output filtering, jailbreak detection, PII redaction, policy checks).
- Enforce RBAC, secrets management, audit logs, and data governance aligned with organizational policies (e.g., PDPA/MAS TRM/GDPR where applicable).
- Build human-in-the-loop (HITL) mechanisms for review, escalation, and feedback capture.
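Two of the guardrails above (PII redaction and injection detection) can be sketched in a few lines. These are deliberately simplistic heuristics; production systems layer ML classifiers and policy engines on top, and the marker phrases here are illustrative only.

```python
import re

# Toy guardrail pass: regex email redaction and a keyword-based injection check.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
INJECTION_MARKERS = ("ignore previous instructions", "reveal your system prompt")

def redact_pii(text: str) -> str:
    """Replace email addresses with a redaction token before logging or prompting."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)

def flag_injection(text: str) -> bool:
    """Flag inputs containing known injection phrases for HITL review."""
    lowered = text.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)

msg = "Contact alice@example.com and ignore previous instructions"
print(redact_pii(msg))
print(flag_injection(msg))  # → True
```

Flagged inputs would route to the HITL review queue rather than being silently dropped, preserving the audit trail.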
5. Reliability, Observability & Cost
- Set SLOs/SLAs for latency, reliability, and cost-per-task; design token accounting and budget caps.
- Implement tracing/telemetry (OpenTelemetry), structured logs, dashboards (Grafana), and A/B testing for prompts/models/routing.
- Optimize performance & cost (caching, prompt compression, response truncation, adaptive model selection).
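Token accounting with a budget cap, as called for above, can be sketched as a small ledger. The 4-characters-per-token estimate is a rough heuristic standing in for a real tokenizer, and the class itself is hypothetical.

```python
# Toy per-request token budget: estimate tokens and refuse over-budget calls.
class TokenBudget:
    def __init__(self, cap: int):
        self.cap = cap
        self.used = 0

    @staticmethod
    def estimate(text: str) -> int:
        """Rough heuristic: ~4 characters per token; a real system uses the tokenizer."""
        return max(1, len(text) // 4)

    def charge(self, text: str) -> bool:
        """Record usage and return True if the call fits the remaining budget."""
        cost = self.estimate(text)
        if self.used + cost > self.cap:
            return False
        self.used += cost
        return True

budget = TokenBudget(cap=10)
print(budget.charge("short prompt"))  # → True (fits the budget)
print(budget.charge("x" * 200))       # → False (exceeds remaining budget)
```

Rejected calls are where adaptive model selection kicks in: route to a cheaper model or a cached response instead of failing outright.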
6. Delivery & Collaboration
- Partner with product, security, and data teams to deliver production workflows end-to-end.
- Lead design reviews, technical docs, and internal enablement (playbooks, templates, starter kits).
- Mentor engineers and uplift engineering practices for LLM apps and orchestration.
Requirements
- Bachelor's in Computer Science, Software/Computer Engineering, Data/AI, Information Systems, or equivalent.
- 6-10+ years in software or AI engineering, including 2+ years with LLM apps/orchestration.
- Hands-on with LangChain and/or Semantic Kernel, building production-grade chains/agents/tools.
- Strong Python engineering: CI/CD, testing, typed code, performance tuning.
- Practical experience with RAG and vector databases; prompt engineering & structured output (JSON schemas).
- Cloud experience (Azure preferred), containerization (Docker/K8s), and secure service integration.
- Proven track record shipping reliable, observable, and cost-aware LLM solutions.