Job Summary
We are looking for a skilled AI Engineer with 3+ years of experience to drive the implementation of AI solutions. In this role, you will be responsible for the end-to-end lifecycle of LLM-based applications, from configuring high-performance inference engines like vLLM to architecting advanced Agentic AI workflows. You will bridge the gap between raw model capabilities and project-specific business logic using Retrieval-Augmented Generation (RAG) and Cache-Augmented Generation (CAG) patterns.
Key Responsibilities
- Configure and optimize vLLM and other inference frameworks to ensure low-latency, high-throughput model serving.
- Design and implement RAG pipelines using vector databases and CAG strategies to minimize redundant computation.
- Deploy and tune vLLM clusters to provide high-throughput, low-latency API endpoints for various open-source LLMs.
- Design and maintain Apache Airflow DAGs and RAGFlow pipelines to automate the end-to-end AI lifecycle, including data ingestion, automated evaluation, and prompt versioning.
- Develop and version-control sophisticated system prompts, employing techniques like Chain-of-Thought (CoT) to improve reasoning.
- Implement CAG strategies to optimize KV cache reuse and reduce compute costs for long-context project tasks.
- Author and refine system prompts using agentic techniques to ensure consistent performance across different LLM backends.
Requirements
- Bachelor's degree in Information Technology, Computer Science, Finance, or a related field.
- Minimum of 3 years of hands-on experience with LLMs, including expertise with vLLM and model quantization (AWQ/GPTQ).
- Strong proficiency in Apache Airflow for scheduling complex data and AI pipelines.
- Experience with RAGFlow (or similar deep-document RAG frameworks) and vector databases.
- Experience building multi-agent systems that use tools and external APIs to complete multi-step tasks.
- Advanced proficiency in Python, Docker, and Kubernetes.
- Experience with AI observability tools to track latency, cost, and hallucination rates.
EA Number: 11C4879