Senior Associate/Assistant Vice President, AI Data Engineer

Temasek

Singapore

4-8 Years

Job Description

Temasek is a global investment company headquartered in Singapore, with a net portfolio value of S$434 billion (US$324 billion, €299 billion, £250 billion, and RMB2.35 trillion) as at 31 March 2025. Marking our unlisted assets to market would provide S$35 billion of value uplift and bring our mark-to-market net portfolio value to S$469 billion.

Our Purpose, So Every Generation Prospers, guides us to make a difference for today's and future generations.

Operating on commercial principles, we seek to deliver sustainable returns over the long term.

We have 13 offices in 9 countries around the world: Beijing, Hanoi, Mumbai, Shanghai, Shenzhen, and Singapore in Asia; and Brussels, London, Mexico City, New York, Paris, San Francisco, and Washington, DC outside Asia. 

For more information on Temasek, please visit www.temasek.com.sg.
For Temasek Review 2025, please visit www.temasekreview.com.sg.
For Sustainability Report 2025, please visit https://www.temasek.com.sg/content/dam/temasek-corporate/sustainability/2025/Temasek-Sustainability-Report-2025.pdf.

Introduction

AI agents are only as good as the data they can reason over. Poorly structured, stale, or inconsistently governed data is the most common reason enterprise AI products fail to deliver value: the limiting factor is data readiness, not model capability. The AI Data Engineer at Temasek is responsible for building the data foundations that make Temasek's agentic AI systems trustworthy, accurate, and capable of reasoning over the complex, heterogeneous data environment of a global investment institution.

This role sits at the intersection of data engineering and AI systems engineering. You will design and build the data architectures, pipelines, and quality frameworks that allow AI agents to retrieve, reason over, and act on Temasek's investment data. You will work across structured investment data (portfolio positions, financial statements, market data), unstructured data (research reports, company filings, meeting notes, news), and real-time data streams, making all of it accessible, reliable, and AI-readable.

Responsibilities

Agent-ready data architecture

Design and build data architectures optimised for AI agent consumption, including structured stores exposed via APIs, vector databases for semantic retrieval, graph databases for relationship reasoning, and hybrid retrieval systems combining keyword, semantic, and structured queries.
Own the data layer for RAG pipelines: document ingestion workflows, chunking strategies, embedding generation and refresh, metadata tagging, and vector index management across domains (e.g., company research, market intelligence, portfolio data, regulatory filings).
Establish ontology and schema standards to ensure AI-accessible data is consistent, well-documented, and interpretable without custom parsing logic.
Architect real-time and near-real-time data feeds (e.g., market data, news, portfolio events), defining and enforcing latency and freshness SLAs.
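
As an illustration of the chunking and metadata-tagging work described above, here is a minimal Python sketch that splits a document into overlapping word windows and tags each chunk with its source and domain. The `Chunk` fields, the `chunk_document` name, and the window sizes are assumptions made for the example and do not describe Temasek's actual pipelines; embedding generation and vector index writes are left out.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str    # hypothetical identifier of the source document
    domain: str    # hypothetical domain tag, e.g. "company_research"
    position: int  # order of the chunk within the document
    text: str


def chunk_document(doc_id, domain, text, max_words=200, overlap=40):
    """Split text into overlapping word windows, tagging each with metadata."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for position, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_words]
        chunks.append(Chunk(doc_id, domain, position, " ".join(window)))
        if start + max_words >= len(words):
            break  # this window already reached the end of the document
    return chunks
```

The overlap keeps sentences that straddle a window boundary retrievable from either side. In practice the window size would likely differ by document type, which is why it is a parameter here.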

Enterprise data quality and governance

Define and implement data quality standards (completeness, consistency, freshness, anomaly detection) with automated quality gates to prevent degraded data entering AI systems.
Build end-to-end data lineage tracking across AI pipelines, enabling traceability from source to AI consumption for debugging and audit requirements.
Partner with AI Security & Governance and enterprise data teams to ensure compliance with data classification, access control, and cross-border handling requirements (including China-related workflows).
Design and operate data observability tooling covering pipeline health, data drift, schema changes, and SLA monitoring, giving product teams visibility into data reliability.
Run regular data quality reviews with AI product teams to identify gaps impacting performance and prioritise data engineering investments.
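
To make the idea of an automated quality gate concrete, the following Python sketch checks a batch of records for completeness and freshness before it would be passed on to an AI system. The record layout, the `updated_at` field, and the `quality_gate` name are assumptions for illustration only. A production setup would more likely use one of the data quality frameworks named under Technical capabilities.

```python
from datetime import datetime, timedelta, timezone


def quality_gate(records, required_fields, max_age, now=None):
    """Return (passed, issues) for a batch of dict records.

    Each record may carry a timezone-aware `updated_at` datetime
    (an assumed field name) that is compared against `max_age`.
    """
    now = now or datetime.now(timezone.utc)
    issues = []
    for index, record in enumerate(records):
        # Completeness: every required field must be present and non-empty.
        missing = [f for f in required_fields if record.get(f) in (None, "")]
        if missing:
            issues.append(f"record {index}: missing {', '.join(missing)}")
        # Freshness: records older than max_age are flagged as stale.
        updated_at = record.get("updated_at")
        if updated_at is not None and now - updated_at > max_age:
            issues.append(f"record {index}: stale")
    return (not issues, issues)
```

A pipeline step could call `quality_gate` on each batch and stop the load when `passed` is false, writing `issues` to whatever observability tooling is in place.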

Shared, reusable data platform for AI

Develop reusable data assets and services supporting multiple AI products, including a shared investment knowledge graph, company/market data APIs, document intelligence pipelines, and portfolio analytics services.
Maintain a data catalogue documenting sources, schemas, freshness, quality metrics, access protocols, and limitations to enable informed data usage.
Contribute to enterprise data platform strategy with an AI-first perspective, ensuring architectures support AI consumption patterns beyond traditional BI/reporting needs.
Engage external data vendors to evaluate, onboard, and maintain high-quality data sources, including ongoing quality assessment and licensing management with procurement.

Requirements

Experience and background

4–8 years of data engineering experience, with at least 2 years focused specifically on building data infrastructure for AI/ML or LLM-powered systems in production.
Demonstrated experience at a data-intensive organisation with complex, heterogeneous data environments (financial data, enterprise data platforms, or equivalent), ideally with exposure to investment data domains (company financials, market data, portfolio systems).
Hands-on experience building RAG pipelines or AI knowledge bases in production, including vector store management, embedding pipeline design, and chunking strategy optimisation.
Strong data engineering fundamentals: pipeline design and orchestration, schema design, data quality frameworks, and lineage tracking, with the rigour expected of a Palantir-calibre data engineering background.

Technical capabilities

Data pipeline and orchestration: Python (pandas, Polars, SQLAlchemy), dbt, Apache Airflow or Prefect, Spark for large-scale processing; experience with both batch and streaming pipeline architectures (Kafka, Kinesis, or equivalent).
AI data stack: vector databases (Pinecone, Weaviate, pgvector, Chroma), embedding models and management, LlamaIndex or LangChain data connectors, document parsing and OCR tooling, and chunking strategy design for different document types.
Structured data and analytics: SQL proficiency across multiple dialects, experience with enterprise data warehouse platforms (Snowflake, BigQuery, Redshift, or Databricks), and familiarity with graph database concepts (Neo4j or equivalent).
Data quality and observability: experience with data quality frameworks (Great Expectations, Soda, or equivalent), data lineage tools (OpenLineage, DataHub, Marquez), and data observability platforms (Monte Carlo, Acceldata, or equivalent).

Job Source: www.linkedin.com

Job ID: 150688315

Last Updated: 03-07-2026 11:28:51 PM
