
You will design, develop, and deploy AI-powered, cloud-based products. As a Data Engineer, you'll work with large-scale, heterogeneous datasets and hybrid cloud architectures to support analytics and AI solutions. You will collaborate with data scientists, infrastructure engineers, sales specialists, and stakeholders to ensure data quality, build scalable pipelines, and optimize performance. Your work will integrate telco data with data from other verticals (retail, healthcare), automate DataOps/MLOps/LLMOps workflows, and deliver production-grade systems.
As a Data Engineer, you will:
Qualifications
Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, or equivalent experience
4+ years in data engineering, analytics, or a related AI/ML role
Proficient in Python for ETL/data engineering and Spark (PySpark) for large-scale pipelines
Experience with Big Data frameworks and SQL engines (Spark SQL, Redshift, PostgreSQL) for data marts and analytics
Hands-on with Airflow (or equivalent) to orchestrate ETL workflows and GitLab CI/CD or Jenkins for pipeline automation
Familiar with relational (PostgreSQL, Redshift) and NoSQL (MongoDB) stores: data modeling, indexing, partitioning, and schema evolution
Proven ability to implement scalable storage solutions: tables, indexes, partitions, materialized views, columnar encodings
Skilled in query optimization: execution plans, sort/distribution keys, vacuum maintenance, and cost-optimization strategies (cluster resizing, Spectrum)
Experience with AWS cloud services (S3, EMR, Glue, Redshift) and containerization (Docker, Kubernetes)
Infrastructure as Code using Terraform or CloudFormation for provisioning and drift detection
Knowledge of MLOps/LLMOps: auto-scaling ML systems, model registry management, and CI/CD for model deployment
Strong problem-solving, attention to detail, and the ability to collaborate with cross-functional teams
Nice to Have
Exposure to serverless architectures (AWS Lambda) for event-driven pipelines
Familiarity with vector databases, data mesh, or lakehouse architectures
Experience using BI/visualization tools (Tableau, QuickSight, Grafana) for data quality dashboards
Hands-on with data quality frameworks (Deequ) or LLM-based data applications (NL-to-SQL generation)
Participation in GenAI POCs (RAG pipelines, Agentic AI demos, geomobility analytics)
Client-facing or stakeholder-management experience in data-driven/AI projects
StarHub Limited, commonly known as StarHub, is a Singaporean multinational telecommunications conglomerate and one of the major telcos operating in the country. Founded in 1998, it is listed on the Singapore Exchange (SGX).
Job ID: 146542665