
Data Engineer

8-11 Years
SGD 9,000 - 12,000 per month
  • Posted 22 hours ago

Job Description

Key Responsibilities:

  • Design and develop scalable data pipelines across Hadoop (Hive, Impala, Spark, Kafka, Iceberg) and Teradata environments.
  • Build ingestion and transformation frameworks using Java, Spark, Python, and shell scripts.
  • Develop full stack applications and internal tools using Python, Shell scripting, and modern web frameworks (e.g., Flask, React).
  • Create APIs and microservices to expose data and ML models securely to downstream systems and user interfaces.
  • Collaborate with data scientists to operationalize ML models using Cloudera Machine Learning (CML).
  • Build and deploy GenAI/LLM-powered applications for intelligent data interaction, summarization, and automation.
  • Implement enterprise-grade security controls including RBAC, LDAP, Kerberos, Apache Ranger, and row-level access.
  • Tune and optimize data applications for performance across Hadoop and Teradata, ensuring efficient resource utilization.
  • Support sandbox environments for prototyping, enabling users to build ML models, dashboards, and data pipelines.

Required Skills & Experience:

  • Data Engineering: Strong experience with Hadoop ecosystem (Hive, Impala, Spark, Kafka, Iceberg, Ranger, Atlas), Teradata and data pipeline orchestration.
  • Full Stack Development: Proficiency in Python, Shell scripting, REST APIs, and web frameworks (Flask, React, etc.).
  • Machine Learning & AI: Hands-on experience with ML platforms (CML), Spark MLlib, Python ML libraries (scikit-learn, XGBoost), and model deployment.
  • GenAI/LLM Applications: Familiarity with building applications using large language models (e.g., OpenAI, Hugging Face, LangChain) for enterprise use cases.
  • Security & Governance: Experience with enterprise data security (LDAP, Kerberos, RBAC), data masking, and access control.
  • Performance Tuning: Proven ability to optimize data applications and queries in large-scale environments (Hadoop, Teradata).
  • Tools & Platforms: Cloudera Data Platform (CDP), Informatica, QlikSense, Apache Oozie, Git, CI/CD pipelines.
  • Soft Skills: Strong analytical and problem-solving skills, excellent communication, and ability to work in cross-functional teams.

More Info


Job ID: 144998433
