
Data Engineer

3-5 Years
SGD 6,500 - 10,000 per month
Posted 8 hours ago

Job Description

For a successful POC, the candidate should ideally be a mid-to-senior level Data Engineer (3-5+ years) with the following must-haves:

Technical Core

  • Databricks Mastery: Expert-level knowledge of Delta Lake and the Medallion Architecture (Bronze/Silver/Gold layers).
  • Apache Spark (PySpark/SQL): Ability to write optimized Spark code. For the upcoming POC, Python is preferred over Scala/R for its flexibility and ecosystem.
  • AWS Infrastructure: Deep understanding of S3 (bucket policies/storage), IAM (roles/policies) for secure Databricks access, and VPC/networking (good to have).
  • Data Ingestion: Experience with Databricks Auto Loader for automated ingestion and Unity Catalog for managed data governance.
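To illustrate the scope of the ingestion must-have above, here is a minimal sketch of an Auto Loader stream landing raw S3 data in a Bronze Delta table. It assumes a Databricks runtime (where the `spark` session and the `cloudFiles` source are provided by the platform); the bucket paths and the `bronze.raw_events` table name are illustrative, not part of the posting.

```python
# Sketch: Auto Loader incremental ingestion from S3 into a Bronze Delta table.
# Assumes a Databricks runtime; paths and table names are examples only.
bronze_stream = (
    spark.readStream
    .format("cloudFiles")                        # Databricks Auto Loader source
    .option("cloudFiles.format", "json")         # raw source file format
    .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/raw_events")
    .load("s3://example-bucket/landing/raw_events/")
)

(bronze_stream.writeStream
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/bronze_raw_events")
    .trigger(availableNow=True)                  # process new files, then stop
    .toTable("bronze.raw_events"))               # managed Delta table (Bronze layer)
```

Silver and Gold tables would then be derived from `bronze.raw_events` with ordinary Spark transformations, following the Medallion pattern.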

POC-Specific Skills

  • Prototyping Speed: The ability to set up a working end-to-end pipeline (Source → S3 → Databricks → OOTB BI Tool) in weeks, not months.
  • Cost Management: Knowledge of how to configure Databricks Clusters (Autoscaling, Spot Instances) to prevent the POC from blowing your AWS budget.
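On the cost-management point above, autoscaling, auto-termination, and spot capacity are all set in the cluster specification. A hedged sketch of a payload for the Databricks Clusters REST API (`clusters/create`) follows; the field names match the API, while the cluster name, node type, and worker counts are illustrative values.

```python
# Sketch: cost-conscious cluster spec for the Databricks Clusters API.
# Field names follow clusters/create; concrete values are examples only.
poc_cluster_spec = {
    "cluster_name": "poc-etl",
    "spark_version": "14.3.x-scala2.12",
    "node_type_id": "m5.xlarge",
    "autoscale": {"min_workers": 1, "max_workers": 4},  # scale down when idle
    "autotermination_minutes": 30,                      # stop idle clusters
    "aws_attributes": {
        "availability": "SPOT_WITH_FALLBACK",           # spot first, on-demand fallback
        "first_on_demand": 1,                           # keep the driver on-demand
    },
}
```

Capping `max_workers` and enabling auto-termination are usually the two settings with the biggest effect on a POC's AWS bill.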

Job Description

Focus: Hands-on ETL/ELT, connecting various data sources, and setting up the platform with technical leadership.

  • Role Summary:
  • We are seeking a hands-on Data Engineer to spearhead our Databricks POC on AWS. You will be responsible for the initial environment setup, security configuration, and designing the framework for our future data platform.
  • You will connect diverse AWS and external data sources into a unified Databricks environment.
  • Key Responsibilities:
  • Configure Databricks workspace integration with AWS (S3, IAM, VPC).
  • Cleanse and transform raw data from S3, RDS, and APIs into Delta tables.
  • Design and implement a scalable Medallion Architecture using Delta Lake.
  • Build automated ingestion pipelines using Databricks Autoloader.
  • Optimize Spark jobs for performance and reliability.
  • Establish data governance standards using Unity Catalog. (Good to have)
  • Evaluate POC success metrics (performance, cost, ease of use).
  • Requirements: 3-5+ years in Data Engineering with strong PySpark/SQL skills; experience with AWS Glue or EMR is a plus. Databricks Certified Data Engineer Professional preferred.

Job ID: 145559979
