Databricks Data Engineer

3-5 Years
SGD 6,000 - 9,100 per month
  • Posted 2 days ago

Job Description

Role Summary:

We are seeking a hands-on Data Engineer to spearhead our Databricks Proof of Concept (POC) on AWS. You will be responsible for the initial environment setup, security configuration, and designing the framework for our future data platform. Working across diverse AWS and external data sources, you will build a unified, production-ready Databricks environment that validates the platform for broader adoption.

Key Responsibilities:

Platform & Infrastructure Setup

  • Configure Databricks workspace integration with AWS (S3, IAM, VPC).
  • Design and implement a scalable Medallion Architecture using Delta Lake (Bronze / Silver / Gold layers).
  • Build automated ingestion pipelines leveraging Databricks Autoloader.
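To make the Medallion Architecture concrete, here is a minimal, Spark-free sketch in plain Python of what each layer does. The "orders" domain and all field names are illustrative assumptions, not part of this role; in the actual POC each layer would be a Delta table populated by Autoloader and Spark jobs rather than Python lists.

```python
# Illustrative sketch of the Medallion layers (Bronze -> Silver -> Gold).
# The "orders" schema below is a hypothetical example, not from the posting.
from datetime import datetime, timezone


def to_bronze(raw_records):
    """Bronze: land raw records as-is, adding ingestion metadata."""
    ts = datetime.now(timezone.utc).isoformat()
    return [{**r, "_ingested_at": ts} for r in raw_records]


def to_silver(bronze):
    """Silver: cleanse and conform - drop malformed rows, normalize types, dedupe."""
    seen, silver = set(), []
    for r in bronze:
        if r.get("order_id") is None or r.get("amount") is None:
            continue  # reject malformed rows
        if r["order_id"] in seen:
            continue  # deduplicate on the business key
        seen.add(r["order_id"])
        silver.append({"order_id": r["order_id"],
                       "customer": str(r.get("customer", "")).strip().lower(),
                       "amount": float(r["amount"])})
    return silver


def to_gold(silver):
    """Gold: aggregate into a business-level view (revenue per customer)."""
    totals = {}
    for r in silver:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals


raw = [{"order_id": 1, "customer": " Alice ", "amount": "10.5"},
       {"order_id": 1, "customer": "Alice", "amount": "10.5"},   # duplicate
       {"order_id": 2, "customer": "Bob", "amount": None},        # malformed
       {"order_id": 3, "customer": "alice", "amount": "4.5"}]
gold = to_gold(to_silver(to_bronze(raw)))
print(gold)  # {'alice': 15.0}
```

The same shape maps onto Delta tables: Bronze keeps raw fidelity for replay, Silver enforces schema and uniqueness, and Gold serves BI-ready aggregates.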

Data Engineering

  • Cleanse and transform raw data from S3, RDS, and APIs into Delta tables.
  • Optimize Spark jobs for performance, reliability, and cost-efficiency.
  • Connect diverse AWS and external data sources into a unified Databricks environment.

Governance & Evaluation

  • Establish data governance standards using Unity Catalog. (Good to have - not a hard requirement for the POC.)
  • Configure Databricks Clusters with Autoscaling and Spot Instances to manage AWS spend.
  • Define and evaluate POC success metrics covering performance, cost, and ease of use.
  • Set up an end-to-end pipeline (Source → S3 → Databricks → BI Tool) within agreed timelines.
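For the autoscaling and spot-instance bullet, a Databricks Clusters API payload along these lines is one way to cap AWS spend; the field names are standard Clusters API settings, but the runtime version, node type, worker counts, and bid percentage here are placeholder assumptions, not requirements of this role.

```json
{
  "cluster_name": "poc-etl",
  "spark_version": "15.4.x-scala2.12",
  "node_type_id": "m5.xlarge",
  "autoscale": { "min_workers": 2, "max_workers": 8 },
  "aws_attributes": {
    "first_on_demand": 1,
    "availability": "SPOT_WITH_FALLBACK",
    "spot_bid_price_percent": 100
  },
  "autotermination_minutes": 30
}
```

Keeping the driver on-demand (`first_on_demand: 1`) while workers run on spot with fallback is a common way to trade a small reliability margin for substantial cost savings in a POC.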

Qualifications and Experience:

Must-Have Skills

  • Databricks Mastery - Expert knowledge of Delta Lake and Medallion Architecture (Bronze / Silver / Gold).
  • PySpark / Spark SQL - Ability to write and optimize Spark code; Python strongly preferred over Scala or R.
  • AWS Infrastructure - Deep understanding of S3 (bucket policies, storage) and IAM (roles and policies) for secure Databricks access.
  • Data Ingestion - Hands-on experience with Databricks Autoloader or Unity Catalog for managed data governance.
  • 3-5+ years of professional experience in Data Engineering with PySpark / SQL.

Good to Have

  • Networking - VPC configuration and AWS networking concepts.
  • AWS Glue or Amazon EMR experience.
  • Databricks Certified Data Engineer Professional certification.
  • Experience connecting out-of-the-box (OOTB) BI tools to Databricks environments.

More Info


Job ID: 145731165
