Big Data Analyst

6-12 Years
  • Posted 15 hours ago

Job Description

Role: Big Data Engineer (PySpark & Scala)

Location: Singapore

Experience: 6 – 12 Years

Role Summary

We are seeking a highly skilled Big Data Engineer with strong experience in PySpark and Scala to design, build, and optimize large-scale data processing systems. The candidate will work on distributed data platforms, enabling real-time and batch analytics solutions across enterprise data ecosystems.

The role involves building scalable data pipelines, data lakes, and streaming solutions leveraging modern big data technologies.

Key Responsibilities

Data Engineering & Development

Design and develop scalable data pipelines using PySpark and Scala

Data Platform & Architecture

Data Processing & Optimization

Streaming & Real-Time Data

Cloud & Modern Data Stack

Data Quality & Governance

Collaboration & Delivery

Big Data Ecosystem

Hadoop: HDFS, Hive, YARN

Data ingestion tools: Kafka, Sqoop

Data formats: Parquet, ORC, JSON, Avro

Programming & Query

Python / Scala

Advanced SQL (joins, aggregations, optimization)

Shell scripting / Unix basics

Streaming & Messaging

Kafka (must-have)

Event-driven architecture

Real-time data processing frameworks

DevOps & Tools

Git, CI/CD pipelines

Docker / Kubernetes (good to have)

Scheduling tools (Airflow / Control-M)

Cloud (one or more preferred)

AWS / Azure / GCP

Databricks / Snowflake exposure

Domain Experience (Preferred)

BFSI / Banking / Payments domain

Experience working with high-volume financial datasets

Knowledge of data compliance and regulatory reporting

This role requires a combination of:

Strong PySpark and Scala coding skills

Deep knowledge of big data architecture

Hands-on experience with streaming and cloud data engineering

More Info

Job ID: 150612519
