We are seeking a highly skilled Senior Big Data Developer / Spark Developer to design and develop scalable data processing applications in a Hadoop-based ecosystem. The role focuses on building high-performance ETL pipelines and distributed data processing solutions using Spark.
This is a hands-on engineering role where you will contribute to solution design at the module and pipeline level while working closely with architects and data teams.
Key Responsibilities:
- Design and develop scalable Spark-based data processing applications using PySpark / Scala / Java (see the sketch after this list).
- Build and maintain ETL pipelines for structured and semi-structured data.
- Design data transformation logic and processing workflows based on business requirements.
- Implement batch and real-time data ingestion pipelines into data lake and data mart environments.
- Optimize Spark jobs for performance, memory utilization, and execution efficiency.
- Develop SQL queries for data validation, reconciliation, and reporting.
- Debug production issues and resolve data pipeline failures.
- Collaborate with architects, data engineers, and analytics teams to deliver data solutions.
- Follow coding standards, testing practices, and CI/CD deployment processes.
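To give a concrete flavor of the pipeline work above, here is a minimal PySpark sketch of a batch ETL job: ingest raw semi-structured data, apply business transformation logic, and write curated output to the data lake. All paths, the `orders` dataset, and the column names are hypothetical placeholders for illustration, not a prescribed implementation.

```python
# Minimal PySpark batch ETL sketch. All paths, dataset names, and columns
# are hypothetical placeholders used for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-daily-etl")   # hypothetical job name
    .enableHiveSupport()
    .getOrCreate()
)

# Ingest raw semi-structured data (e.g. JSON landed in the data lake).
raw = spark.read.json("/datalake/raw/orders/2024-01-01/")  # placeholder path

# Transformation logic driven by business rules: dedupe, derive, filter.
orders = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Write curated output, partitioned for efficient downstream mart loads.
(orders.repartition("order_date")
       .write.mode("overwrite")
       .partitionBy("order_date")
       .parquet("/datalake/curated/orders/"))

spark.stop()
```

In jobs like this, the optimization responsibility typically comes down to choices such as partitioning strategy, broadcast joins, caching, and shuffle tuning.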
Required Skills & Experience:
- 6-10 years of experience in Big Data application development.
- Strong hands-on development experience with Apache Spark (PySpark / Scala / Java).
- Strong programming skills in at least one of Python, Java, or Scala.
- Solid experience with the Hadoop ecosystem: HDFS, Hive, Impala, YARN, Sqoop, Oozie.
- Strong SQL skills for data processing and validation (a sample reconciliation check follows this list).
- Experience building large-scale batch data pipelines.
- Good understanding of data lake and data warehouse concepts.
- Experience working in Linux environments with shell scripting.
- Knowledge of job scheduling using cron or workflow orchestration tools.
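As an example of the SQL validation and reconciliation work referenced above, the sketch below runs a row-count reconciliation between a staging table and the mart it feeds, via Spark SQL. The `staging.orders` and `mart.orders` table names and the `load_date` column are assumptions made purely for illustration.

```python
# Hypothetical reconciliation check: compare per-day row counts between a
# staging table and the mart table it feeds. Table names are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("orders-reconciliation")
    .enableHiveSupport()
    .getOrCreate()
)

mismatches = spark.sql("""
    SELECT s.load_date,
           s.cnt           AS staging_cnt,
           m.cnt           AS mart_cnt,
           s.cnt - m.cnt   AS diff
    FROM (SELECT load_date, COUNT(*) AS cnt
          FROM staging.orders GROUP BY load_date) s
    LEFT JOIN (SELECT load_date, COUNT(*) AS cnt
               FROM mart.orders GROUP BY load_date) m
           ON s.load_date = m.load_date
    WHERE m.cnt IS NULL OR s.cnt <> m.cnt
""")

# Any rows returned indicate dates where staging and mart disagree.
if mismatches.count() > 0:
    mismatches.show(truncate=False)
    raise RuntimeError("Reconciliation failed: staging and mart counts differ")
```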
Good to Have:
- Experience with Apache Airflow, NiFi, or similar orchestration tools (see the sketch after this list).
- Exposure to Kafka or real-time streaming frameworks.
- Experience with cloud big data platforms (AWS EMR, Azure HDInsight, GCP Dataproc).
- Familiarity with Docker and Kubernetes.
- Knowledge of CI/CD pipelines for Spark jobs.
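For orchestration, a DAG along these lines is one common way to schedule a daily Spark batch job with Apache Airflow. The DAG id, schedule, file path, and Spark settings below are illustrative assumptions, not our actual configuration.

```python
# Hypothetical Airflow DAG scheduling a nightly Spark batch job via
# SparkSubmitOperator. DAG id, paths, and settings are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="orders_daily_etl",          # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",      # nightly at 02:00, cron syntax
    catchup=False,
) as dag:
    run_etl = SparkSubmitOperator(
        task_id="run_orders_etl",
        application="/opt/jobs/orders_etl.py",          # placeholder path
        conn_id="spark_default",
        conf={"spark.sql.shuffle.partitions": "400"},   # example tuning knob
    )
```

A plain cron entry works for simpler setups; an orchestrator like Airflow adds retries, dependency handling, and monitoring on top.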
Role Type:
- Senior Developer / Individual Contributor
- Hands-on coding role
- Design responsibility limited to modules, pipelines, and data workflows
- No people management
What We Offer:
- Opportunity to work on large-scale enterprise data platforms
- Exposure to cutting-edge big data and cloud technologies
- Competitive salary and benefits
- Strong engineering culture and learning environment