
Job Description
We are seeking a skilled Software Developer - PySpark to join our data engineering team. The ideal candidate will have strong hands-on experience in building, optimizing, and maintaining large-scale data processing systems. You will work as an individual contributor and collaborate closely with business users, stakeholders, and cross-functional teams to deliver high-quality data solutions.
Key Responsibilities
Design, develop, and implement scalable data processing pipelines using PySpark
Work as an individual contributor owning end-to-end development and delivery
Collaborate with business users and stakeholders to gather and understand requirements
Develop and optimize applications using Python and PySpark
Tune and optimize the performance of large-scale Spark workloads
Write and optimize complex SQL queries
Support data modeling and schema design activities
Debug, troubleshoot, and resolve issues in distributed systems
Follow Agile/Scrum methodologies and participate in sprint activities
Use Git for version control and contribute to CI/CD pipelines as part of DevOps practices
Required Skills & Qualifications
Minimum 5 years of professional experience as a Software Developer with a strong focus on PySpark
Hands-on experience with Big Data technologies and distributed data processing
Strong proficiency in Python
Excellent working knowledge of SQL
Solid understanding of Spark concepts (RDDs, DataFrames, partitions, joins, shuffles)
Experience in Spark performance tuning and optimization
Experience using Git and CI/CD pipelines
Strong debugging, analytical, and problem-solving skills
Good communication skills and the ability to work effectively with stakeholders
Job ID: 135940933