Job Title: Data Engineer
Experience: 7+ years
Job Description:
We are seeking an experienced Data Engineer with strong expertise in AWS-native implementations and ETL processes. The ideal candidate will have hands-on experience designing, developing, and automating data pipelines in a Databricks environment using Python, PySpark, and SQL. Experience with government-sector data projects and their compliance requirements is highly desirable.
Key Responsibilities:
- Design, develop, and maintain ETL pipelines in Databricks notebooks, using PySpark, Python, and SQL to automate data workflows.
- Perform data manipulation, validation, and error handling using Python to ensure data accuracy and quality.
- Implement complex SQL queries, joins, aggregations, and other database operations within the Databricks environment.
- Lead ETL migration projects, ensuring a smooth transition and minimal disruption to existing workflows.
- Work on government-sector data projects, adhering to compliance and regulatory requirements.
- Collaborate with cross-functional teams to understand data requirements and implement scalable solutions.
- Optimize data pipelines for performance, scalability, and reliability on AWS.
Required Skills and Qualifications:
- 7+ years of experience as a Data Engineer with hands-on experience in AWS cloud services.
- Strong experience in ETL development and migration projects.
- Expertise in Databricks notebooks, PySpark, Python, and SQL.
- Proficiency in data validation, transformation, and error handling in ETL processes.
- Solid experience in complex SQL operations and data manipulation within Databricks.
- Experience working on government projects, with an understanding of their compliance requirements.
- Strong problem-solving skills and ability to work in a collaborative team environment.
Preferred Skills:
- Knowledge of data governance and security best practices.
- Experience with CI/CD pipelines for data workflows.
- Familiarity with cloud-native monitoring and logging tools (e.g., AWS CloudWatch).