We are seeking a skilled Data Engineer to design, build, and optimize scalable data pipelines using Databricks (Apache Spark/PySpark) and Snowflake.
The role focuses on developing end-to-end data solutions in which Databricks handles data ingestion and transformation and Snowflake serves as the enterprise data warehouse for analytics and reporting.
Key Responsibilities
- Design, develop, and maintain ETL/ELT pipelines using Databricks (PySpark/Spark); a representative pipeline is sketched after this list
- Build scalable data ingestion frameworks from multiple sources (APIs, databases, flat files)
- Perform data transformation, cleansing, and aggregation for downstream analytics
- Load, manage, and optimize datasets in the Snowflake data warehouse
- Develop and maintain data models (e.g., star schema, dimensional models)
- Monitor and improve data pipeline performance, reliability, and cost efficiency
- Implement data quality checks, validation rules, and monitoring processes
- Collaborate with business stakeholders, analysts, and BI teams to deliver data solutions
- Support cloud-based data platforms on AWS or Azure
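For illustration, the sketch below shows the kind of end-to-end pipeline these responsibilities describe: ingest a flat file from cloud storage, transform and aggregate it in PySpark, and load the result into Snowflake via the Spark–Snowflake connector bundled with Databricks. This is a minimal sketch, and all paths, table names, and connection details are hypothetical placeholders.

```python
# Minimal Databricks (PySpark) -> Snowflake pipeline sketch.
# All paths, table names, and credentials are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Ingest: read a raw CSV landed in cloud storage (AWS S3 or Azure ADLS).
raw = (spark.read
       .option("header", "true")
       .option("inferSchema", "true")
       .csv("s3://example-bucket/landing/orders/"))

# Transform: cleanse and aggregate for downstream analytics.
daily_revenue = (raw
    .filter(F.col("status") == "completed")
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date")
    .agg(F.sum("amount").alias("revenue"),
         F.countDistinct("order_id").alias("orders")))

# Load: write to Snowflake using the Spark connector available on Databricks.
sf_options = {
    "sfUrl": "example.snowflakecomputing.com",  # hypothetical account URL
    "sfUser": "ETL_USER",
    "sfPassword": "...",  # in practice, pull from a Databricks secret scope
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "LOAD_WH",
}
(daily_revenue.write
    .format("snowflake")
    .options(**sf_options)
    .option("dbtable", "DAILY_REVENUE")
    .mode("overwrite")
    .save())
```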
Requirements
- Strong hands-on experience with Databricks (Apache Spark / PySpark)
- Proficiency in SQL and data transformation techniques
- Experience with Snowflake (data loading, querying, performance tuning); see the loading sketch after this list
- Experience working on cloud platforms (AWS or Azure)
- Solid understanding of data engineering concepts (ETL, data pipelines, data lakes)
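As a minimal illustration of the Snowflake loading skills listed above, the sketch below stages a local file and bulk-loads it with COPY INTO, Snowflake's idiomatic path for batch loading, using the snowflake-connector-python package. Account, table, and file names are hypothetical.

```python
# Minimal Snowflake bulk-load sketch using snowflake-connector-python.
# Account, warehouse, table, and file names are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="example_account",
    user="ETL_USER",
    password="...",  # prefer key-pair auth or a secrets manager
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)
cur = conn.cursor()
try:
    # Stage the file into the table's internal stage, then bulk-load it.
    cur.execute("PUT file:///tmp/daily_revenue.csv @%DAILY_REVENUE")
    cur.execute("""
        COPY INTO DAILY_REVENUE
        FROM @%DAILY_REVENUE
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
    """)
finally:
    cur.close()
    conn.close()
```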
Preferred Skills (Good to Have)
- Experience with Delta Lake / Lakehouse architecture (Bronze, Silver, Gold layers); see the medallion sketch after this list
- Familiarity with orchestration tools (e.g., Airflow, Azure Data Factory)
- Exposure to dbt or similar transformation tools
- Experience with streaming data pipelines
- Knowledge of CI/CD and DevOps practices
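The medallion (Bronze/Silver/Gold) pattern referenced above can be sketched as follows: raw data lands in Bronze as-is, Silver holds cleansed and deduplicated records, and Gold holds business-level aggregates for BI. This is a minimal sketch, and all table and path names are hypothetical.

```python
# Minimal medallion-architecture sketch on Delta Lake (Databricks).
# Table and path names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion_demo").getOrCreate()

# Bronze: raw events appended as-is, with ingestion metadata.
bronze = (spark.read.json("s3://example-bucket/raw/events/")
          .withColumn("_ingested_at", F.current_timestamp()))
bronze.write.format("delta").mode("append").saveAsTable("bronze.events")

# Silver: cleansed, deduplicated, typed records.
silver = (spark.table("bronze.events")
          .dropDuplicates(["event_id"])
          .filter(F.col("event_ts").isNotNull())
          .withColumn("event_date", F.to_date("event_ts")))
silver.write.format("delta").mode("overwrite").saveAsTable("silver.events")

# Gold: business-level aggregate ready for BI consumption.
gold = (spark.table("silver.events")
        .groupBy("event_date", "event_type")
        .agg(F.count("*").alias("event_count")))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.daily_event_counts")
```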
Qualifications & Experience
- Bachelor's degree in Computer Science, Engineering, or a related field
- 4 to 7 years of relevant experience in Data Engineering
- Proven experience delivering end-to-end data pipeline solutions
- Strong analytical and problem-solving skills