- Experience: 3+ years of professional experience in Data Engineering roles, with at least 2 years focused on cloud-native data services.
- Programming Expertise: Expert proficiency in at least one programming language such as Python, Java, or C#/.NET.
- Data Fundamentals:
  - SQL Mastery: Solid expertise in writing complex and highly optimized SQL queries for relational databases and data warehouses (see the query sketch after this list).
  - Data Modeling: Deep understanding of data structures, data modeling (e.g., dimensional modeling), and data access patterns.
  - Diverse Data Stores: Experience working with a variety of databases, including relational (e.g., PostgreSQL, MySQL), NoSQL (e.g., DynamoDB, CosmosDB), and distributed file systems.
- Cloud Proficiency (Practical Tooling): Proven hands-on experience with at least one major cloud platform, using the services critical to data engineering (see the S3 sketch after this list).
  - AWS Examples: S3, RDS/Aurora, EMR, Glue, Athena, Redshift, Lambda.
  - Azure Examples: Data Lake Storage, Azure SQL, CosmosDB, Azure Data Factory, Synapse.
- Pipeline & Processing:
  - Distributed Processing: Extensive experience with Big Data/distributed data processing frameworks such as Apache Spark (PySpark) or Hadoop.
  - ETL/ELT Frameworks: Strong experience building and maintaining data transformations using frameworks like PySpark and libraries like Pandas (see the PySpark sketch after this list).
  - Orchestration: Experience with modern workflow orchestration tools such as Apache Airflow or Azure Data Factory (see the Airflow sketch after this list).
- DevOps & Governance:
  - Automation: Familiarity with building and using CI/CD pipelines for automated deployment.
  - Infrastructure as Code (IaC) & DevOps Tooling: Experience with tools such as Git, Docker, and Terraform.
  - System Design: Understanding of system design principles and experience architecting robust, scalable, and secure data systems.
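
To make the SQL expectation concrete, here is a minimal sketch of the kind of analytical query we have in mind, run against an in-memory SQLite database so it is self-contained. The `fact_sales`/`dim_date` star schema and all column names are illustrative only, not a schema we prescribe:

```python
# Windowed aggregate over a small star schema, run in-memory with SQLite.
# Table and column names (fact_sales, dim_date) are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);
    CREATE TABLE fact_sales (date_key INTEGER, region TEXT, amount REAL);
    INSERT INTO dim_date VALUES (1, '2024-01'), (2, '2024-02');
    INSERT INTO fact_sales VALUES (1, 'EU', 100.0), (1, 'US', 250.0),
                                  (2, 'EU', 180.0), (2, 'US', 210.0);
""")

# Monthly revenue per region, plus each region's running total via a
# window function over the grouped result.
query = """
    WITH monthly AS (
        SELECT d.month, f.region, SUM(f.amount) AS revenue
        FROM fact_sales AS f
        JOIN dim_date AS d USING (date_key)
        GROUP BY d.month, f.region
    )
    SELECT month, region, revenue,
           SUM(revenue) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM monthly
    ORDER BY region, month;
"""
for row in conn.execute(query):
    print(row)
```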
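For the cloud tooling, a sketch of routine S3 work with boto3 (the AWS SDK for Python). The bucket name, key layout, and file name are hypothetical; credentials are assumed to come from the standard AWS credential chain (environment variables, a profile, or an instance role):

```python
# Day-to-day S3 usage: upload into a data-lake layout, then list a prefix.
# Bucket/key names are illustrative only.
import boto3

s3 = boto3.client("s3")

# Upload a local file as a date-partitioned object.
s3.upload_file(
    Filename="events.parquet",
    Bucket="example-data-lake",
    Key="raw/events/dt=2024-01-01/events.parquet",
)

# List everything under the same prefix.
response = s3.list_objects_v2(Bucket="example-data-lake", Prefix="raw/events/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```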
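For the ETL/ELT work, a minimal PySpark transformation sketch: read raw CSV, filter and aggregate, and write partitioned Parquet. The paths, column names, and app name are illustrative, not a pipeline we run:

```python
# Read raw orders, keep completed ones, aggregate daily revenue per region,
# and write the result as date-partitioned Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-daily-revenue").getOrCreate()

orders = spark.read.csv(
    "s3://example-data-lake/raw/orders/", header=True, inferSchema=True
)

daily_revenue = (
    orders
    .filter(F.col("status") == "completed")           # drop cancelled/pending rows
    .withColumn("order_date", F.to_date("order_ts"))  # derive a partition column
    .groupBy("order_date", "region")
    .agg(
        F.sum("amount").alias("revenue"),
        F.countDistinct("customer_id").alias("customers"),
    )
)

(daily_revenue.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-data-lake/curated/daily_revenue/"))
```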
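And for orchestration, a minimal Airflow DAG sketch wiring an extract-transform-load chain. It assumes Airflow 2.x; the `dag_id`, schedule, and task bodies are stubs for illustration:

```python
# Three stub tasks chained into a daily pipeline.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw data from the source system")


def transform():
    print("run the PySpark job sketched above")


def load():
    print("publish curated tables to the warehouse")


with DAG(
    dag_id="daily_revenue_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # use `schedule_interval` on Airflow < 2.4
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load
```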