
Job Description
- Data Engineering: Strong foundation in data engineering principles, ETL/ELT processes, and data pipeline design patterns
- PySpark: Proven hands-on experience developing data pipelines using PySpark, including the DataFrames API, Spark SQL, and performance optimization
- Databricks Platform: Practical experience with the Databricks workspace, cluster management, notebooks, and job orchestration
- Workspace AI Agent: Knowledge of Databricks Workspace AI Agent capabilities and integration
- Data Modelling: Experience implementing data models, including dimensional modelling, data vault, or Lakehouse architectures
- Delta Lake: Understanding of Delta Lake features, including ACID transactions, schema evolution, and optimization techniques
- Python: Strong Python programming skills for data processing and automation
- SQL: Proficiency in SQL for data querying and transformation
- Cloud: Experience with cloud platforms (Azure, AWS, or GCP)
- Streaming: Knowledge of streaming data processing (Structured Streaming)
- DevOps: Familiarity with DevOps practices and CI/CD pipelines
- Version Control: Experience with version control systems (Git)
Requirements
- Must have experience in data engineering or related roles
- Hands-on experience with the Databricks platform
- Proven track record of refactoring legacy code to modern frameworks
- Experience building and maintaining production data pipelines at scale
- Background working across multiple data sources and formats
- Experience in agile development environments
- Databricks Certified Data Engineer Associate or Databricks Certified Data Engineer Professional
Additional Certifications (Preferred)
- Databricks Certified Associate Developer for Apache Spark
- Cloud platform certifications (Azure Data Engineer Associate, AWS Certified Data Analytics, or Google Cloud Professional Data Engineer)
- Relevant data engineering or big data certifications
Job ID: 144616583