

Data Pipeline Development & Operations
• Design, build, and operate scalable and reliable data pipelines on the Databricks platform
• Develop end-to-end data workflows from ingestion through transformation to consumption
• Implement robust error handling, monitoring, and alerting mechanisms
• Ensure data pipeline reliability, performance, and maintainability
• Optimize pipeline performance through efficient Spark job design and cluster configuration
• Manage and orchestrate complex data workflows using Databricks Jobs and Workflows
Legacy Code Modernization
• Refactor legacy code and data pipelines to PySpark for improved performance and scalability
• Migrate traditional ETL processes to modern ELT patterns on Databricks
• Assess existing codebases and identify opportunities for optimization and modernization
• Ensure backward compatibility and data integrity during migration processes
• Document refactoring approaches and create migration playbooks
• Collaborate with stakeholders to minimize disruption during code transitions
Data Engineering Excellence
• Implement data quality checks and validation frameworks
• Design and maintain Delta Lake tables with appropriate optimization strategies
• Develop reusable code libraries and frameworks for common data engineering tasks
• Follow software engineering best practices including version control, testing, and CI/CD
• Participate in code reviews and provide constructive feedback to team members
• Troubleshoot and resolve data pipeline issues in production environments
Collaboration & Knowledge Sharing
• Work closely with data architects, analysts, and business stakeholders
• Collaborate with Infrastructure (Infra), Applications (Apps), and Cyber teams
• Share knowledge and best practices with Team NCS
• Mentor junior data engineers on PySpark and Databricks technologies
• Document technical solutions and maintain comprehensive documentation
Skillset (Must Have)
Essential Technical Skills
• Data Engineering: Strong foundation in data engineering principles, ETL/ELT processes, and data pipeline design patterns
• PySpark: Proven hands-on experience developing data pipelines using PySpark, including the DataFrame API, Spark SQL, and performance optimization
• Databricks Platform: Practical experience with Databricks workspace, cluster management, notebooks, and job orchestration
• Workspace AI Agent: Knowledge of Databricks Workspace AI Agent capabilities and integration
• Data Modelling: Experience implementing data models including dimensional modelling, data vault, or lakehouse architectures
• Delta Lake: Understanding of Delta Lake features including ACID transactions, schema evolution, and optimization techniques
• Python: Strong Python programming skills for data processing and automation
Additional Technical Skills
• SQL proficiency for data querying and transformation
• Experience with cloud platforms (Azure, AWS, or GCP)
• Understanding of data governance and security best practices
• Knowledge of streaming data processing (Structured Streaming)
• Familiarity with DevOps practices and CI/CD pipelines
• Experience with version control systems (Git)
• Understanding of data quality frameworks and testing methodologies
Professional Experience
• Minimum 8 years in data engineering or related roles
• At least 2 to 3 years of hands-on experience with the Databricks platform
• Proven track record of refactoring legacy code to modern frameworks
• Experience building and maintaining production data pipelines at scale
• Background working across multiple data sources and formats
• Experience in agile development environments
Required Certifications (at least one is mandatory)
• Databricks Certified Data Engineer Associate OR Databricks Certified Data Engineer Professional
Additional Certifications (Preferred)
• Databricks Certified Associate Developer for Apache Spark
• Cloud platform certifications (Azure Data Engineer Associate, AWS Certified Data Analytics, or Google Cloud Professional Data Engineer)
• Relevant data engineering or big data certifications
Soft Skills
• Strong problem-solving and analytical thinking abilities
• Excellent communication skills to explain technical concepts clearly
• Ability to work collaboratively in cross-functional teams
• Self-motivated with strong attention to detail
• Adaptable to changing priorities and technologies
• Client-focused mindset with commitment to quality delivery
Founded in 2013, Arient Solutions is an independent, specialized recruiting and staffing firm headquartered in Tirunelveli, Tamil Nadu. We act as your HR partner, providing an array of HR-related services. Our success is built on personalized, long-term relationships with both our clients and candidates, together with an underlying knowledge of the sectors we operate in. We are now a leading human resource employment services company with a proven track record in recruiting candidates for a wide range of industries and job roles.
Job ID: 145442255
Skills:
PySpark, Python, SQL, Apache Spark, Spark SQL, DataFrames API, Databricks, Azure Databricks, Delta Lake, Unity Catalog, Workspace AI Agent, Structured Streaming, Kafka, Azure Event Hubs, Azure Data Factory, Azure Key Vault, Azure DevOps, MLflow, Great Expectations, Parquet, Azure, AWS, GCP, Git, DevOps practices, ETL, ELT, Data Modelling, Dimensional modelling, Data vault, Lakehouse architectures, ACID transactions, Schema evolution, Data pipeline design patterns, Spark performance tuning, Data governance and security best practices, Data quality frameworks and testing methodologies, Data engineering principles