Role Overview
We are seeking an experienced Senior Data Architect / Databricks Architect to lead the design and implementation of scalable lakehouse-based data architectures using the Databricks platform. The role focuses on delivering enterprise-grade data solutions, implementing Unity Catalog governance, and enabling end-to-end data lifecycle management across ingestion, processing, storage, and analytics layers.
The ideal candidate will have strong expertise in Databricks, Apache Spark, Delta Lake, and cloud data platforms, along with the ability to collaborate with project teams to design high-performance, secure, and scalable data ecosystems.
Key Responsibilities
End-to-End Data Architecture
- Collaborate with Databricks Professional Services and project stakeholders to design comprehensive end-to-end data architectures on the Databricks platform.
- Define scalable data ingestion strategies integrating structured and unstructured data from multiple source systems.
- Architect lakehouse storage solutions using Delta Lake and modern data platform best practices.
- Develop robust data processing frameworks leveraging Apache Spark and Databricks workflows.
- Design data consumption layers that support analytics, reporting, AI/ML, and operational workloads.
- Ensure seamless data movement and lifecycle management across ingestion, transformation, storage, and consumption layers.
Governance, Security & Compliance
- Implement data governance frameworks leveraging Unity Catalog for centralized governance.
- Configure the metastore, catalog, and schema structures, and implement access control policies.
- Design and enforce data security, role-based access control, and data protection strategies.
- Ensure compliance with regulatory requirements and enterprise data governance standards.
- Implement data lineage, monitoring, audit logging, and observability for the data platform.
- Optimize system performance through cluster configuration, workload management, and query tuning.
- Define and implement data quality frameworks and validation processes.
Data Modeling & Design
- Design business-aligned data models supporting enterprise analytics and operational use cases.
- Implement dimensional modeling, normalized models, and Data Vault architectures.
- Design optimized Delta table structures to improve scalability and query performance.
- Implement medallion architecture (Bronze, Silver, Gold layers) for structured data refinement.
- Develop data schemas that support both BI analytics and machine learning workloads.
- Maintain data dictionaries, metadata documentation, and model specifications.
Technical Leadership & Collaboration
- Lead technical workshops with project stakeholders and cross-functional teams to gather and refine requirements.
- Provide architectural guidance and best practices for Databricks-based data engineering teams.
- Collaborate with Infrastructure, Applications, and Cybersecurity teams for integrated enterprise solutions.
- Mentor data engineers, architects, and platform specialists on modern lakehouse architectures.
- Present architecture strategies, solution designs, and technical recommendations to leadership and stakeholders.
Solution Implementation
- Lead implementation of Databricks-based solutions from architecture design to production deployment.
- Oversee proof-of-concept (POC) initiatives and pilot programs to validate technical feasibility.
- Ensure solutions meet scalability, reliability, security, and performance requirements.
- Conduct architecture reviews and governance checkpoints aligned with enterprise standards.
Required Technical Skills
Databricks & Data Platform
- Strong hands-on experience with the Databricks platform, including:
  - Workspace administration
  - Cluster configuration and optimization
  - Workflow orchestration
Unity Catalog
- Experience implementing Unity Catalog for unified data governance, including:
  - Metastore configuration
  - Catalog and schema design
  - Access control and policy management
Data Engineering & Architecture
- Expertise in data modeling approaches, including:
  - Dimensional modeling
  - Data Vault
  - Lakehouse architecture
- Deep knowledge of Delta Lake features, including:
  - ACID transactions
  - Time travel
  - Performance optimization techniques
- Strong proficiency in Apache Spark (Spark SQL, DataFrames, performance tuning).
Programming
- Strong coding experience in:
  - Python
  - SQL
  - Scala
Cloud Platforms
- Hands-on experience with at least one major cloud platform:
  - Microsoft Azure
  - Amazon Web Services (AWS)
  - Google Cloud Platform (GCP)
Additional Technical Skills
- Data pipeline development and ETL/ELT architecture
- Metadata management and data governance frameworks
- CI/CD implementation for data platforms
- Data quality monitoring and validation frameworks
- Performance optimization and troubleshooting
- Knowledge of data security, compliance, and regulatory standards
Professional Experience
- 8-10+ years of experience in data architecture, data engineering, or advanced analytics roles
- 3-5+ years of hands-on Databricks platform experience
- Proven experience implementing Unity Catalog in enterprise-scale environments
- Demonstrated success designing large-scale enterprise data models and lakehouse architectures
- Experience working with Databricks Professional Services or partner ecosystems is highly desirable
- Experience across multiple industries such as Public Sector, Financial Services, Healthcare, or Retail is advantageous
Preferred Certifications
- Databricks Certified Associate Developer for Apache Spark
- Databricks Certified Data Engineer Professional
- Cloud certifications such as:
  - Azure Data Engineer Associate
  - AWS Data Analytics Specialty
  - Google Professional Data Engineer
- Other relevant data management or analytics certifications