Roles & Responsibilities
We are looking for a Lead Big Data Engineer to spearhead the development of scalable and secure big data solutions across our organization. This is a strategic leadership role that combines deep technical expertise with a passion for mentoring, architecture, and data governance. As a key member of our data team, you will drive innovation, ensure best practices, and help shape the future of our data infrastructure.
Key Responsibilities:
- Design, build, and optimize data pipelines and ETL/ELT workflows using SQL and Python.
- Lead architecture and design reviews, including the development of Entity Relationship Diagrams (ERDs) and system blueprints.
- Work closely with analysts, data scientists, and software engineers to translate business needs into scalable data solutions.
- Provision and manage cloud infrastructure using Terraform and other Infrastructure-as-Code (IaC) tools.
- Design and maintain CI/CD pipelines, primarily with GitHub Actions, for reliable deployment of data applications.
- Utilize AWS services such as S3, Glue, Lambda, RDS, and Lake Formation to build robust cloud-native data platforms.
- Build efficient, scalable batch and real-time data ingestion and transformation processes.
- Support strategic initiatives around data privacy, quality, and security; implement data encryption, masking, and hashing techniques.
- Enforce modern software development practices including version control, testing, and deployment standards.
- Optimize system performance, and proactively identify and resolve production data issues.
- Guide and mentor junior data engineers and support team knowledge-sharing efforts.
Required Skills & Experience:
- Strong command of SQL for complex querying and data modeling.
- Advanced Python programming skills focused on data engineering applications.
- Proven experience with Terraform for cloud infrastructure deployment.
- Familiarity with CI/CD tools, especially GitHub Actions.
- Expert-level knowledge of AWS cloud services and architecture.
- Experience developing and interpreting ERDs and participating in architectural decision-making.
- Effective communication and leadership skills, with prior experience managing or mentoring engineers.
Preferred Qualifications:
- Hands-on experience with big data tools such as Apache Spark, Hive, or Kafka.
- Familiarity with Docker and Kubernetes for containerization and orchestration.
- Knowledge of data governance frameworks and compliance standards (e.g., GDPR, HIPAA).
- Prior experience implementing enterprise-scale data quality and security initiatives.