Key Responsibilities:
Data Pipeline & Architecture
- Design, build, and optimize scalable, reliable data pipelines (batch and streaming) and ETL/ELT workflows using SQL, Python, and big data technologies.
- Lead data architecture discussions, including the design and review of ERDs, data models, and system design.
- Build and maintain transactional and analytical schemas for data lakes, warehouses, and marts.
Cloud & Infrastructure
- Implement and manage cloud-based data platforms (AWS preferred; Azure/GCP experience a plus), ensuring scalability, reliability, cost efficiency, and security.
- Deploy and manage infrastructure using Terraform and other IaC tools.
- Develop and maintain CI/CD pipelines (e.g., GitHub Actions) for deploying data applications and services.
- Apply best practices for cloud infrastructure, including cost management, security, redundancy, and performance optimization.
Data Quality, Governance & Security
- Drive data quality, governance, and compliance across product and business areas.
- Implement data encryption, hashing, and privacy protection mechanisms.
- Adhere to Master Data Management (MDM) principles and enterprise data governance policies.
Analytics & Business Impact
- Partner with product, engineering, data science, and business teams to deliver business outcomes through data-driven products.
- Deliver audience and behavioral analytics, KPIs, dashboards, and reporting.
- Propose and implement solutions for BI dashboards and broader enterprise data needs.
- Support the democratization of data across the organization.
Leadership & Collaboration
- Lead a team of data engineers, taking ownership of technical decisions and delivery.
- Mentor and guide engineers in best practices, architecture, and performance optimization.
- Manage stakeholder relationships, roadmaps, and expectations.
- Work with diverse stakeholders across domains including sales, marketing, advertising, engineering, and publishing.
- Follow Agile methodologies (Scrum) for project delivery.
Required Skills & Experience:
- Strong proficiency in SQL for data modeling, querying, and transformation.
- Advanced Python development skills for data engineering use cases.
- Proven experience in AWS services (S3, Glue, Lambda, RDS, Lake Formation, Athena, Kinesis, EMR, Step Functions).
- Strong expertise in Terraform for infrastructure provisioning.
- Proficiency in CI/CD tools (e.g., GitHub Actions) and Git branching strategies.
- Hands-on experience with big data technologies such as Spark, Hive, Kafka, Hudi, or Iceberg.
- Ability to design BI-ready data models (dimensional modeling) and implement BI frameworks.
- Solid understanding of data governance, data quality, and security frameworks.
- Strong communication skills to explain complex data and analytics concepts to stakeholders.
- Leadership experience, including mentoring engineers and managing project delivery.
Preferred Qualifications:
- 7+ years of relevant hands-on experience in data engineering, solution architecture, or analytics roles.
- 5+ years of team leadership or people management experience.
- Experience with containerization and orchestration (Docker, Kubernetes).
- Experience with BI tools (Tableau, QuickSight, etc.) and query engines (Presto, Trino).
- Familiarity with Agile methodologies and enterprise-scale data platform implementations.