Design and implement scalable data pipelines using GCP services (BigQuery, Cloud Composer, etc.)
Build and maintain the foundational data platform and infrastructure to support data ingestion, analytics, reporting, and machine learning use cases
Collaborate with data analysts, scientists, and business stakeholders to understand data needs and deliver high-quality solutions
Ensure data quality, reliability, and governance through robust monitoring, testing, and documentation
Optimize performance and cost-efficiency of data workflows and storage
Contribute to architecture decisions and help define best practices for data engineering within the organization
Job Requirements:
Hands-on data engineering experience, including building production-grade data pipelines
Proficiency in GCP and its data ecosystem (BigQuery, Dataflow, Cloud Storage, etc.) is preferred; experience with other public clouds such as AWS or Azure is also acceptable.
Strong programming skills in Python and SQL, and experience with orchestration tools such as Airflow or Cloud Composer
Experience with batch data processing and building data workflows that integrate multiple data sources
Familiarity with CI/CD practices, infrastructure as code (Terraform), and version control (Git)
A mindset for automation, scalability, and reliability
A hungry, self-motivated attitude and a readiness to take ownership
Excellent communication skills and a collaborative attitude
Experience with data modeling and building data marts or warehouses
Exposure to machine learning pipelines or MLOps
Knowledge of data governance frameworks and security best practices
Experience with reporting or visualization tools such as Qlik (preferred) or Tableau