
About the Role
This role sits within the Audience Behaviour team, which is responsible for collecting and
aggregating clickstream data to help the organization better understand user engagement across our digital platforms. We process and manage between 1.6 and 2 billion events per month, capturing what users read, click on, and otherwise interact with on our websites.
We're looking for a Data Engineer who is not only technically strong but also motivated by real-world impact: someone who is curious about how users behave and is eager to build data systems that help product and business teams make better decisions. You will play a key role in maintaining and evolving our real-time data infrastructure and in delivering reliable, actionable data to a wide range of stakeholders.
Key Responsibilities
Design, develop, and maintain real-time and batch data pipelines using Apache Flink, Kafka, and related technologies.
Manage data storage and querying infrastructure using Apache Paimon, Apache Iceberg, Amazon S3, and Athena.
Support event-driven architectures using AWS EventBridge for real-time event transmission across systems.
Ensure the scalability, reliability, and performance of our data infrastructure.
Collaborate with engineers, analysts, and stakeholders to understand data requirements and deliver robust solutions.
Implement and promote engineering best practices, including code reviews, testing, and documentation.
Requirements
Proficiency in programming languages such as Python or Java/Scala.
Hands-on experience with real-time streaming frameworks (e.g., Kafka, Flink).
Familiarity with cloud data lake architecture and tools (e.g., Iceberg, Paimon, S3, Athena).
Experience working with AWS services, particularly EventBridge and Lambda.
Solid understanding of data modeling, schema evolution, and data governance.
Comfortable working in production environments with large-scale data.
Preferred Qualifications
Experience building and operating event-driven systems at scale.
Familiarity with distributed systems and cloud-native data infrastructure.
Exposure to CI/CD pipelines and automated deployment workflows.
Experience working in Agile teams and collaborating with cross-functional stakeholders.
Good to Have
Experience with infrastructure-as-code tools such as Terraform.
Familiarity with observability practices and tooling (e.g., metrics, logs, alerts).
Knowledge of containerization and orchestration tools such as Docker and Kubernetes.
Understanding of data privacy, access controls, and security best practices in data systems.
Job ID: 137008273