Our Purpose
Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.
Title And Summary
Senior / Lead Data Engineer
We are seeking strong candidates for two roles, Lead Data Engineer and Senior Data Engineer, to join Mastercard Foundry R&D. You will help shape our innovation roadmap by exploring new technologies and building scalable, data-driven prototypes and products. The ideal candidate is hands-on, curious, adaptable, and motivated to experiment and learn.
Lead Data Engineer
What You'll Do
- Drive Data Architecture: Own the data architecture and modeling strategy for AI projects. Define how data is stored, organized, and accessed. Select technologies, design schemas/formats, and ensure systems support scalable AI and analytics workloads.
- Build Scalable Data Pipelines: Lead development of robust ETL/ELT workflows and data models. Build pipelines that move large datasets with high reliability and low latency to support training and inference for AI and generative AI systems (see the orchestration sketch after this list).
- Ensure Data Quality & Governance: Oversee data governance and compliance with internal standards and regulations. Implement data anonymization, quality checks, lineage, and controls for handling sensitive information.
- Provide Technical Leadership: Offer hands-on leadership across data engineering projects. Conduct code reviews, enforce best practices, and promote clean, well-tested code. Introduce improvements in development processes and tooling.
- Cross-Functional Collaboration: Work closely with engineers, scientists, and product stakeholders. Scope work, manage data deliverables in agile sprints, and ensure timely delivery of data components aligned with project milestones.
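To give a flavor of the orchestration work involved, here is a minimal sketch of a daily ETL/ELT pipeline using Airflow's TaskFlow API (assuming Airflow 2.4+). The DAG, bucket paths, and task logic are illustrative assumptions, not Mastercard systems:

```python
# A minimal sketch of a daily ETL/ELT DAG, assuming Airflow 2.4+ with the
# TaskFlow API. All paths and task bodies are illustrative placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_orders_etl():
    @task
    def extract() -> str:
        # In practice: land a day's worth of source records in object storage.
        return "s3://example-bucket/raw/orders/"

    @task
    def transform(raw_path: str) -> str:
        # In practice: clean and conform the data, e.g. by submitting a Spark job.
        return raw_path.replace("/raw/", "/curated/")

    @task
    def load(curated_path: str) -> None:
        # In practice: publish the curated dataset to a lake or warehouse table.
        print(f"loading {curated_path}")

    load(transform(extract()))


daily_orders_etl()
```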
What You'll Bring
- Extensive Data Engineering Experience: 8-12+ years in data engineering or backend engineering, including senior/lead roles. Experience designing end-to-end data systems, solving scale/performance challenges, integrating diverse sources, and operating pipelines in production.
- Big Data & Cloud Expertise: Strong skills in Python and/or Java/Scala. Deep experience with Spark, Hadoop, Hive/Impala, and Airflow. Hands-on work with AWS, Azure, or GCP using cloud-native processing and storage services (e.g., S3, Glue, EMR, Data Factory). Ability to design scalable, cost-efficient workloads for experimental and variable R&D environments.
- AI/ML Data Lifecycle Knowledge: Understanding of data needs for machine learning: dataset preparation, feature/label management, and supporting real-time or batch training pipelines. Experience with feature stores or streaming data is useful.
- Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
- Problem-Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
- Education & Background: Bachelor's degree in Computer Science, Engineering, or related field. 8-12+ years of proven experience architecting and operating production-grade data systems, especially those supporting analytics or ML workloads.
- Pipeline Development: Expert in ETL/ELT design and implementation, working with diverse data sources, transformations, and targets. Strong experience scheduling and orchestrating pipelines using Airflow or similar tools.
- Programming & Databases: Advanced Python and/or Scala/Java skills and strong software engineering fundamentals (version control, CI, code reviews). Excellent SQL abilities, including performance tuning on large datasets.
- Big Data Technologies: Hands-on Spark experience (RDDs/DataFrames, optimization). Familiar with Hadoop components (HDFS, YARN), Hive/Impala, and streaming systems like Kafka or Kinesis (see the Spark sketch after this list).
- Cloud Infrastructure: Experience deploying data systems on AWS/Azure/GCP. Familiar with cloud data lakes, warehouses (Redshift, BigQuery, Snowflake), and cloud-based processing engines (EMR, Dataproc, Glue, Synapse). Comfortable with Linux and shell scripting.
- Data Governance & Security: Knowledge of data privacy regulations, PII handling, access controls, encryption/masking, and data quality validation. Experience with metadata management or data cataloging tools is a plus.
- Collaboration & Agile Delivery: Strong communication skills and experience working with cross-functional teams. Ability to document designs clearly and deliver iteratively using agile practices.
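To illustrate the kind of Spark work referenced above, here is a minimal PySpark sketch of a joined aggregation with a broadcast hint. The tables, columns, and paths are illustrative assumptions:

```python
# A minimal PySpark sketch: join a large fact table to a small dimension and
# aggregate. All table paths and column names are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-merchant-totals").getOrCreate()

orders = spark.read.parquet("s3://example-bucket/curated/orders/")    # large
merchants = spark.read.parquet("s3://example-bucket/dim/merchants/")  # small

daily_totals = (
    orders
    # Broadcasting the small dimension avoids shuffling the large fact table.
    .join(F.broadcast(merchants), "merchant_id")
    .groupBy("merchant_id", F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("total_amount"))
)

# Partitioning the output by date enables partition pruning for downstream reads.
daily_totals.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/marts/daily_merchant_totals/"
)
```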
Preferred Skills
- Advanced Cloud & Data Platform Expertise: Experience with AWS data engineering services, Databricks, and Lakehouse/Delta Lake architectures (including bronze/silver/gold layers).
- Modern Data Stack: Familiarity with dbt, Great Expectations, containerization (Docker/Kubernetes), and monitoring tools like Grafana or cloud-native monitoring.
- DevOps & CI/CD for Data: Experience implementing CI/CD pipelines for data workflows and using IaC tools like Terraform or CloudFormation. Knowledge of data versioning (e.g., Delta Lake time travel; see the sketch after this list) and supporting continuous delivery for ML systems.
- Continuous Learning: Motivation to explore emerging technologies, especially in AI and generative AI data workflows.
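For context on the data-versioning point above, here is a minimal sketch of Delta Lake time travel with PySpark, assuming a Spark session with the delta-spark package configured; the path and version number are illustrative:

```python
# A minimal sketch of Delta Lake time travel, assuming a Spark session with the
# delta-spark package configured. Path and version number are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-time-travel").getOrCreate()

# Read the table as it existed at an earlier version, e.g. to reproduce the
# exact training dataset behind a past model run.
snapshot = (
    spark.read.format("delta")
    .option("versionAsOf", 42)
    .load("s3://example-bucket/silver/transactions/")
)
snapshot.show()
```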
Senior Data Engineer
What You'll Do
- Drive Data Architecture: Own the data architecture and modeling strategy for AI projects. Define how data is stored, organized, and accessed. Select technologies, design schemas/formats, and ensure systems support scalable AI and analytics workloads.
- Build Scalable Data Pipelines: Lead development of robust ETL/ELT workflows and data models. Build pipelines that move large datasets with high reliability and low latency to support training and inference for AI and generative AI systems.
- Ensure Data Quality & Governance: Oversee data governance and compliance with internal standards and regulations. Implement data anonymization, quality checks, lineage, and controls for handling sensitive information (see the masking sketch after this list).
- Provide Technical Leadership: Offer hands-on leadership across data engineering projects. Conduct code reviews, enforce best practices, and promote clean, well-tested code. Introduce improvements in development processes and tooling.
- Cross-Functional Collaboration: Work closely with engineers, scientists, and product stakeholders. Scope work, manage data deliverables in agile sprints, and ensure timely delivery of data components aligned with project milestones.
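As one concrete flavor of the anonymization work above, here is a minimal PySpark sketch of column-level pseudonymization via salted hashing. The function, salt handling, and column names are illustrative assumptions, not a prescribed Mastercard control:

```python
# A minimal sketch of pseudonymizing PII columns with salted SHA-256 hashes.
# Salt handling and column names are illustrative; real controls would use
# managed secrets and a vetted tokenization/masking approach.
from pyspark.sql import DataFrame, functions as F


def mask_pii(df: DataFrame, pii_columns: list[str], salt: str) -> DataFrame:
    """Replace direct string identifiers with salted SHA-256 digests."""
    for col in pii_columns:
        df = df.withColumn(col, F.sha2(F.concat(F.lit(salt), F.col(col)), 256))
    return df


# Example usage (hypothetical): masked = mask_pii(raw_df, ["email"], salt)
```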
What You'll Bring
- Data Engineering Experience: Experience in data engineering or backend engineering. Prior experience designing end-to-end data systems, solving scale/performance challenges, integrating diverse sources, and operating pipelines in production would be a plus.
- Big Data & Cloud Expertise: Strong skills in Python and/or Java/Scala. Deep experience with Spark, Hadoop, Hive/Impala, and Airflow. Hands-on work with AWS, Azure, or GCP using cloud-native processing and storage services (e.g., S3, Glue, EMR, Data Factory). Ability to design scalable, cost-efficient workloads for experimental and variable R&D environments.
- AI/ML Data Lifecycle Knowledge: Understanding of data needs for machine learning: dataset preparation, feature/label management, and supporting real-time or batch training pipelines. Experience with feature stores or streaming data is useful.
- Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
- Problem-Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
- Education & Background: Bachelor's degree in Computer Science, Engineering, or related field. 5+ years of proven experience architecting and operating production-grade data systems, especially those supporting analytics or ML workloads.
- Pipeline Development: Expert in ETL/ELT design and implementation, working with diverse data sources, transformations, and targets. Strong experience scheduling and orchestrating pipelines using Airflow or similar tools.
- Programming & Databases: Advanced Python and/or Scala/Java skills and strong software engineering fundamentals (version control, CI, code reviews). Excellent SQL abilities, including performance tuning on large datasets.
- Big Data Technologies: Hands-on Spark experience (RDDs/DataFrames, optimization). Familiar with Hadoop components (HDFS, YARN), Hive/Impala, and streaming systems like Kafka or Kinesis.
- Cloud Infrastructure: Experience deploying data systems on AWS/Azure/GCP. Familiar with cloud data lakes, warehouses (Redshift, BigQuery, Snowflake), and cloud-based processing engines (EMR, Dataproc, Glue, Synapse). Comfortable with Linux and shell scripting.
- Data Governance & Security: Knowledge of data privacy regulations, PII handling, access controls, encryption/masking, and data quality validation (see the validation sketch after this list). Experience with metadata management or data cataloging tools is a plus.
- Collaboration & Agile Delivery: Strong communication skills and experience working with cross-functional teams. Ability to document designs clearly and deliver iteratively using agile practices.
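To make the data-quality point concrete, here is a minimal hand-rolled validation sketch in PySpark. In practice a framework such as Great Expectations would formalize these checks; the column names are illustrative:

```python
# A minimal hand-rolled data quality gate in PySpark. Column names are
# illustrative; a framework like Great Expectations would formalize this.
from pyspark.sql import DataFrame, functions as F


def validate_transactions(df: DataFrame) -> None:
    """Fail the pipeline run if basic quality expectations are violated."""
    total = df.count()
    null_ids = df.filter(F.col("transaction_id").isNull()).count()
    negative = df.filter(F.col("amount") < 0).count()

    assert total > 0, "dataset is empty"
    assert null_ids == 0, f"{null_ids} rows missing transaction_id"
    assert negative == 0, f"{negative} rows with negative amount"
```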
Preferred Skills
- Advanced Cloud & Data Platform Expertise: Experience with AWS data engineering services, Databricks, and Lakehouse/Delta Lake architectures (including bronze/silver/gold layers).
- Modern Data Stack: Familiarity with dbt, Great Expectations, containerization (Docker/Kubernetes), and monitoring tools like Grafana or cloud-native monitoring.
- DevOps & CI/CD for Data: Experience implementing CI/CD pipelines for data workflows and using IaC tools like Terraform or CloudFormation. Knowledge of data versioning (e.g., Delta Lake time travel) and supporting continuous delivery for ML systems (see the test sketch after this list).
- Continuous Learning: Motivation to explore emerging technologies, especially in AI and generative AI data workflows.
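As a small illustration of CI/CD for data workflows, here is a minimal pytest-style unit test for a transformation, of the kind a CI pipeline would run on each commit. The transform and its schema are illustrative assumptions:

```python
# A minimal pytest-style test for a pipeline transformation, runnable in CI.
# The transform and its schema are illustrative placeholders.
from pyspark.sql import DataFrame, SparkSession, functions as F


def normalize_amounts(df: DataFrame) -> DataFrame:
    """Transform under test: convert integer cents to decimal currency units."""
    return df.withColumn("amount", F.col("amount_cents") / 100)


def test_normalize_amounts():
    spark = SparkSession.builder.master("local[1]").appName("ci-test").getOrCreate()
    df = spark.createDataFrame([(1, 250)], ["id", "amount_cents"])
    result = normalize_amounts(df).collect()
    assert result[0]["amount"] == 2.5
```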
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. It is therefore expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
- Abide by Mastercard's security policies and practices;
- Ensure the confidentiality and integrity of the information being accessed;
- Report any suspected information security violation or breach; and
- Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.