Mid-Level QA / Testing Automation Engineer
Salary: Up to 7,000 SGD
Contract: 1 year
Job Summary
We are seeking a skilled QA / Testing Automation Engineer with strong experience in Big Data and cloud-based data platforms to ensure the accuracy, integrity, and reliability of large-scale data pipelines. This role focuses on validating complex data transformations within Lakehouse architectures, particularly in environments leveraging Hadoop, Databricks, and distributed systems. The ideal candidate will bring a data-first testing mindset, with proven expertise in automating data validation using Python frameworks and advanced SQL techniques across high-volume datasets.
You will be responsible for designing and implementing robust test strategies, replacing manual validation with automated testing integrated into CI/CD pipelines, and ensuring data quality across end-to-end workflows. Working closely with Data Engineers, you will perform defect analysis, regression testing, and performance validation to maintain data consistency and support scalable, high-performing data platforms. This role is ideal for candidates with strong analytical skills, hands-on experience in PySpark and data testing frameworks, and a solid understanding of distributed data processing and governance practices.
Job Responsibilities
- Develop Test Strategy: Create a comprehensive test plan for the Lakehouse, focusing on Data Integrity, Accuracy, and Consistency.
- Automate Validation: Replace manual spot-checking with automated Python test suites that run as part of the CI/CD pipeline.
- Defect Analysis: Identify and document data anomalies, working closely with Data Engineers to perform root-cause analysis on Spark job failures.
- Regression Testing: Ensure that new PySpark code deployments do not impact existing Gold layer business logic or dashboard reporting.
Job Requirements
- Total QA/Testing Experience: 5+ years.
- Data Testing Experience: 3+ years specifically in Big Data, Hadoop, or Cloud Data Warehouse environments.
- Databricks Experience (good to have): 1+ years of experience testing pipelines within a Databricks environment.
- Automation Focus: Proven track record of moving from manual SQL checks to automated Python-based testing frameworks.
- Migration Testing: Experience automating data migration tests using Python.
Certifications
- Good to have: Databricks Certified Data Engineer Associate (at minimum).
- Preferred: ISTQB Foundation or Advanced Level (Test Automation Engineer).
Core Technical Skills
- Data Validation & Frameworks
  - Great Expectations / Pandera: Proficiency in using Python-based libraries to define data contracts and automated validation suites.
  - DLT Expectations: Deep understanding of Delta Live Tables (DLT) expectations and how they fail, drop, or quarantine bad records.
  - Advanced SQL: Expert-level SQL for complex data reconciliation, duplicate detection, and null-value analysis across billions of records.
- Python for QA (PySpark)
  - Pytest-Spark: Experience using pytest to write unit tests for PySpark transformations and logic.
  - Notebook Testing: Ability to write automated test notebooks that validate Medallion Architecture transitions (Bronze to Silver, Silver to Gold).
  - Data Reconciliation: Ability to build Python scripts that compare source-to-target row counts and checksums across distributed file systems.
- Performance & Integration Testing
  - Scalability Testing: Ability to validate that data pipelines meet performance SLAs when data volume spikes.
  - End-to-End Orchestration Testing: Experience testing the reliability of Databricks Workflows, including the handling of job failures and retries.
  - Schema Evolution: Experience testing how pipelines handle upstream schema changes without breaking downstream Gold tables.
- Governance & Security Testing
  - Unity Catalog Validation: Experience testing Row-Level Security (RLS) and Column-Level Masking to ensure unauthorized users cannot see sensitive data.
  - Data Lineage: Experience validating that data lineage in Unity Catalog correctly reflects the movement of data across the Lakehouse.
About CLPS RiDiK
RiDiK is a global technology solutions provider and a subsidiary of CLPS Incorporation (NASDAQ: CLPS), delivering cutting-edge end-to-end services across banking, wealth management, and e-commerce. With deep expertise in AI, cloud, big data, and blockchain, we support clients across Asia, North America, and the Middle East in driving digital transformation and achieving sustainable growth. Operating from regional hubs in 10 countries and backed by a global delivery network, we combine local insight with technical excellence to deliver real, measurable impact. Join RiDiK and be part of an innovative, fast-growing team shaping the future of technology across industries.
We will review applications on a rolling basis until 30 April 2026, and early submissions are encouraged. Please note that only shortlisted candidates will be contacted. Thank you for your understanding.