We are hiring a Site Reliability Engineer (SRE) to manage, support, and enhance enterprise data platforms. This role focuses on platform reliability, automation, and integration, ensuring scalability, stability, and compliance in a dynamic and fast-paced environment.
The Position
- Design and implement automation frameworks to streamline operational tasks for data platforms (e.g., provisioning, configuration, monitoring, and incident remediation).
- Collaborate with Data Platform Engineers, data product teams, and business stakeholders to ensure reliability and performance of data platforms.
- Develop and maintain Infrastructure-as-Code (IaC) solutions for deploying and managing data platform components across environments.
- Establish robust monitoring, alerting, and observability systems to proactively detect and resolve issues.
- Drive incident management processes, including root cause analysis and post-mortem reviews, to improve platform stability.
- Ensure compliance with IT security and regulatory standards in all automation and operational workflows.
- Partner with vendors and internal teams to integrate automation tools with existing data platforms and the wider organizational ecosystem.
- Advocate for and implement best practices in CI/CD pipelines for data platform services.
- Continuously identify opportunities for operational improvements and reliability enhancements through automation.
The Candidate
- At least 5 years of experience working as a site reliability engineer, data engineer, or software engineer.
- Strong practical expertise in leveraging Ansible and Python for automation across enterprise data platforms.
- Hands-on experience with, and working knowledge of, application implementation when rolling out new platforms.
- Familiarity with application integration using relational databases (RDBMS).
- Exposure to and knowledge of the following technologies will be advantageous:
  - Snowflake, Oracle, MS SQL, Denodo
  - AWS services
  - Linux (or Unix-like OS)
- Data platform experience (Informatica, Tableau, Power BI) will be advantageous but not mandatory; training will be provided.
- Experience with Systems Development Life Cycle (SDLC) implementation methodology and/or Agile methodologies such as Scrum and Kanban.
- Good understanding of, and ability to apply, industry good practice in code versioning, testing, CI/CD workflows, and code documentation.
- Good team player with strong analytical skills who enjoys solving complex problems with innovative ideas.
- Good communication and interpersonal skills, with the ability to work with data analysts, business end users, and vendors to design and develop solutions.
- Detail-oriented and meticulous in operational work.
Preferred Qualifications
- Bachelor's degree in computer science or a related field, or equivalent practical experience.
We regret to inform you that only shortlisted candidates will be notified.
EA Reg No: R25145981, Low Ciao Ling
Allegis Group Singapore Pte Ltd, Company Reg No. 200909448N, EA Licence No. 10C4544