SRE Engineer (Azure)

LEAPTHOUGHT ASIA PTE. LTD.

Singapore, Robinson

3-6 Years

SGD 8,500 - 10,000 per month

Posted 2 months ago

Job Description

Company Overview / Employee Value Proposition

LeapThought is a leading provider of strategic consulting services and business technology solutions focused on driving transformational outcomes. We design and build platforms that enable businesses to be more efficient and deliver frictionless experiences for more customers. Every day, tens of thousands of New Zealanders use services developed by LeapThought to manage essential daily activities, from booking transport and paying power bills to interacting with their local council. Our core philosophy of "do it once" reflects how common problems can be tackled with repeatable frameworks and accelerators. This makes better technology solutions that are more accessible and more cost-effective for everyone. We advocate and follow an inclusive, business-driven approach at all times, connecting the strategic, operational, and technical viewpoints to ensure our ongoing success in delivering results for our customers. While we are agile, we fit in easily with our customers' preferences in terms of methodologies, standards, and tools. We are privately held and based in New Zealand, with offices in Auckland, Singapore, Sydney, and Hyderabad, India.

About the Role

We are seeking an experienced SRE Engineer to ensure the reliability, availability, and performance of our Azure cloud infrastructure. You will drive automation, monitoring, and operational excellence across our platforms.

Key Responsibilities

Maintain and optimize Azure infrastructure to ensure high reliability and performance for business-critical applications
Design and implement monitoring, alerting, dashboards, and SLO/SLA frameworks to proactively manage system health and uptime
Lead incident response efforts, conduct root cause analysis, and implement continuous improvements to prevent recurrence
Develop and maintain automation scripts and infrastructure as code using PowerShell, Python, Terraform, or Bicep to streamline operations
Apply reliability engineering patterns such as autoscaling, redundancy, and failover to enhance system resilience
Collaborate with cross-functional engineering teams to ensure deployment processes and operational readiness meet reliability standards
Participate actively in on-call rotation to provide timely response and resolution for infrastructure issues

Requirements

Demonstrated 3-6 years of hands-on experience in Azure Site Reliability Engineering, Cloud Operations, or Infrastructure roles
Proficiency in and practical experience with Azure Monitor, Log Analytics, Networking, and Azure Kubernetes Service (AKS)
Skilled in infrastructure as code (IaC) tools such as Terraform or Bicep and scripting languages including PowerShell and Python
Proven ability to design and operate reliability-focused cloud infrastructure environments
Willingness and ability to participate in an on-call rotation to support 24/7 operational needs