SRE Manager

Apple

Singapore

10-12 Years

Job Description

Summary

We're looking for a Site Reliability Engineering (SRE) Manager with strong architectural experience to join the JMET SRE Team. You'll play a key role in leading SRE teams and in designing and scaling reliable, secure, and high-performance infrastructure across our cloud and hybrid environments. You'll be responsible for establishing reliability patterns, driving large-scale systems design, and building automation frameworks to support production systems at scale.

Description

This is a hands-on leadership role with architectural ownership, strategic influence, and deep technical impact across multiple domains, including application and infrastructure security, incident response engineering, and resilience automation.

Responsibilities

Architect Scalable Infrastructure: Design, evolve, and review highly reliable, performant, and cost-efficient cloud-native and hybrid infrastructure using Infrastructure as Code (IaC), containers, and microservices principles.
Support Cryptographic Systems at Scale: Design and operationalize scalable, secure integrations with Hardware Security Modules (HSMs) for sensitive workloads, key management, and cryptographic operations.
Drive SRE Best Practices: Define and implement service-level indicators (SLIs), objectives (SLOs), and agreements (SLAs) to guide engineering teams toward reliability and observability goals.
Incident Architecture and Prevention: Serve as a technical lead during major incidents. Partner with security and platform teams to conduct thorough post-incident reviews, drive systemic improvements, and establish preventive architectural controls.
System Design and Tooling: Build and maintain reusable tooling, automation frameworks, and reliability platforms — including observability, alerting, chaos testing, auto-scaling, and failover.
Reliability as Code: Champion resilience engineering through automation pipelines, CI/CD integrations, canary releases, and chaos engineering principles.
Multi-Cloud and Hybrid Systems: Design, assess, and guide architecture decisions across AWS, GCP, AliCloud, and on-premises infrastructure. Ensure consistency, interoperability, and regulatory compliance.
Security and Compliance: Ensure architectural patterns align with security standards, compliance requirements, and audit readiness.

Minimum Qualifications

10 or more years of experience in SRE, DevOps, or Infrastructure Engineering roles, with 2 or more years in a managerial capacity.
Deep expertise in cloud infrastructure (AWS, GCP, or AliCloud) and container orchestration (Kubernetes, EKS).
Proven experience with Infrastructure as Code tools such as Terraform and CloudFormation.
Strong understanding of distributed systems, networking, and systems design at scale.
Proficiency in at least one programming or scripting language, such as Python, Go, or Bash.

Preferred Qualifications

Solid background in CI/CD tools and modern deployment strategies, for example Spinnaker and GitOps.
Familiarity with security best practices in cloud and containerized environments.
Experience with HSMs and cryptographic operations at scale is a plus.

Apple is an equal opportunity employer that is committed to inclusion and diversity. Apple provides reasonable accommodations to applicants with disabilities and in accordance with local requirements. Apple is a drug-free workplace.