We are seeking a skilled and experienced Cloud Engineer (SRE - Level 2) to support a secure and scalable cloud infrastructure for a Singapore Government-appointed agency operating on commercial cloud platforms. This Subject Matter Expert (SME) role requires real-world experience in managing multi-service cloud environments using AWS, strong Infrastructure-as-Code (IaC) capabilities, lifecycle and patching operations, and proactive operational support. The ideal candidate will be hands-on, security-aware, automation-driven, and familiar with strict uptime, compliance, and audit requirements.
Key Responsibilities:
Cloud Infrastructure Operations
- Operate and maintain AWS-native services in production, including Lambda, ECS, EKS, FSx, Redshift, Glue, Neptune, SES, GuardDuty, WAF, Shield Advanced, Security Hub, KMS, Secrets Manager, SNS, SQS, EventBridge, SageMaker, and API Gateway.
- Monitor and troubleshoot infrastructure performance, uptime and scalability.
- Support production and staging environments with 24/7 reliability objectives.
Infrastructure as Code (IaC)
- Design and maintain infrastructure deployment pipelines using Terraform, CloudFormation, and Ansible.
- Troubleshoot environment drift and pipeline failures.
- Promote automation in cloud operations.
OS Lifecycle & Patch Management
- Manage patching across RHEL (v8 to v9) and Windows Server (2016 to 2025) using AWS Patch Manager, WSUS, and YUM/DNF.
- Schedule, automate, and track patches.
- Coordinate approvals and ensure compliance.
SSL and EOL Management
- Track SSL certificate renewals across environments.
- Identify and remediate EOL components such as OS versions and Lambda runtimes.
Tool Integration & Interoperability
- Integrate third-party tools such as NGINX, monitoring dashboards, and observability stacks.
- Provide inputs to SRE-managed observability tools for metrics, logs, and alerts.
Documentation & Compliance
- Create and maintain infrastructure runbooks, system documentation, and change tracking logs.
- Adhere to and support government-level security, audit, and compliance protocols.
SME Expectations - Role Behavior
This Subject Matter Expert (SME) role requires:
- Proficiency across AWS services and architecture.
- Experience in uptime-critical and compliant environments.
- Mentorship capabilities for junior engineers.
- Initiative and proactive incident prevention.
- Calm and structured incident handling.
- Adherence to change and incident processes.
- Audit-readiness and documentation discipline.
Technical Skills & Experience
Area: Skills Required
- Cloud: Hands-on with AWS services in production
- IaC: Terraform, CloudFormation, Ansible
- OS Platforms: RHEL & Windows Server administration
- Patching: AWS Patch Manager, WSUS, YUM/DNF
- SSL/EOL: End-to-end management of SSL certificates and lifecycle tracking
- Documentation: Skilled in producing detailed runbooks, tracking sheets, and change logs
Required Qualifications
- Bachelor's degree in Computer Science, Information Systems, or a related field.
- Minimum 10 years' experience in DevOps / SRE roles, with at least 5 years in public sector or regulated cloud environments.
- AWS Certified Solutions Architect - Associate or Professional
- RHCE / Windows Server Certification
- Experience in regulated or compliance-driven environments (GovTech, Healthcare, Banking, etc.)
- Familiarity with ITIL and change management processes