We are seeking a skilled and experienced Cloud Engineer (SRE - Level 2) to support a secure and scalable cloud infrastructure for a Singapore Government-appointed agency operating on commercial cloud platforms. This Subject Matter Expert (SME) role requires real-world experience in managing multi-service cloud environments using AWS, strong Infrastructure-as-Code (IaC) capabilities, lifecycle and patching operations, and proactive operational support. The ideal candidate will be hands-on, security-aware, automation-driven, and familiar with strict uptime, compliance, and audit requirements.
Key Responsibilities:
Cloud Infrastructure Operations
- Operate and maintain AWS-native services in production, including Lambda, ECS, EKS, FSx, Redshift, Glue, Neptune, SES, GuardDuty, WAF, Shield Advanced, Security Hub, KMS, Secrets Manager, SNS, SQS, EventBridge, SageMaker, and API Gateway.
- Monitor and troubleshoot infrastructure performance, uptime, and scalability.
- Support production and staging environments with 24/7 reliability objectives.
Infrastructure as Code (IaC)
- Design and maintain infrastructure deployment pipelines using Terraform, CloudFormation, and Ansible.
- Troubleshoot environment drift and pipeline failures.
- Promote automation in cloud operations.
OS Lifecycle & Patch Management
- Manage patching across RHEL (v8 to v9) and Windows Server (2016 to 2025) using AWS Systems Manager Patch Manager, WSUS, and YUM/DNF.
- Schedule, automate, and track patches.
- Coordinate approvals and ensure compliance.
SSL and EOL Management
- Track SSL certificate renewals across environments.
- Identify and remediate end-of-life (EOL) components such as OS versions and Lambda runtimes.
Tool Integration & Interoperability
- Integrate third-party tools such as NGINX, monitoring dashboards, and observability stacks.
- Provide inputs to SRE-managed observability tools for metrics, logs, and alerts.
Documentation & Compliance
- Create and maintain infrastructure runbooks, system documentation, and change tracking logs.
- Adhere to and support government-level security, audit, and compliance protocols.