Senior Cloud Operations Engineer

CYBERBOT PTE. LTD.

Singapore, Marina

4-6 Years

SGD 7,000 - 10,000 per month

Job Description

The Role:

The Senior Cloud Operations Engineer is responsible for the architectural stability, scalability, and security of the company's mission-critical cloud infrastructure. This role focuses on high-availability (HA) management for financial-grade systems, leveraging automation and cloud-native technologies to ensure seamless service delivery. You will bridge the gap between development and operations by building robust internal tools and monitoring ecosystems.

Job Responsibilities:

Multi-Cloud Management: Lead the maintenance and architectural optimization of the company's public cloud platforms (Primary: AWS, Secondary: GCP).
Financial-Grade Reliability: Ensure core business systems meet financial-level SLA requirements, including disaster recovery, high-availability design, and routine stress testing.
Infrastructure as Code (IaC): Manage and optimize internal business systems, including Jira, Confluence, Docker registries, and Kubernetes (K8s) clusters, using IaC practices.
Internal Tooling Development: Design, develop, and maintain internal automation systems using Golang, including unified monitoring/alerting platforms, centralized logging centers, and CI/CD pipelines.
Security & Compliance: Manage cloud resources, IAM accounts, and user permissions to meet strict financial security auditing and compliance standards.
Incident Management: Lead troubleshooting for complex network and system issues, and participate in on-call rotations to maintain 24/7 system availability.

Job Requirements:

Education: Bachelor's degree in Computer Science, Software Engineering, or a related technical field.
Experience: 4-5 years of solid experience in Cloud Operations, DevOps, or SRE roles.
Cloud Proficiency: Expert knowledge of Linux administration and the AWS ecosystem (EC2, VPC, EKS, RDS, IAM).
Programming: Proficient in Golang (essential for tool development) and scripting languages such as Python and Shell.
Containerization: Extensive hands-on experience managing and scaling production Kubernetes (K8s) environments.
Observability: Familiarity with open-source monitoring and alerting tools (Prometheus and Grafana) and logging systems (Loki, Graylog, or ELK).
Networking: Strong foundation in networking protocols (TCP/IP, BGP, VPN, Load Balancing) with the ability to diagnose complex connectivity issues.
Industry Context: Prior experience in Financial Services (FinTech, Banking, or Payments) is highly preferred.
Soft Skills: Proven ability to work under pressure and to tight timelines; a proactive team player with strong professional ethics.