Cloud Technical Manager

5-8 Years
SGD 10,000 - 12,000 per month
  • Posted 22 days ago

Job Description

Key Responsibilities

Multi-Cloud Infrastructure Leadership & Architecture

. Lead the design, deployment, and management of cloud-native architectures across AWS, Microsoft Azure, and Google Cloud Platform in production environments

. Architect and implement scalable, highly available, and secure multi-cloud solutions aligned with business requirements and government compliance standards

. Provide technical leadership for cloud services including: EC2, S3, Lambda, ECS/EKS, RDS, CloudWatch, Systems Manager, Azure Virtual Machines, Azure Kubernetes Service (AKS), Azure Monitor, Compute Engine, Google Kubernetes Engine (GKE), Cloud Functions, Cloud Storage, and Cloud Monitoring

. Design and implement infrastructure architecture for new application deployments, ensuring best practices in scalability, performance, and cost optimization

. Evaluate and recommend cloud technologies, services, and architectural patterns to support business objectives and digital transformation initiatives

. Lead migration initiatives from on-premises to cloud and cloud-to-cloud migrations across AWS, Azure, and GCP

. Monitor and optimize cloud resource utilization, implementing cost management strategies and right-sizing recommendations

Technical Team Leadership & Mentorship

. Provide technical leadership, guidance, and mentorship to L2 Linux Engineers, L2 Wintel Engineers, and L3 Cloud Engineers

. Conduct technical design reviews, code reviews for Infrastructure as Code (IaC), and architectural assessments

. Act as the technical escalation point for complex infrastructure issues requiring advanced troubleshooting and resolution

. Drive knowledge transfer initiatives, facilitate technical training sessions, and develop engineering team capabilities

. Lead incident response for critical production issues, coordinating cross-functional teams and ensuring rapid resolution

. Foster a culture of operational excellence, automation, continuous improvement, and technical innovation

. Participate in 24/7 shift rotation and on-call escalation support to provide leadership during critical incidents

Operating System Lifecycle & Patch Management

. Oversee and coordinate enterprise-wide OS patching operations across RHEL (v7 to v10) and Windows Server (2016 to 2025) environments using tools such as AWS Systems Manager, Azure Update Management, WSUS, SCCM, and YUM/DNF

. Demonstrate advanced proficiency in both Linux and Windows system administration with the ability to troubleshoot complex issues across both platforms

. Develop and enforce patching strategies, policies, and schedules aligned with security compliance requirements and business continuity objectives

. Lead monthly and quarterly patch cycles, ensuring comprehensive testing, validation, and rollback procedures

. Coordinate patch approvals with Change Advisory Board (CAB) and manage stakeholder communications throughout patching activities

. Execute post-patch validation, remediation activities, and compliance reporting for audit requirements

. Identify and manage End-of-Life (EOL) operating systems and applications, planning upgrade and migration strategies

Security Hardening & Compliance Management

. Lead CIS (Center for Internet Security) security hardening initiatives and remediation activities across all cloud platforms and operating systems

. Implement and maintain security baselines based on CIS Benchmarks, government security standards (IM8 Policy), and industry best practices

. Oversee vulnerability management programs using tools such as Trend Micro, Qualys, Tenable, and AWS Config

. Prioritize, coordinate, and track security remediation efforts across infrastructure teams to ensure timely resolution of vulnerabilities

. Manage SSL/TLS certificate lifecycle, including renewals, implementation, and monitoring across multi-cloud environments

. Ensure compliance with government-level security, audit, and regulatory requirements including SOC 2, ISO 27001, and Singapore government frameworks

. Collaborate with InfoSec teams on security assessments, penetration testing, and audit preparations

. Implement and maintain security monitoring, logging, and alerting mechanisms using native cloud tools and third-party solutions

Infrastructure as Code (IaC) & Automation

. Lead Infrastructure as Code initiatives using Terraform, Ansible, AWS CloudFormation, and Azure Resource Manager (ARM) templates

. Design and implement automated infrastructure deployment pipelines with CI/CD integration

. Troubleshoot complex environment drift, pipeline failures, and infrastructure provisioning issues across multi-cloud environments

. Implement and maintain GitOps practices for infrastructure deployment and version control

. Drive automation initiatives to reduce manual operational overhead and improve infrastructure reliability

ITIL Process Management & Service Delivery

. Oversee ITIL processes including Incident Management, Problem Management, Change Management, and Request Management

. Manage and optimize ITSM workflows using ServiceNow, Jira, or similar enterprise ITSM platforms

. Lead Change Advisory Board (CAB) reviews for infrastructure changes, providing technical assessment and risk analysis

. Drive incident escalation processes, root cause analysis (RCA), and Post-Incident Review (PIR) activities

. Ensure compliance with Service Level Agreements (SLAs) and Operational Level Agreements (OLAs)

. Implement continuous service improvement initiatives based on operational metrics, KPIs, and stakeholder feedback

. Maintain comprehensive documentation including runbooks, standard operating procedures (SOPs), and architectural diagrams

Stakeholder Management & Communication

. Act as the primary technical liaison between infrastructure teams and business stakeholders, application owners, and senior management

. Manage expectations and communicate technical concepts effectively to both technical and non-technical audiences

. Coordinate with cross-functional teams including Development, Security, Networking, and Database teams on infrastructure initiatives

. Lead technical discussions, architecture reviews, and solution design sessions with stakeholders

. Provide regular status updates, operational reports, and capacity planning recommendations to management

. Manage vendor relationships for cloud services, security tools, and infrastructure platforms

. Facilitate communication during critical incidents, ensuring timely updates to all stakeholders and maintaining service transparency

Container Orchestration & DevSecOps

. Provide technical leadership for containerization initiatives using Docker, Kubernetes, Amazon ECS, Amazon EKS, Azure AKS, and Google GKE

. Implement and maintain DevSecOps practices with SHIP-HATS (Secure Hybrid Integration Pipeline - Hive Agile Testing Solutions) within the Singapore Government technology stack

. Oversee CI/CD pipeline operations, integrating security scanning tools including SAST, DAST, and container vulnerability scanning

. Drive containerization strategy and microservices architecture adoption across application portfolios

Monitoring, Observability & Performance Optimization

. Design and implement comprehensive monitoring, logging, and alerting strategies using CloudWatch, Azure Monitor, GCP Cloud Monitoring, and third-party observability platforms

. Configure and maintain observability stacks for metrics, logs, traces, and alerts across multi-cloud environments

. Implement log aggregation and analysis using centralized logging solutions

. Lead performance optimization initiatives, conducting capacity planning and resource right-sizing activities

. Establish operational dashboards, reporting mechanisms, and proactive alerting for infrastructure health and performance

Documentation & Knowledge Management

. Create and maintain comprehensive infrastructure documentation, including system architecture diagrams, network topology, and data flow diagrams

. Develop and maintain technical runbooks, troubleshooting guides, and disaster recovery procedures

. Ensure audit-readiness through meticulous documentation discipline and change tracking

. Maintain Configuration Management Database (CMDB) accuracy and asset inventories

. Build and maintain knowledge base articles, FAQs, and best practice documentation for team reference

Required Qualifications

Education & Experience

. Bachelor's degree in Computer Science, Information Systems, Information Technology, or related technical field

. Minimum 5 years of experience in infrastructure and cloud engineering roles with progressive leadership responsibilities

. At least 5 years of hands-on experience managing multi-cloud environments across AWS, Microsoft Azure, and Google Cloud Platform

. Minimum 1 year of experience in regulated environments such as public sector, government, financial services, or healthcare

. Proven experience in 24/7 operational support environments with incident management and on-call responsibilities

. Demonstrated experience leading technical teams, mentoring engineers, and driving operational excellence initiatives

Technical Skills & Expertise

. Multi-Cloud Platforms: Expert-level proficiency in Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) with hands-on experience across compute, storage, networking, security, and managed services

. Operating Systems: Advanced expertise in both Linux (RHEL, CentOS, Ubuntu, Amazon Linux) and Windows Server administration (2016, 2019, 2022, 2025) with deep troubleshooting capabilities

. Patch Management: Extensive experience with enterprise patch management using AWS Systems Manager, Azure Update Management, WSUS, SCCM, and YUM/DNF

. Security Hardening: Strong background in CIS Benchmark implementation, security remediation, and compliance frameworks (IM8, SOC 2, ISO 27001)

. Infrastructure as Code: Proficiency in Terraform, Ansible, AWS CloudFormation, ARM templates, and GitOps practices

. Container Technologies: Experience with Docker, Kubernetes, Amazon ECS/EKS, Azure AKS, Google GKE, and container orchestration

. ITIL & ITSM: Deep understanding of ITIL v3/v4 processes with hands-on experience using ServiceNow, Jira, or similar ITSM platforms

. DevSecOps: Experience with CI/CD pipelines, security scanning integration, and familiarity with the SHIP-HATS platform

. Scripting & Automation: Proficiency in PowerShell, Bash/Shell scripting, and Python for automation and infrastructure operations

. Monitoring & Observability: Experience with CloudWatch, Azure Monitor, GCP Cloud Monitoring, Prometheus, Grafana, ELK Stack, or similar platforms

Preferred Certifications

. AWS Certified Solutions Architect - Professional or AWS Certified DevOps Engineer - Professional

. Microsoft Certified: Azure Solutions Architect Expert

. Google Cloud Professional Cloud Architect

. Red Hat Certified Engineer (RHCE) or Red Hat Certified Architect (RHCA)

. Microsoft Certified: Windows Server Hybrid Administrator Associate

. ITIL v4 Foundation or ITIL Expert

. Certified Kubernetes Administrator (CKA) or Certified Kubernetes Security Specialist (CKS)

. HashiCorp Certified: Terraform Associate or Professional

Soft Skills & Competencies

. Technical Leadership: Demonstrated ability to lead technical initiatives, provide architectural guidance, and mentor engineering teams

. Stakeholder Management: Ability to manage relationships with diverse stakeholders, from technical teams to executive leadership

. Communication: Outstanding verbal and written communication skills with the ability to articulate complex technical concepts to non-technical audiences

. Problem Solving: Advanced analytical and troubleshooting capabilities with a systematic approach to complex multi-cloud infrastructure challenges

. Strategic Thinking: Ability to balance immediate operational needs with long-term infrastructure strategy and roadmap planning

. Collaboration: Strong teamwork and cross-functional collaboration skills with experience working across development, security, and operations teams

. Adaptability: Agile and responsive to rapidly changing technology landscapes, business requirements, and operational demands

. Accountability: Takes ownership of outcomes, demonstrates attention to detail, and ensures accurate and secure infrastructure implementations

. Customer Focus: Service-oriented mindset with commitment to delivering high-quality solutions that meet business and user needs

. Continuous Learning: Commitment to staying current with evolving cloud technologies, security practices, and industry best practices

. Mentorship: Proven ability to develop and support junior and mid-level engineers, fostering technical growth and career development

Technical Manager Role Expectations

The Technical Manager position requires:

. L3+ level technical proficiency with hands-on expertise across multi-cloud platforms, Linux, and Windows environments

. Proven experience architecting and deploying new infrastructure solutions in AWS, Azure, and GCP

. Strong technical leadership with the ability to lead L2 and L3 engineers through complex technical challenges

. Deep understanding of security hardening, CIS remediation, and compliance frameworks

. Exceptional stakeholder management capabilities with experience interfacing with senior leadership

. Proactive approach to incident prevention, operational excellence, and continuous improvement

. Calm, structured, and methodical incident handling with strict adherence to ITIL processes

. Audit-readiness mindset with comprehensive documentation practices

. Experience working within Singapore Government technology frameworks and compliance requirements

. Ability to drive escalations effectively and manage critical stakeholder communications during incidents

Work Arrangements

. This role requires participation in 24/7 shift rotation and on-call escalation support for critical infrastructure operations

. Extended work hours may be required during major incidents, maintenance windows, and change implementations

. On-call support responsibilities form part of the senior leadership rotation schedule

. Flexibility to work outside normal office hours for patching activities, architecture deployments, and emergency response

. May require occasional travel for stakeholder meetings, vendor engagements, or cross-site coordination

Job ID: 132820791