We are seeking an experienced Cloud Operations Engineer to join our team and provide on-site services for a client in Singapore. In this role, you will be responsible for the day-to-day operations, reliability, and performance of our private cloud environments and hybrid cloud systems. You will work hands-on with physical and virtualized infrastructure, ensuring high availability, security, and scalability across all platforms.
Key Responsibilities
- Design, deploy, and manage private cloud infrastructure using platforms such as OpenStack, VMware vSphere/vCloud, or Nutanix.
- Monitor cloud environments using tools such as Prometheus, Grafana, or Zabbix, ensuring uptime and performance SLAs are met.
- Perform capacity planning, resource provisioning, and lifecycle management of private cloud environments.
- Automate operational tasks using scripting (Python, Bash, PowerShell) and infrastructure-as-code tools (Terraform, Ansible).
- Manage virtualization platforms, storage systems (SAN/NAS), and networking components (VLANs, SDN, load balancers).
- Implement and maintain security controls including identity management, network segmentation, patch management, and compliance audits.
- Collaborate with DevOps and application teams to define and enforce cloud governance standards and best practices.
- Respond to and resolve incidents affecting cloud infrastructure, conduct root cause analysis, and implement preventive measures.
- Maintain comprehensive documentation of architecture, configurations, runbooks, and change management records.
- Evaluate and integrate new cloud technologies aligned with business and operational requirements.
Required Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
- 3+ years of experience in cloud operations, systems engineering, or infrastructure administration.
- Hands-on experience managing private cloud platforms (OpenStack, VMware vSphere, Nutanix, or similar).
- Solid understanding of virtualization technologies (KVM, VMware ESXi, Hyper-V).
- Experience with networking fundamentals: TCP/IP, DNS, DHCP, VLANs, firewalls, VPNs, and load balancing.
- Proficiency in at least one scripting language or automation tool (Python, Bash, PowerShell, Ansible, Terraform).
- Strong troubleshooting and diagnostic skills across OS, network, and application layers.
- Familiarity with ITIL principles, change management, and incident management processes.