About the Role
We are looking for a hands-on Operations Support Engineer to support the daily operations, stability, and performance of cloud and virtualised platforms across development, staging, and production environments. This role focuses on system monitoring, incident response, automation support, and maintaining secure, reliable platform services.
Key Responsibilities:
- Support and maintain cloud and virtualised infrastructure including compute, storage, networking, container platforms, and monitoring systems.
- Administer and support virtualisation platforms such as VMware vSphere and Microsoft Hyper-V.
- Perform Linux and Windows server administration, including configuration, patching, performance tuning, and maintenance.
- Monitor system health and performance using observability tools such as CloudWatch, Prometheus, Grafana, and the ELK stack; investigate alerts and resolve incidents.
- Conduct troubleshooting, root cause analysis, and service recovery to maintain system availability and reliability.
- Implement and maintain monitoring dashboards, alerts, and operational checks for proactive issue detection.
- Support security and compliance requirements through access management, system hardening, vulnerability remediation, and governance tools (e.g., CyberArk).
- Execute and maintain automation and Infrastructure-as-Code deployments using Terraform, Ansible, and CloudFormation.
- Support change, release, and incident management processes, including post-incident reviews and follow-up actions.
- Work closely with application and DevOps teams to support deployments and environment readiness.
- Maintain operational documentation, SOPs, and runbooks.
Requirements:
- At least 4 years of relevant experience in an operations or infrastructure support environment.
- Hands-on experience with virtualisation technologies (VMware vSphere and/or Microsoft Hyper-V).
- System administration experience in both Linux and Windows environments.
- Exposure to cloud infrastructure environments (AWS preferred).
- Familiarity with monitoring and logging tools (e.g., CloudWatch, Prometheus, Grafana, ELK).
- Experience with scripting or automation tools (e.g., Bash, Python, Ansible, Terraform, or similar).
- Understanding of security practices such as access control, patching, and vulnerability remediation.
- Strong troubleshooting, incident handling, and problem-solving skills.
- Ability to support operational environments.
Interested candidates are encouraged to submit their resumes outlining their relevant experience and achievements to apply88(@)talentvis.com or click apply!
We regret to inform you that only shortlisted candidates will be notified.
EA License No: 04C3537
EA Personnel No: R22106683
EA Personnel Name: Yang Hui Shan, Sherri