Senior Operations Support Engineer

5-8 Years
SGD 8,040 - 15,270 per month
  • Posted 10 hours ago

Job Description

Leadership and Management

  • Lead infrastructure engineering teams to deliver comprehensive managed services for entire IT infrastructure environments.
  • Direct desktop engineering teams to provide first-level support and technical problem resolution for end-user communities.


Strategic Operations

  • Oversee and direct daily IT infrastructure operations, ensuring reliable and secure system, service, and application performance.
  • Monitor and manage incident response for business-critical systems with focus on timely resolution to prevent operational delays and service outages.


Organisational Engagement

  • Demonstrate capability to engage effectively with organisational management whilst establishing guidelines, policies, and procedures with strong execution oversight.
  • Manage multiple concurrent deadlines as a self-directed professional with appropriate prioritisation skills.


Operational Excellence

  • Monitor and respond to data centre issues and incidents whilst performing routine operational checks on servers, network devices, storage, and environmental systems.
  • Track IT asset inventory ensuring comprehensive equipment accountability and end-of-life management.


Incident and Change Management

  • Respond promptly to system alerts, alarms, and incidents with appropriate escalation to support teams following defined procedures.
  • Support incident troubleshooting and recovery activities whilst managing planned maintenance, change requests, and scheduled outages.
  • Coordinate hardware installation, replacement, and decommissioning activities alongside media handling and secure storage management.


Infrastructure Management

  • Design, build, and maintain critical cloud infrastructure platforms encompassing compute, storage, networking, containerisation, virtualisation, DNS, monitoring, and supporting systems across development, staging, and production environments.
  • Monitor and manage comprehensive cloud services including CloudWatch logs, alarms, synthetic monitoring, and integrated third-party solutions.


Monitoring and Observability

  • Implement and maintain robust monitoring and observability frameworks for all platform components utilising modern tooling including AWS CloudWatch Canaries, StackOps, Prometheus, Grafana, and ELK stack implementations.
  • Establish comprehensive observability practices to support proactive problem diagnosis and provide actionable insights into system health and performance metrics.


Compliance and Security

  • Maintain adherence to Whole-of-Government platform standards, compliance frameworks, and security requirements through continuous monitoring using government-approved security and monitoring solutions.
  • Implement security controls including access management, security hardening, and compliance monitoring with tools such as CyberArk.


Automation and Infrastructure as Code

  • Develop and maintain infrastructure using Infrastructure as Code (IaC) methodologies with tools including Terraform, Ansible, and AWS CloudFormation to ensure repeatable, automated, and version-controlled deployments.
  • Follow platform standards whilst executing infrastructure automation and modern operational practices to enhance efficiency and reliability.


Site Reliability Engineering

  • Identify and eliminate repetitive operational tasks (toil) to improve developer and infrastructure engineer efficiency, and enhance overall system reliability through error budget management.
  • Define, track, and report on SRE metrics including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.


Platform Operations

  • Manage virtualisation platforms including VMware vSphere and Hyper-V, encompassing capacity monitoring, performance optimisation, and lifecycle management.
  • Administer AWS Cloud services including EC2, ECS, S3, RDS (PostgreSQL and MS SQL), Lambda, CloudFormation, CloudWatch, IAM, and VPC configurations, together with Docker/Kubernetes workloads and physical server infrastructure.


Network and System Administration

  • Demonstrate proficiency with local networking technologies including TCP/IP, DNS, DHCP, VPN configurations, and routing protocols.
  • Execute comprehensive platform patching strategies leveraging automation to maintain security and stability whilst minimising service disruption.


Business Continuity

  • Maintain backup, disaster recovery, and high availability solutions for critical platform components including AWS Fault Injection Simulator (FIS) testing and multi-availability zone configurations.
  • Support containerisation initiatives and maintain container orchestration platforms for traditional workloads.


Collaboration and Documentation

  • Collaborate effectively with application teams to support platform stability, performance, and scalability requirements.
  • Create and maintain comprehensive platform documentation, operational runbooks, and standard operating procedures.
  • Support team development through knowledge sharing and mentoring on platform operations and modern infrastructure practices.


Technical Skills

  • Advanced experience with enterprise virtualisation platforms (VMware vSphere, Hyper-V)
  • Proficiency in Linux and Windows Server administration
  • Expertise in server monitoring tool installation and regular patching of virtual and physical servers
  • Comprehensive health check capabilities for servers, storage, and virtualisation platforms
  • Strong experience with infrastructure automation tools (Ansible, Puppet, Chef)
  • Proficiency with container technologies (ECS, Docker, Kubernetes)
  • Experience with monitoring and observability platforms
  • Infrastructure as Code expertise (Terraform, AWS CloudFormation, Ansible)
  • Solid understanding of networking concepts and technologies
  • Scripting capabilities in Python, PowerShell, Bash, and Node.js
  • Experience with high-availability and disaster recovery solutions including AWS FIS
  • Proficiency with GitHub tooling, including CI/CD pipeline setup and workflow management


Professional Qualifications

  • Bachelor's degree in Computer Science, Information Technology, or a related technical discipline with demonstrated experience in infrastructure operations and engineering.
  • Strong understanding of enterprise infrastructure components with proven experience supporting infrastructure modernisation initiatives.
  • Excellent analytical and problem-solving capabilities with strong documentation skills and effective communication abilities for both technical and non-technical stakeholders.


Desired Certifications

  • VMware Certified Professional (VCP) or an equivalent VMware vSphere certification
  • Microsoft Certified: Windows Server
  • Red Hat Certified Engineer (RHCE)
  • AWS Certified Solutions Architect or AWS Certified SysOps Administrator
  • Additional certifications in networking, security, or government IT standards.
  • Previous experience in government or highly regulated environments is strongly preferred.


Job ID: 143841685