The Platform Operations Engineer will be responsible for managing and maintaining critical platforms using infrastructure as code (IaC), with a specific focus on SHIP-HATS environments and DevOps practices. This role requires expertise in modern cloud-native technologies, automation, and operational excellence to ensure high availability, scalability, and security of government digital services.
Key Responsibilities
Infrastructure Management
- Design, deploy, and maintain scalable platform infrastructure using cloud-native technologies and infrastructure as code principles
- Manage containerised environments using Docker and container orchestration platforms such as Kubernetes
- Implement and maintain monitoring, logging, and alerting systems to ensure optimal platform performance and proactive issue resolution
SHIP-HATS Operations
- Operate and maintain applications and services within the SHIP-HATS ecosystem, ensuring compliance with government security standards and operational requirements
- Configure and manage CI/CD pipelines using SHIP-HATS approved tools and methodologies
- Collaborate with development teams to optimise deployment processes and ensure seamless integration with SHIP-HATS services
DevOps Implementation
- Implement and maintain automated deployment pipelines, configuration management, and infrastructure provisioning processes
- Develop and maintain infrastructure as code using tools such as Terraform, Ansible, or similar technologies
- Establish automation and enforce best practices for version control, testing, and deployment
System Reliability and Performance
- Monitor system performance, plan capacity, and optimise resource usage to ensure efficient platform operations
- Implement disaster recovery procedures and business continuity plans for critical platform services
- Conduct regular system assessments and implement security hardening measures in accordance with government cybersecurity frameworks
Collaboration and Support
- Work closely with development teams to understand application requirements and provide platform support for deployment and scaling needs
- Participate in incident response and troubleshooting activities, providing technical expertise for complex platform issues
- Document operational procedures, system architectures, and troubleshooting guides for knowledge sharing and compliance purposes
Required Qualifications
Technical Skills
- Polytechnic diploma or Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field, or equivalent professional experience
- Minimum 3 years of experience in platform operations, DevOps, or similar technical roles
- Strong proficiency in Linux/Unix system administration and command-line operations
- Experience with containerisation technologies, including Docker and orchestration platforms such as Kubernetes
Cloud and Infrastructure
- Hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud Platform
- Proficiency in infrastructure as code tools such as Terraform, CloudFormation, or Ansible
- Experience with CI/CD tools and pipeline automation, preferably within government or regulated environments
Programming and Scripting
- Proficiency in scripting languages such as Python, Bash, or PowerShell for automation and operational tasks
- Solid understanding of the software development lifecycle and agile methodologies
- Experience with version control systems, particularly Git, and collaborative development workflows
Preferred Qualifications
- Previous experience working with the Singapore Government's SHIP-HATS platform or similar government DevSecOps environments
- Knowledge of cybersecurity frameworks and compliance requirements relevant to government operations
- Experience with service mesh technologies, API gateways, and microservices architectures
- Relevant industry certifications such as AWS Certified Solutions Architect, Certified Kubernetes Administrator, or similar credentials
Personal Attributes
- Strong analytical and problem-solving skills, with attention to detail and a systematic approach to troubleshooting
- Excellent communication skills and the ability to work collaboratively in cross-functional teams
- Adaptability to rapidly changing technology landscapes and willingness to learn new tools and methodologies
- Strong sense of ownership and accountability for platform reliability and performance