Site Reliability Engineer

3-5 Years
SGD 7,000 - 10,000 per month
  • Posted 22 hours ago

Job Description

Company Overview

TrueWatch, headquartered in Singapore, delivers a unified observability platform that integrates data collection, visualization, and security insights to help teams monitor and understand complex systems with clarity and confidence.

Job Summary

You will maintain and optimize cloud infrastructure reliability and performance, automate deployments, support incident response, and collaborate with engineering teams to enhance system scalability and operational efficiency.

Responsibilities

  • Maintain system reliability, availability, and performance for cloud infrastructure and services to ensure continuous operations
  • Monitor production environments and manage observability tools to track metrics, logs, and alerts for proactive issue detection
  • Support incident response by troubleshooting issues, conducting root cause analysis, and leading post-incident reviews to prevent recurrence
  • Manage and optimize cloud infrastructure on AWS, Azure, or GCP to improve resource utilization and cost efficiency
  • Implement Infrastructure as Code (IaC) and automate deployments through CI/CD pipelines to accelerate delivery and reduce errors
  • Enhance system scalability, resilience, and operational efficiency by identifying and applying improvements
  • Support security best practices by ensuring compliance and coordinating vulnerability remediation efforts
  • Collaborate with engineering and platform teams to drive service reliability improvements and operational excellence
  • Perform system performance tuning, capacity planning, and infrastructure optimization to meet evolving business demands
  • Execute additional platform engineering and operational tasks as assigned to support overall system health

Required competencies and certifications

  • Bachelor's degree in Computer Science, IT, Engineering, or related field
  • 3-5 years of experience in SRE, DevOps, Cloud Infrastructure, or related roles
  • Hands-on experience with cloud platforms such as AWS, Azure, or GCP
  • Familiarity with monitoring and observability tools
  • Experience with CI/CD pipelines, Infrastructure as Code (IaC), and automation tools
  • Knowledge of Linux systems, networking, Docker, and Kubernetes
  • Basic scripting or programming skills (e.g., Python, Bash, or Go)
  • Strong troubleshooting, problem-solving, and incident management skills
  • Effective communication and teamwork skills in fast-paced environments

Preferred competencies and qualifications

  • Relevant cloud certifications (e.g., AWS Certified Solutions Architect, Azure Administrator, Google Cloud Professional) are an added advantage

More Info

Job ID: 147159755