ANTAS PTE. LTD.

Site Reliability Engineer (Linux Kernel, Kubernetes, Cloud, Automation, Networking)

10-12 Years
SGD 10,000 - 11,500 per month

Job Description

Responsibilities

  • Develop and oversee performance-critical infrastructure for financial markets, ensuring maximum throughput, high resiliency, and minimal operational risk.
  • Leverage deep Linux kernel expertise to fine-tune scheduling policies, interrupt routing, and NUMA resource allocation, ensuring predictable performance at scale.
  • Build and maintain high-availability containerized environments using Kubernetes, Docker, and advanced orchestration tools with a strong focus on scalability and security.
  • Lead automation initiatives with Ansible, Bash, and Python, eliminating manual intervention and improving system efficiency.
  • Manage hybrid cloud infrastructure (AWS, Azure, GCP) with strict performance SLAs, security compliance, and cost-optimized deployments.
  • Oversee infrastructure monitoring and observability using ELK Stack, Grafana, Site24x7, Splunk, and other enterprise-grade tools, ensuring proactive incident detection and resolution.
  • Administer and troubleshoot enterprise storage and networking stacks, including RAID, NFS, SAN/NAS, TCP/IP networking, VMware/vCenter, and BigIP load balancers.
  • Collaborate with development, DevOps, and security teams to design fault-tolerant systems and enforce infrastructure governance policies.
  • Perform predictive capacity modeling, OS hardening, and patch compliance, along with benchmark-driven performance optimization for trading and real-time compute platforms.
  • Provide expert-level outage resolution, coordinating cross-functional teams to deliver sustainable remediation and operational resilience.

Requirements

  • 10+ years of progressive experience in system administration, performance engineering, and reliability operations across enterprise and financial domains.
  • Advanced proficiency in Linux internals with specialization in kernel performance tuning, NUMA-aware optimizations, and real-time workload handling.
  • Proven hands-on experience with Kubernetes, Docker, and Ansible for large-scale automation and orchestration.
  • Strong scripting and programming skills in Bash and Python, plus experience with perf and eBPF for system analysis.
  • Demonstrated expertise in cloud operations across AWS, Azure, and GCP.
  • Strong background in networking protocols (TCP/IP, FIX) and high-performance trading environments.
  • Familiarity with storage systems (SAN, NAS, RAID) and database tuning (MySQL optimization).
  • Experience implementing observability and monitoring solutions such as ELK, Grafana, Splunk, and Corvil.

More Info

Industry: Other

Function: Financial Markets

Job Type: Permanent Job

Date Posted: 18/09/2025

Job ID: 126224243

Last Updated: 28-09-2025 07:55:17 PM