Radley James

Site Reliability Engineer

Fresher
  • Posted 19 hours ago

Job Description

The company is a technology-driven trading firm that operates in global financial markets. It is built on robust, low-latency systems, disciplined engineering, and a culture that values ownership, precision, and continuous improvement, and technology is at the core of everything it does.

The Role

As a Site Reliability Engineer, you will be responsible for the reliability, performance, and scalability of mission-critical trading systems. You will work closely with software engineers, traders, and infrastructure teams to ensure the platforms operate with high availability and predictable performance in a fast-paced, real-time environment.

This role is hands-on and suited to engineers who enjoy deep technical challenges, production ownership, and building resilient systems at scale.

Responsibilities

  • Design, build, and operate highly reliable and scalable production systems
  • Monitor, troubleshoot, and resolve production issues across trading and market data platforms
  • Improve system observability through metrics, logging, and alerting
  • Automate operational workflows and reduce manual toil
  • Partner with engineering teams to influence system design for reliability and operability
  • Participate in on-call rotations and incident response, including post-incident reviews
  • Continuously improve deployment, capacity planning, and disaster recovery practices

Requirements

  • Strong experience in Site Reliability Engineering, Systems Engineering, or Production Engineering
  • Solid understanding of Linux systems, networking, and distributed systems
  • Proficiency in at least one programming or scripting language (e.g. Python, Go, Java, C++, Bash)
  • Experience with monitoring and observability tools (e.g. Prometheus, Grafana, ELK, Datadog)
  • Familiarity with containerisation and orchestration technologies (e.g. Docker, Kubernetes)
  • Experience running production systems in high-availability, low-latency, or high-throughput environments
  • Ability to debug complex issues under pressure and communicate clearly during incidents

Nice to Have

  • Experience in trading, financial services, or other latency-sensitive environments
  • Knowledge of cloud platforms (AWS, GCP, Azure) and hybrid infrastructure
  • Experience with infrastructure-as-code tools (e.g. Terraform, Ansible)
  • Understanding of TCP/IP performance tuning and kernel-level optimisations

Job ID: 137014699
