Senior Software Engineer, Site Reliability Engineering

0-5 Years
  • Posted 3 hours ago

Job Description

Our team designs, develops, maintains, and improves software for various ventures projects, i.e., projects that are adjacent to our core businesses and are bootstrapped quickly by a lean team. You will be actively involved in designing the components behind scalable applications, from frontend UI to backend infrastructure.

What you'll be doing

  • Ensure the entire stack is healthy, with hardware, software, applications, and network all operating at optimal performance
  • Perform deep dives into both systemic and latent reliability issues, partnering with other software and DevOps engineers across the organization to design, implement, and roll out fixes
  • Continuously improve availability, reliability, and observability, and reduce the burden of human toil with tooling and automation
  • Lead and drive SRE initiatives to improve operational efficiency
  • Represent the SRE team in system design reviews and operational readiness exercises for new and existing services

What you need

  • Experience coding in Ruby and/or Go
  • Familiarity with GitOps principles and tools (GitHub Actions, Docker, Kubernetes)
  • Experience in designing, analyzing, and troubleshooting large-scale distributed systems
  • Curiosity about finding root causes in incidents and outages
  • Ability to build alignment, cultivate relationships, and drive impact
  • A mindset for designing fault-tolerant system architecture
  • Comfort with being uncomfortable in ambiguous situations
  • Involvement with incident management and response
  • Desire to grow expertise, inform, and educate others
  • Ability to pick up various technologies quickly, with a get-things-done mentality
  • Humility to embrace better ideas from others, eagerness to make things better, and openness to challenges and possibilities

Desirable

  • Familiar with cloud platforms and microservice-based architecture (AWS is a big plus)
  • Familiar with monitoring tools (e.g. Datadog, OpenTelemetry)
  • Familiar with CI/CD tools (e.g. GitHub Actions)
  • Familiar with IaC tools (e.g. Terraform, Spacelift)
  • Experience in designing resilient system architecture
  • Experience in optimizing the performance of large-scale production systems

Job ID: 117339055