Senior Site Reliability Engineer

5-7 Years
  • Posted 23 hours ago

Job Description

Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make an impact globally while working with a global team located across 5 continents. Razer is also a great place to work, providing you the unique, gamer-centric #LifeAtRazer experience that will put you on a path of accelerated growth, both personally and professionally.

Job Responsibilities:

We are looking for Senior Site Reliability Engineers (SRE) to join our AI Software team. In this role, you will ensure the reliability, performance, scalability, and operational excellence of AI products, model-serving infrastructure, and backend API systems. You'll work closely with software engineers, AI teams, and release teams to automate operations, enhance observability, and streamline deployments in a cloud-scale environment. This role is ideal for someone who enjoys building resilient systems, solving complex infrastructure problems, and supporting AI workloads in production.

Essential Duties and Responsibilities

  • Administer, monitor, and manage cloud-scale production environments for AI model APIs, backend services, and high-traffic web systems serving global users.

  • Design and implement fault-tolerant, autoscaling cloud architectures tailored for AI inference workloads, including GPU-based environments and software products.

  • Build automated self-recovery systems to ensure high availability, rapid failover, and cost-efficient resource usage for all software products.

  • Manage and monitor AI model-serving platforms, inference engines, vector databases, data pipelines, and software applications.

  • Ensure reliability and uptime for experimental and production AI software environments.

  • Implement and maintain comprehensive monitoring, logging, and alerting for all AI and backend services.

  • Reduce MTTR through actionable alerts, runbooks, and automated diagnostics.

  • Automate infrastructure using IaC (Terraform/CloudFormation) and configuration management.

  • Improve release workflows and integrate with QA for smooth handoff to Release Candidate testing.

  • Work closely with software engineering, ML engineering, and release management to enhance operational procedures, deployment processes, and incident response workflows.

  • Participate in on-call rotations, incident reviews, and continuous improvement initiatives.

Pre-Requisites:

Qualifications

  • 5+ years of relevant experience in SRE, DevOps, infrastructure engineering, or cloud operations

  • Experience operating production services with significant availability or scaling demands.

  • Strong knowledge of web technologies such as HTTP, REST, SSL, load balancers, and web proxies (NGINX)

  • Comfortable with Linux and Docker administration

  • Basic knowledge of AWS, CI/CD (Jenkins), IaC (Terraform), container orchestration (AWS ECS or K8s), version control (Git), and databases (MySQL, NoSQL)

  • Strong ability to code and script (preferably Bash scripting and Python)

  • Ability to use or quickly pick up a wide variety of open-source technologies and automation tools

  • Understanding of GPU-based workloads and resource scheduling.

  • Familiarity with vector databases, embeddings, and inference pipelines

  • Comfort with frequent, incremental code testing and deployment

  • Strong analytical skills, with the ability to debug deployment problems without relying on developers

  • Deep hands-on technical expertise and problem-solving skills

  • Ability to work in a collaborative, technically challenging environment with rapidly changing requirements.

Education & Experience

  • Bachelor's or Master's degree in computer science, AI, or a similar discipline from an accredited institution

Travel Requirements

  • Role is based in the Singapore office and may require up to one trip per year.

Razer is proud to be an Equal Opportunity Employer. We believe that diverse teams drive better ideas, better products, and a stronger culture. We are committed to providing an inclusive, respectful, and fair workplace for every employee across all the countries we operate in. We do not discriminate on the basis of race, ethnicity, colour, nationality, ancestry, religion, age, sex, sexual orientation, gender identity or expression, disability, marital status, or any other characteristic protected under local laws. Where needed, we provide reasonable accommodations - including for disability or religious practices - to ensure every team member can perform and contribute at their best.

Are you game?

About Company

Razer Inc. is an American-Singaporean multinational technology company that designs, develops, and sells consumer electronics, financial services, and gaming hardware. Founded by Min-Liang Tan and Robert Krakoff, it is dual-headquartered in one-north, Singapore, and Irvine, California, US.

Job ID: 145228857
