Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make an impact globally while working with a team located across 5 continents. Razer is also a great place to work, providing you the unique, gamer-centric #LifeAtRazer experience that will put you on a path of accelerated growth, both personally and professionally.
Job Responsibilities:
We are hiring a Senior / Lead Software Engineer to design and build an AI/ML platform capable of high-throughput training and inference across local and cloud GPU environments. This role focuses on systems architecture, GPU acceleration, performance engineering, and reliable operation of AI workloads at scale. You will lead engineering initiatives, define platform architecture, and collaborate closely with ML and hardware teams.
Responsibilities
System Architecture & Core Engineering
- Design and implement the architecture for model training, fine-tuning, and serving.
- Build platform components that support heterogeneous compute environments (GPUs, NPUs, accelerators).
- Develop and optimize high-performance inference stacks using frameworks such as vLLM, SGLang, TensorRT-LLM, or Triton.
- Develop APIs, CLI tools, and backend services for model lifecycle management.
Local & Cloud GPU Acceleration
- Build and operate GPU clusters in on-prem systems and cloud environments (e.g., AWS/GCP/Azure).
- Optimize GPU memory, compute throughput, PCIe/NVLink utilization, and inter-node communication for distributed inference and training.
- Tune CUDA, cuDNN, NCCL, and related libraries for performance and reliability.
- Deploy scalable GPU-based inference systems with attention to latency, throughput, and cost.
Collaboration & Leadership
- Mentor engineers and drive design reviews, architecture discussions, and coding standards.
- Work with ML researchers and hardware teams to co-design efficient algorithms and deployment strategies.
- Coordinate engineering deliverables and provide clear technical direction for the team.
Prerequisites:
Required Skills & Experience
- 8+ years of experience in software engineering, ML infrastructure, or high-performance computing.
- Hands-on experience designing and operating both local GPU servers and cloud GPU environments.
- Deep knowledge of transformer-based model inference and training optimization.
- Experience building high-availability distributed systems, including failover, replication, and autoscaling patterns.
- Strong proficiency in Python and at least one systems language (C++ or Go).
- Experience with deep learning frameworks (PyTorch, TensorFlow) and inference engines (vLLM, SGLang, TensorRT, Triton).
- Solid understanding of distributed systems, parallel computing, and container orchestration (Docker, Kubernetes).
- Solid understanding of GPU programming (CUDA or ROCm), GPU memory hierarchy, and performance tuning.
- Strong collaboration and communication skills, and the ability to lead engineering efforts.
Preferred Qualifications
- Experience with on-device or edge-accelerated inference.
- Familiarity with cloud-native GPU scheduling and autoscaling systems.
- Experience with model compression, quantization, speculative decoding, or other inference-efficiency techniques.
- Contributions to open-source AI infrastructure projects.
- Master's degree or PhD in Computer Science, Electrical Engineering, or a related field.
Razer is proud to be an Equal Opportunity Employer. We believe that diverse teams drive better ideas, better products, and a stronger culture. We are committed to providing an inclusive, respectful, and fair workplace for every employee across all the countries we operate in. We do not discriminate on the basis of race, ethnicity, colour, nationality, ancestry, religion, age, sex, sexual orientation, gender identity or expression, disability, marital status, or any other characteristic protected under local laws. Where needed, we provide reasonable accommodations - including for disability or religious practices - to ensure every team member can perform and contribute at their best.
Are you game?