Shopee

Senior Backend Engineer - Machine Learning Platform (R&D, CTR/VTR Predictor) - Ego team

2-4 Years
  • Posted 15 hours ago

Job Description

About The Team

The EGO Team is dedicated to building an industry-leading Machine Learning (ML) platform that powers the efficient deployment of algorithms across core business sectors, including Recommendation, Search, and Advertising. Our platform focuses on CTR/CVR prediction within large-scale sparse feature scenarios and explores Generative Recommendation (GR) integrated with Large Language Models (LLMs). By deeply optimizing the entire pipeline, from model training to online inference, we deliver low-latency, high-throughput, and high-precision inference services for e-commerce, general content, and social media scenarios, serving as a core algorithmic engine for business growth.

The platform covers the full lifecycle of Deep Learning, including sample generation, feature engineering, model training, deployment, online inference, and closed-loop monitoring. We have developed a robust training/inference acceleration framework, complemented by a Web UI and RESTful APIs, aiming to achieve a truly end-to-end, automated, and intelligent machine learning ecosystem.

Job Description

  • Responsible for the R&D and optimization of online inference services for deep learning models in large-scale sparse feature scenarios, supporting high-efficiency inference needs across Shopee's various business lines.
  • Conduct in-depth research into various inference acceleration algorithms to reduce the computational cost of model deployment.
  • Collaborate across the business pipeline to tune the end-to-end online service system, ensuring high availability and stability.
  • Research and implement efficient inference solutions for Generative Recommendation (GR), which combines Large Language Models (LLMs) with Search, Ads, and Recommendation.

Requirements

  • Bachelor's degree or above in Computer Science, Electronics, Automation, Software Engineering, or related fields, with at least 2 years of relevant work experience.
  • Expertise in C++ programming with a solid foundation in low-level systems; proficient in multi-threading, lock optimization, memory pools, thread pools, template programming, GDB debugging, performance profiling, and RPC frameworks.
  • Experience in online inference/serving; has developed proprietary inference engines or is highly familiar with engines such as TensorFlow + XLA, TensorRT, Triton, vLLM, or TensorRT-LLM.
  • Deep practical experience in GPU optimization, including operator fusion, graph optimization, CUDA programming, kernel scheduling, Warp execution models, memory access optimization, and VRAM scheduling.
  • Preferred: Candidates who have researched or implemented GR (Generative Recommendation) solutions such as HSTU, HLLM, or OneRec.
  • Passionate about computer technology, with a proactive learning mindset and a willingness to dig deep into technical problems; maintains high standards for code quality and a rigorous, detail-oriented work style.
  • Strong team player committed to continuous learning.

Job ID: 145261949