About the Role
We are seeking Principal / Senior Engineers to drive innovation in high-efficiency AI computing. In this role, you will bridge algorithms and hardware design, advancing low-precision and sparse computing to improve the performance and energy efficiency of next-generation AI accelerators.
What You'll Do
- Research and develop advanced quantization techniques for low-precision computing
- Design and optimize high-performance kernels (e.g., GEMM, FlashAttention)
- Collaborate on hardware-software co-design to improve system performance and efficiency
- Partner with IC design teams on specifications, benchmarking, and patent development
Minimum Qualifications
- Ph.D. in Computer Science, Electronic Engineering, or a related field
- Strong knowledge of GPU/NPU architecture
- Expertise in low-precision quantization and sparsity
Preferred Qualifications
- Experience with LLM or multimodal model inference optimization
- Hands-on kernel development experience
- Background in AI accelerator system design
- Publications at top conferences (e.g., ISCA, MICRO, HPCA, ASPLOS, NeurIPS, CVPR)
- Strong cross-team collaboration skills