Black Sesame Technologies (Singapore) Pte Ltd

Research Scientist (End-to-End & Multimodal Models)

Fresher
  • Posted a day ago

Job Description

Purpose:

In this role, you will be responsible for the end-to-end design and development of autonomous driving frameworks. You will integrate mainstream perception, prediction, and planning technologies into a unified modeling system, leveraging both vision-only and vision-language modeling paradigms, to support autonomous driving tasks across urban and highway scenarios.

You will play a key role in advancing end-to-end and hybrid architectures, including the exploration of Vision-Language Models (VLMs) to enhance scene understanding, reasoning, and decision-making robustness in complex driving environments.

Responsibilities:

  • Lead the design and implementation of end-to-end autonomous driving models, including one-stage (sensor-to-control) and two-stage (e.g., perception-planning decoupled) architectures. Define model structures, training pipelines, and optimization strategies for stable and explainable planning outputs.
  • Drive the development of pure vision-based end-to-end systems, integrating multi-task capabilities such as BEV perception, static and dynamic occupancy inference, trajectory prediction, and planning.
  • Explore and apply Vision-Language Models (VLMs) to improve high-level scene understanding, semantic reasoning, and cross-modal representation learning for autonomous driving tasks.
  • Optimize and deploy models on embedded platforms, including inference acceleration, post-processing, system-level integration, performance tuning, stability validation, and on-road testing.
  • Deliver production-ready solutions for elevated highways and urban driving scenarios, enabling scalable deployment and continuous progression toward higher levels of autonomy.

Qualifications/Requirements:

  • Ph.D. degree in Computer Science, Artificial Intelligence, Robotics, or a related field.
  • Strong foundation in autonomous driving systems, with hands-on experience in end-to-end deep learning-based modeling.
  • Practical experience in planning, control, or decision-making modules using deep learning approaches.
  • Experience or strong interest in Vision-Language Models (VLMs), multimodal learning, or cross-modal representation learning, particularly in applications involving visual scene understanding and reasoning.
  • Proficiency in C/C++ and Python, with experience in real-time inference deployment and performance optimization.
  • Familiarity with BEV-based representations, occupancy prediction, and multi-task learning frameworks.
  • Experience with system integration and real-vehicle testing is a strong plus.
  • Strong problem-solving skills, adaptability to complex real-world scenarios, and a results-driven mindset.
  • Strong mathematical foundation in optimization techniques relevant to computer vision and deep learning.

More Info

Job ID: 144533711