Lead applied research and strategic definition of machine learning algorithms, quantization methodologies, and toolchain capabilities for the Neural Network Development Kit (NDK) roadmap targeting next-generation Edge AI compute solutions
Drive innovation at the intersection of ML algorithms and constrained hardware environments, identifying and validating the latest Edge AI technologies applicable to product requirements
Serve as the primary technical interface between the AI Architecture team and the Software (Edge AI) team, delivering well-researched toolchain feature proposals and algorithmic specifications for implementation
Collaborate with Software (Edge AI) and AI Architecture teams to identify and pursue targeted improvements in ML software methodology, and support Software-initiated improvement efforts with algorithmic insight and implementation guidance
Maintain deep engagement with the global Edge AI research community to ensure the NDK roadmap reflects the state of the art in model efficiency, compression, and on-device learning
Key Responsibilities:
Collaborate with the Sr. Design Manager (AI Architect) to define and maintain the NDK toolchain feature roadmap, ensuring alignment with the NPU hardware roadmap and overall AI product strategy
Research, evaluate, and recommend quantization algorithms, pruning strategies, knowledge distillation techniques, and other model compression methodologies suited to constrained hardware targets
Assemble and lead focused task forces drawing on partial bandwidth from the Software (Edge AI) and IC Design teams to prototype, benchmark, and validate proposed toolchain concepts before broader commitment
Prototype and benchmark candidate ML algorithms and toolchain features to quantitatively demonstrate accuracy-performance trade-offs and justify roadmap prioritization
Translate hardware architectural capabilities and constraints (as defined by the NPU Architect) into concrete toolchain feature requirements and algorithmic optimization opportunities
Deliver comprehensive technical specifications and algorithmic documentation to the Software (Edge AI) team to enable confident and accurate implementation of NDK features
Collaborate closely with the Software (Edge AI) team throughout the implementation phase to resolve algorithmic questions, validate correctness of implementations, and ensure performance targets are met
Actively monitor and synthesize developments from the Edge AI research community - including publications, open-source frameworks, and industry benchmarks - to continuously inform and refresh the NDK roadmap
Partner with the Software (Edge AI) team to jointly identify ML toolchain methodology improvement opportunities; drive those that originate from the AI Architecture team, and provide expert advisory support for those initiated by the Software team
Evaluate and apply a range of productivity tools and techniques - including but not limited to AI-assisted methods - to accelerate algorithmic prototyping, benchmarking, and specification writing
Evaluate and integrate relevant open-source ML frameworks, runtimes, and toolchain components (e.g., MLIR, TVM, ONNX Runtime) as acceleration vectors for NDK development
Requirements:
Bachelor's or Master's degree in Computer Science, Electrical Engineering, Computer Engineering, or a related technical field; PhD preferred, particularly in machine learning, optimization, or computer architecture
8+ years of experience in applied machine learning engineering or Edge AI software, with at least 5 years focused on model optimization, ML compilers, or on-device inference toolchain development
Proven expertise in quantization (PTQ, QAT), pruning, knowledge distillation, and other model compression techniques with demonstrated results on resource-constrained hardware
Strong knowledge of AI/ML algorithms, neural network architectures (CNNs, RNNs, Transformers, etc.), and the trade-offs between model accuracy, computational complexity, and memory footprint
Demonstrated ability to stay at the forefront of the Edge AI research community, with a track record of translating academic and industry advances into practical product roadmap contributions
Hands-on experience with mainstream ML frameworks (PyTorch, TensorFlow, TensorFlow Lite) and familiarity with ML compiler stacks such as MLIR, TVM, or ONNX Runtime
Experience consuming hardware architectural specifications and translating them into software toolchain requirements and algorithmic optimizations
Excellent communication skills, with the ability to present complex research findings and toolchain proposals clearly to architecture, software, and executive audiences
Strong analytical and problem-solving abilities with emphasis on quantitative benchmarking, accuracy-efficiency trade-off analysis, and performance profiling on target hardware
Demonstrated ability to work collaboratively across team boundaries, including assembling and coordinating cross-functional task forces without direct authority
Familiarity with RISC-V ISA and its software ecosystem, particularly in the context of AI inference deployment
Experience with FPGA-based or simulator-based prototyping to validate algorithmic concepts against pre-silicon hardware models (preferred but not required)
Self-motivated, with the ability to work independently, lead applied research initiatives, and drive toolchain innovation from algorithmic exploration through specification and successful team handoff