
Nava

Lead Network Architect – AI Infrastructure (GPU Clusters)

Fresher
  • Posted 10 hours ago

Job Description

Nava is building the world's first Silicon-to-Intent autonomous hyperscaler. Following our $22M Series A, we are aggressively scaling our high-density GPU clusters across Mumbai, Hyderabad, and Singapore. We are looking for a world-class Network Architect to design and implement the end-to-end fabric that powers our large-scale NVIDIA H100 and B200 clusters. In this role, you won't just be managing a network—you will be re-engineering how compute and fabric intersect to eliminate bottlenecks in distributed training and inference.

THE ROLE

  • Architect & Design: Lead the architectural design of end-to-end, non-blocking networking for multi-thousand-GPU clusters.
  • Fabric Orchestration: Deploy and optimize NVIDIA Quantum-2 InfiniBand (NDR) and NVIDIA Spectrum-4 (Spectrum-X) Ethernet fabrics to support multi-rail, rail-optimized topologies.
  • DPU Integration: Architect offloading strategies using BlueField-3 DPUs to handle security, telemetry, and storage acceleration, ensuring zero-trust hardware-native isolation.
  • Performance Tuning: Fine-tune NCCL/UCX collectives, congestion control, and in-network mechanisms (Adaptive Routing, SHARP) to maximize Model FLOPs Utilization (MFU).
  • Infrastructure as Code: Automate the lifecycle of the network fabric in a software-defined, autonomous cloud environment.

Technical Requirements

  • NVIDIA Networking Stack: Expert-level experience with NVIDIA Quantum-2 InfiniBand (NDR) switches and NVIDIA Spectrum-4 (Spectrum-X) high-performance Ethernet.
  • Deep DPU Knowledge: Hands-on experience with NVIDIA BlueField (DOCA) for network and security offloading.
  • Protocol Mastery: Expertise in RDMA / RoCE v2, BGP, EVPN-VXLAN, and sophisticated congestion control algorithms.
  • Scale Experience: Proven track record of building and operating Clos/leaf-spine architectures at a scale of 512+ GPUs.
  • Security: Understanding of hardware-native security, including line-rate encryption and zero-trust micro-segmentation.

Preferred Qualifications

  • Experience with liquid-cooled high-density rack networking.
  • Contributions to open-source networking projects or OCP (Open Compute Project).
  • Familiarity with the financial modeling of TCO for large-scale hardware deployments.

WHY NAVA

We are a lean, elite engineering team moving at terminal velocity. You will have the autonomy to choose best-in-class gear and the runway ($22M Series A) to build a sovereign, autonomous AI cloud from the ground up.

Skills: design, spectrum, cloud, nvidia, networking, security, building, infiniband, infrastructure, ethernet, density


Job ID: 148334821