Key Responsibilities:
- Design and develop compute cluster configurations optimized for performance, reliability, and scalability.
- Select and validate hardware components including CPUs, memory, storage, networking, and specialized accelerators.
- Collaborate with hardware, software, and systems engineering teams to ensure seamless integration of compute clusters into broader system architectures.
- Document hardware design decisions, integration procedures, and diagnostic workflows for internal and cross-team use.
- Participate in design reviews, integration planning, and collaborative problem-solving sessions with cross-functional teams.
Required Skills & Qualifications:
- Experience in computer hardware design, particularly in compute cluster or server environments.
- Experience in networking design, including InfiniBand and Ethernet switches, with expertise in port mapping and configuration.
- Familiarity with modern memory technologies (e.g., DDR4/DDR5, DIMM, LPDDR, HBM).
- Familiarity with Linux system administration and OS customization (preferably SUSE Linux).
- Understanding of system-level performance tuning and hardware-software interaction.
- Excellent documentation and communication skills.
Attributes:
- Experience with hardware validation and troubleshooting tools.
- Knowledge of high-performance computing (HPC) or distributed systems.
- Ability to work effectively in a collaborative, cross-functional engineering environment.
- Test-driven development mindset and attention to detail.
- Self-starter with a proactive approach to problem-solving and continuous improvement.
Minimum Qualifications:
- Doctorate (Academic) Degree and 0 years of related work experience, or
- Master's Level Degree and 3 years of related work experience, or
- Bachelor's Level Degree and 5 years of related work experience.