HPC High Performance Compute System Administrator (Linux / Cluster)

1-3 Years
SGD 4,000 - 6,000 per month
  • Posted 3 hours ago

Job Description

HPC Administrator (Linux / Cluster / System Administrator) - 1-3 Years Experience

Role Overview

We are seeking a High Performance Computing (HPC) Administrator to support the deployment, monitoring, and maintenance of HPC clusters, Linux systems, and IT infrastructure. This role is ideal for candidates with experience in Linux system administration, cluster computing, and data center environments.

Key Responsibilities

  • Support daily operations of HPC clusters (compute, storage, networking)
  • Monitor system performance, job scheduling (Slurm, PBS, LSF), and resource utilization
  • Install, configure, and patch Linux/Unix systems (RHEL, CentOS, Ubuntu)
  • Manage CPU-based servers (bare metal and virtualized environments - VMware/KVM)
  • Perform system monitoring, health checks, troubleshooting, and incident management
  • Assist with user onboarding, access control (LDAP/AD), and environment configuration
  • Support cluster scaling, performance tuning, and HPC optimization
  • Work with networking (TCP/IP, DNS) and storage systems (NAS, SAN, parallel file systems)
  • Maintain technical documentation, SOPs, and runbooks

Required Skills & Experience

  • 1-3 years of experience in Linux System Administration / Infrastructure Support / HPC Operations
  • Exposure to HPC environments, cluster computing, or high-performance systems
  • Strong hands-on experience with:
      • Linux OS (RHEL, CentOS, Ubuntu)
      • Server management (physical servers, virtualization)
  • Understanding of:
      • Networking fundamentals (TCP/IP, DNS, SSH)
      • Storage technologies (NAS, SAN, distributed or parallel file systems)
  • Experience with scripting (Bash or other shell scripting, Python)
  • Strong troubleshooting, problem-solving, and analytical skills

Preferred / Nice-to-Have Skills

  • Experience with GPU computing / GPU clusters (NVIDIA, CUDA)
  • Exposure to cloud platforms (AWS, Azure, GCP - HPC workloads)
  • Familiarity with monitoring tools (Prometheus, Grafana, Nagios, Zabbix)
  • Knowledge of DevOps tools (Ansible, Terraform - basic exposure)

More Info

Job ID: 146053797
