Job Summary
We are seeking a hands-on Cloud & Infrastructure Lead to manage and support enterprise infrastructure environments, with a strong focus on Linux systems, container platforms, and virtualization. This role requires both technical depth and team leadership experience in L2/L3 operations.
Responsibilities
- Lead and coordinate day-to-day L2/L3 infrastructure operations to ensure high availability and performance of enterprise systems
- Provide hands-on administration and troubleshooting for Linux-based systems (RHEL/CentOS) to maintain system stability and security
- Administer and support OpenShift and Kubernetes container platforms to enable scalable application deployment and management
- Manage Nutanix and virtualization infrastructure to optimize resource utilization and support the infrastructure lifecycle
- Implement and maintain automation workflows using Ansible to improve operational efficiency and reduce manual errors
- Monitor system health and performance using the ELK stack (Elasticsearch, Logstash, Kibana) to proactively identify and resolve issues
- Troubleshoot incidents promptly and lead root cause analysis to drive continuous service improvements
- Oversee patching, upgrades, and infrastructure lifecycle activities to maintain compliance and system reliability
- Collaborate with cross-functional internal teams to ensure seamless service delivery and infrastructure alignment with business needs
- Lead and mentor small technical teams, or act as a technical lead, to foster knowledge sharing and team effectiveness
Required competencies and certifications
- Proven experience in Infrastructure Operations at L2/L3 support levels
- Hands-on expertise in Linux system administration (RHEL/CentOS)
- Experience administering OpenShift or Kubernetes-based container platforms
- Experience managing Nutanix or similar virtualization technologies
- Hands-on experience implementing automation using Ansible
- Experience using the ELK stack for system monitoring and observability
- Demonstrated leadership in managing or technically leading small teams
- Strong troubleshooting and incident management skills to maintain system uptime
Preferred competencies and qualifications
- Exposure to cloud platforms such as AWS or Azure
- Experience scripting with Bash or Python to automate tasks and enhance system management
- Familiarity with enterprise monitoring and logging frameworks beyond the ELK stack