Job Description
Job Overview
We are seeking a skilled Infrastructure Engineer with 4-5 years of hands-on experience to design, implement, and manage our cloud infrastructure and platforms. You will play a pivotal role in keeping our systems highly available, scalable, and secure. The ideal candidate is proficient in the Red Hat ecosystem, AWS, and Kubernetes, and has a strong background in CI/CD pipelines with a DevSecOps mindset.
Key Responsibilities
- Cloud Infrastructure Management:
  - Design, deploy, manage, and optimize infrastructure on AWS (EC2, S3, RDS, VPC, IAM, Lambda, etc.).
  - Implement infrastructure as code (IaC) using tools like Terraform or AWS CloudFormation.
  - Manage and configure Red Hat Enterprise Linux (RHEL) systems, ensuring security compliance and performance.
- Containerization & Orchestration:
  - Build, deploy, and manage containerized applications using Docker.
  - Administer and optimize Kubernetes clusters (EKS, self-managed, or other distributions) for production workloads.
  - Implement service meshes, ingress controllers, and cluster autoscaling.
- CI/CD & Automation:
  - Develop, maintain, and optimize CI/CD pipelines using tools like Jenkins, GitLab CI, GitHub Actions, or Argo CD.
  - Automate provisioning, configuration, and deployment processes to improve efficiency and reliability.
  - Integrate security scanning and compliance checks into the CI/CD pipeline (DevSecOps).
- Application Security & Compliance (DevSecOps):
  - Implement security best practices across the infrastructure stack (network, compute, identity).
  - Use secrets management tools (AWS Secrets Manager, HashiCorp Vault).
  - Collaborate with the security team to ensure infrastructure meets compliance standards (e.g., SOC 2, ISO 27001).
  - Perform vulnerability management and patch orchestration.
- Monitoring, Logging, & Reliability:
  - Implement and manage monitoring, alerting, and logging solutions (Prometheus, Grafana, ELK Stack, CloudWatch, Datadog).
  - Participate in on-call rotations and lead incident response, troubleshooting, and root cause analysis.
  - Drive initiatives to improve system reliability, performance, and cost efficiency.
- Collaboration & Mentorship:
  - Work closely with development teams to foster a DevOps culture.
  - Document architectures, processes, and runbooks.
  - Share knowledge and mentor junior team members.
Qualifications
- Bachelor's degree in IT/Telecom or Computer Science.
- 4-5 years of professional experience in infrastructure engineering, cloud operations, or site reliability engineering (SRE).
- Strong hands-on expertise with Amazon Web Services (AWS) core services and best practices.
- Proven experience with Red Hat Enterprise Linux (RHEL) administration, security hardening, and troubleshooting.
- Solid experience in building, deploying, and managing Kubernetes clusters in production.
- Deep understanding of CI/CD principles and extensive experience with pipeline tools (Jenkins, GitLab CI, etc.).
- Strong Infrastructure as Code (IaC) skills, preferably with Terraform.
- Experience integrating security tools (SAST, DAST, secret scanning) into CI/CD pipelines.
- Proficient in scripting languages (Bash, Python, or Go).
- Experience with configuration management tools (Ansible preferred).
- Excellent problem-solving skills and a systematic approach to incident management.
- Strong communication and collaboration skills.
Nice-to-Have Skills
- AWS Certification (Solutions Architect, DevOps Engineer, SysOps Administrator).
- Red Hat Certification (RHCE, RHCSA).
- Kubernetes Certification (CKA, CKAD).
- Experience with other cloud providers (Azure, GCP).
- Knowledge of service mesh technologies (Istio, Linkerd).
- Experience with GitOps methodologies and tools (Flux CD, Argo CD).
- Familiarity with Agile/Scrum methodologies.