Provide technical support and troubleshooting for cloud-based infrastructure and services, including compute, storage, networking, and security components
Collaborate with application, security, and other internal teams to resolve complex issues related to cloud-based services and infrastructure
Monitor and maintain the health, performance, and security of cloud-based services, identifying and addressing potential issues proactively
Drive continuous improvement and optimization of cloud infrastructure and services through automation where possible, ensuring high availability, performance, and cost efficiency
Implement and manage cloud security controls, compliance requirements, and best practices to protect cloud resources and data
Develop and maintain documentation, best practices, and standard operating procedures for cloud-based infrastructure and services
Participate in the planning, implementation, and optimization of cloud-based solutions, ensuring alignment with business requirements and industry best practices
Stay current with industry trends, emerging technologies, and best practices in cloud architecture, deployment, and operations
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience
Proven experience as a Cloud Engineer or in a similar role, with a deep understanding of cloud platforms and infrastructure as code (IaC) principles and best practices
Experience with cloud platforms (e.g., AWS, Azure, GCP). Familiarity with Government Commercial Cloud (GCC) is a strong advantage
Deep understanding of observability principles and best practices, with strong knowledge of AppDynamics, Dynatrace, other APM tools, and OpenTelemetry
Proficiency with observability tools such as Prometheus, Grafana, ELK Stack, Datadog, and New Relic
Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes)
Experience with scripting and automation languages (e.g., Python, Bash)
Knowledge of Site Reliability Engineering (SRE) principles and practices
Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, Ansible)
Understanding of cloud security, compliance, and best practices for protecting cloud resources and data
Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams and communicate technical concepts to non-technical stakeholders