Key Responsibilities:
- Lead the Day 2 infrastructure team to ensure continuous, stable, and secure operation of all servers, network devices, storage, and on-premises systems.
- Plan, implement, and monitor IT infrastructure changes, upgrades, and capacity expansions to meet project requirements.
- Develop and maintain infrastructure architecture documentation, diagrams, and SOPs for deployment, maintenance, and incident response.
- Collaborate closely with System Engineers and Software Engineers to support middleware, application deployment, and integration interfaces.
- Manage incident resolution for infrastructure-related issues, including escalation to vendors or higher-tier support when necessary.
- Ensure compliance with government IT security policies, InfoSec guidelines, and operational standards.
- Plan and coordinate infrastructure maintenance schedules, patching, and system backups with minimal disruption to services.
- Provide weekly reports on infrastructure status, incidents, resource usage, and recommendations for improvement.
- Mentor and guide infrastructure team members to enhance skills, productivity, and knowledge sharing.
- Participate in business continuity and disaster recovery planning and execution.
Qualifications and Requirements:
- 7-10 years of relevant IT infrastructure experience, preferably in government or regulated environments.
- Strong expertise in servers, networking, storage, virtualization, cloud (AWS/Azure), and containerization and orchestration (Docker/Kubernetes).
- Experience with monitoring tools (e.g., Grafana, Nagios), automation (Ansible, Terraform), and ITIL processes.
- Proven ability to lead a team, manage resources, and ensure 24/7 operational support.
- Excellent communication skills, capable of presenting technical updates to both technical and non-technical stakeholders.
- Strong analytical and problem-solving skills, with a proactive approach to risk mitigation.
- Familiarity with security operations, InfoSec standards, and government compliance requirements.
Preferred Skills:
- Experience with middleware platforms (WebSphere, Tomcat, or similar) and application deployment pipelines.
- Knowledge of containerized infrastructure and CI/CD integration.
- Experience coordinating with multiple vendors and service providers.
Experience installing and configuring the following COTS (commercial off-the-shelf) products:
- Confluent Kafka
- Solace MQTT
- Nginx
- Apache HTTPD
- MongoDB
- MSSQL
- IBM MQ