Description and Requirements
Key Responsibilities:
Service Delivery & Operational Excellence
- Lead and oversee day-to-day operations across all managed services (Cloud, Network, Systems, Security, and Applications).
- Ensure strict adherence to defined SLAs, OLAs, and KPIs, driving accountability and data-driven performance reviews.
- Maintain service continuity and incident response readiness, including after-hours escalation management.
- Conduct post-incident reviews (PIRs), ensuring lessons learned and RCA findings translate into sustainable corrective actions.
- Champion service improvement initiatives (e.g., automation, observability dashboards, and event correlation tuning).
ITIL Governance, Risk & Compliance
- Implement and govern all ITIL process pillars: Incident, Request, Problem, Change, Knowledge, and Asset Management.
- Own the operational governance framework, including SOPs, escalation matrices, RACI charts, and service catalog alignment.
- Lead weekly/monthly governance cadences (SLA review, backlog health, change risk review, vulnerability remediation).
- Ensure compliance with GovTech OCP/ORA policies, ISO 20000/27001, and internal audit controls.
- Identify operational risks and compliance gaps, drive mitigation, and track closure with stakeholders.
- Partner with cybersecurity and compliance teams on access control, privileged account, and break-glass governance.
Stakeholder & Customer Management
- Act as the primary customer interface and trusted advisor, ensuring clear communication and stakeholder satisfaction.
- Conduct monthly and quarterly business reviews (MBR/QBR), presenting operational insights, trends, and risk posture.
- Manage multi-agency or multi-stakeholder relationships, balancing priorities and ensuring unified service outcomes.
- Coordinate with third-party vendors and internal towers (NOC, SOC, CloudOps, Network, DBA) for integrated service management.
- Drive proactive engagement and feedback loops, ensuring alignment with evolving business and compliance requirements.
Operational Support & Coordination
- Oversee Day-2 Operations, including monitoring, patching, backup, and performance optimization.
- Ensure incident prioritization and escalation discipline, enforcing communication templates and real-time updates.
- Manage major incidents and DR scenarios, ensuring structured response, timeline documentation, and customer assurance.
- Govern the change control lifecycle, ensuring risk assessment, CAB participation, and proper rollback validation.
- Ensure operational readiness for new deployments (Day-1 to Day-2 transition, hyper-care, acceptance testing).
Reporting, Metrics, and Compliance
- Produce monthly performance reports and dashboards detailing SLA attainment, MTTR, backlog aging, and RCA metrics.
- Track and analyze trend data to identify recurring issues and propose automation or process improvements.
- Present governance scorecards and risk registers to both internal leadership and customer stakeholders.
- Leverage data from ServiceNow or other ITSM platforms, CloudWatch, Grafana, and Zabbix for real-time health visibility.
People & Resource Management
- Manage and mentor shared pools of engineers (L1-L3) across multiple service domains.
- Oversee workload allocation, shift rosters, and on-call readiness, ensuring 24×7 coverage and staff well-being.
- Lead performance reviews, coaching plans, and skill development, aligning to service needs and ITIL maturity goals.
- Support hiring, onboarding, and succession planning to maintain operational stability.
Strategic Service Improvement & Transformation
- Lead service transformation initiatives, aligning operations to cloud-native and automation-first models.
- Collaborate with solution architects and PMO to ensure smooth service transition and continual improvement.
- Drive efficiency gains through SOP refinement, AI-Ops, automation, and proactive monitoring adoption.
- Support leadership in developing innovation roadmaps and service maturity frameworks.
Qualifications:
- Bachelor's degree in Information Technology, Computer Science, or a related discipline.
- 5-8 years of experience in service delivery, operations, or IT service management, with at least 3 years in a cloud or managed services environment.
- Proven experience managing multi-agency or multi-customer clusters with shared operational teams.
Technical Expertise:
- Strong understanding of AWS / Azure / GCP environments and their operational frameworks.
- Proficient in ITIL v4 processes and governance; ITIL certification preferred.
- Hands-on experience with ITSM tools (e.g., ServiceNow, JIRA, Remedy).
- Familiarity with cloud monitoring platforms (e.g., CloudWatch, Azure Monitor, Zabbix, Grafana).
- Experience in change control, incident RCA, and problem management.
- Working knowledge of compliance standards (ISO 27001, ITSM audit, or GovTech OCP/ORA).
Soft Skills:
- Strong leadership and communication skills with the ability to influence cross-functional teams.
- Excellent stakeholder engagement and customer relationship management.
- Analytical mindset with a continuous improvement and risk mitigation focus.
- Ability to manage multiple priorities in a fast-paced environment.