Principal Engineer, AI

6-10 Years
SGD 8,500 - 11,000 per month
  • Posted 5 hours ago

Job Description

1. Role Purpose

The Automation Manager is responsible for designing, implementing, and managing automation capabilities across network and datacenter infrastructure domains, with a focus on execution, integration, monitoring, and operational efficiency.

The role will enable StarHub's transformation towards automation-first operations, AI-enabled infrastructure management, and optimized data center utilization, including energy and space optimization.

2. Key Responsibilities

a) IP & Broadband (Core + Access + Transmission)

  • Design and implement automation frameworks across:
    1. IP Core (BNG, routing)
    2. Broadband Access (OLT/ONT)
    3. Transmission and transport networks
  • Enable automated provisioning, configuration management, and lifecycle operations.
  • Develop standard APIs and integration interfaces for network actions.
  • Implement configuration compliance and drift management mechanisms.
  • Support high availability and resilience readiness (failover, rerouting support).
  • Integrate network telemetry into centralized platforms to support AI-driven diagnostics and closed-loop automation workflows.
  • Ensure all automation workflows are secure, auditable, and compliant.

b) Data Centre Operations (Infrastructure Automation + Energy & Space Optimization)

Infrastructure Automation

  • Automate compute, storage, and network provisioning across data center environments.
  • Develop runbook automation for operational tasks (restart, failover, scaling).
  • Automate patching, upgrades, and lifecycle management.
  • Enable infrastructure data pipelines and integrations to support AI-based anomaly detection and predictive maintenance systems.
  • Enable event-driven execution workflows from monitoring systems.
  • Maintain automation pipelines (CI/CD) for infrastructure operations.
  • Ensure robust execution frameworks with rollback and validation mechanisms.

Energy & Space Optimization

  • Enable centralized monitoring of:
    1. Power consumption (UPS, PDU)
    2. Cooling systems (HVAC, CRAC)
    3. Environmental metrics (temperature, airflow, humidity)
  • Support visibility into:
    1. Rack utilization, whitespace, and capacity headroom
  • Enable automation readiness for:
    1. Cooling optimization and airflow balancing
    2. Power utilization tracking (PUE and efficiency metrics)
  • Support capacity planning across space, power, cooling, and network layers.
  • Enable execution of optimization actions to improve energy efficiency and reduce costs.
  • Support data collection and execution readiness for AI-driven energy optimization (cooling efficiency, power balancing).
  • Provide data support for sustainability and energy reporting initiatives.

c) Business Innovation & Strategic Projects

  • Embed automation-first principles into transformation programs (e.g., iBNG, SiX AntiDDoS, IP-Optical SRv6 Network).
  • Enable zero-touch provisioning (ZTP) for new deployments.
  • Develop API-driven and programmable interfaces for new systems.
  • Standardize integration patterns across legacy and next-generation platforms.
  • Support end-to-end lifecycle automation (deploy, upgrade, decommission).
  • Ensure all new platforms are automation-ready from Day 1.
  • Provide execution-layer support to orchestration systems (without owning orchestration logic).

d) Monitoring, Visibility & Dashboard Enablement

  • Implement centralized monitoring frameworks across network and data center domains.
  • Enable real-time visibility of performance, utilization, and environmental metrics.
  • Integrate telemetry into DCIM and monitoring platforms.
  • Support development of:
    1. Operational dashboards (NOC / CXOps)
    2. Executive dashboards (capacity, utilization, risk)
  • Enable alarm ingestion and visualization (without owning RCA logic).
  • Ensure data consistency and accuracy for reporting and governance.
  • Support multi-site visibility across data center environments.
  • Enable dashboards that incorporate AI-driven insights (e.g., anomaly indicators, predictive alerts).
  • Ensure telemetry pipelines support AI/ML consumption (real-time, structured, high-quality data feeds).

e) Automation of Operations

  • Develop automation for:
    1. Provisioning and configuration management
    2. Infrastructure and network lifecycle operations
    3. Runbook automation for repetitive tasks
  • Enable execution of:
    1. Network actions (configuration updates, resets)
    2. Infrastructure actions (restart, scaling)
  • Provide secure, standardized APIs for execution.
  • Ensure workflows/MOPs include:
    1. Validation, rollback, retry mechanisms
    2. Comprehensive logging and audit trails
  • Maintain high reliability and performance of execution pipelines.
  • Ensure all actions are controlled, pre-defined, and compliant with governance policies.
  • Ensure all automation interfaces are API-driven and consumable by AI orchestration platforms.

f) Optimization Execution Support

  • Execute approved optimization actions across:
    1. Network infrastructure
    2. Data center systems
  • Support implementation of:
    1. Energy optimization initiatives (cooling, power efficiency)
    2. Space optimization (rack consolidation, capacity balancing)
  • Ensure execution is:
    1. Controlled, non-disruptive, and auditable
  • Support post-implementation validation and verification of outcomes
  • Provide execution readiness for closed-loop automation systems (without decision ownership)

g) Cross-Domain and Cross-Functional Responsibilities

  • Define and enforce automation standards, frameworks, and best practices
  • Standardize API governance, data models, and integration protocols
  • Ensure scalability across multi-site environments (including large-scale data center deployments)
  • Maintain reliability, performance, and resilience of automation platforms
  • Ensure security, auditability, and compliance of all automation workflows
  • Drive improvements in automation maturity and operational efficiency

3. Qualifications & Experience

  • Bachelor's Degree in Engineering, Computer Science, or related field
  • 6-10 years of experience in:
    1. Network or data center operations
    2. Automation / system integration roles

Technical Skills:

  • Strong understanding of:
    1. IP networking and broadband architecture
    2. Data center infrastructure (power, cooling, monitoring)
  • Experience with:
    1. Automation tools (Python, Ansible, scripting)
    2. API integrations and system orchestration
    3. Monitoring and observability platforms
    4. AI/ML-enabled operations (AIOps) concepts and data pipelines

Preferred Skills:

  • Exposure to:
    1. DCIM tools
    2. Cloud environments (AWS / GCP / Azure)
    3. TR-069 / TR-369 (device management)
  • Familiarity with data platforms (e.g., telemetry systems, data lakes)
  • Exposure to AI-driven operations platforms (AIOps / observability / closed-loop systems)

More Info

Job ID: 147051663
