System Engineer (L2/L3)

7-10 Years
SGD 10,000 - 13,000 per month
  • Posted 5 hours ago

Job Description

The Managed Services Cross Technology Engineer (L3/4) - System is an advanced engineering role at architect level, responsible for ensuring that a managed service is provided to all clients so that their IT infrastructure and systems remain operational. This is done by proactively managing, overseeing, monitoring, investigating, and resolving escalated technical incidents and problems to restore these services to the clients.

The primary objective of this role is to understand, define and implement the technical architecture and strategy for the organization's core platforms and/or cloud infrastructure. The engineer acts as the deepest technical expert, building complex, highly resilient systems while leading the resolution of systemic failures and setting the standard for Infrastructure as Code (IaC) and automation best practices. The role serves as the highest escalation point for all complex incidents or situations. The engineer also proactively reviews all client requests or tickets and applies technical and process knowledge to provide clients with almost immediate resolution without breaching the service level agreement (SLA).

The Managed Services Cross Technology Engineer (L3/4) - System focuses on fourth-line support for escalated incidents and requests with a high level of complexity, and ensures that contracted Managed Services outcomes are delivered to the client.

This is a strategic role spanning various technology domains such as (but not limited to) Cloud, Security, Networking, Applications and/or Collaboration.

This role may also contribute to or support project work as and when required.

Key Responsibilities:

  • System Strategy and Architecture Ownership: Understand and define the technical vision, standards, and multi-year roadmap for the organization's cloud and platform infrastructure, ensuring highly scalable and cost-efficient designs.
  • L3/4 Technical Authority: Serve as the ultimate systems expert and final escalation point for highly complex, cross-platform stability issues, failures, and performance bottlenecks, leading resolution and comprehensive Root Cause Analysis (RCA).
  • Advanced IaC Development: Write, audit, and maintain mission-critical Infrastructure as Code (e.g., advanced Terraform modules) and custom platform tools (in Python/Go), driving the standardization of GitOps principles.
  • Resilience & Reliability Engineering (SRE): Architect and implement patterns for fault tolerance, automated failover, and Disaster Recovery (DR). Define and enforce high standards for SLOs/SLIs across all platform services.
  • Large-Scale Orchestration: Design, optimize, and manage complex, multi-tenant Kubernetes (K8s) clusters (including networking, security, and upgrades), treating the cluster itself as the critical product.
  • Governance and Mentorship: Dictate and enforce system engineering and security standards across all CI/CD pipelines and infrastructure codebases, actively mentoring other senior engineers in platform design and engineering excellence.

  • Works independently, with general direction from the client, stakeholders, team lead, or senior manager, to perform operational tasks to resolve all escalated incidents/requests in a timely manner within the agreed SLA.
  • Timely and consistent updates of tickets with resolution tasks performed.
  • Proactively identifies, investigates, and analyses issues and errors before or when they occur, and logs all such incidents in a timely manner.
  • Captures all required and relevant information for immediate resolution.
  • Provides fourth-level support for all escalated incidents and requests, identifies the root cause of incidents and problems, and responds to tickets that third-line engineering teams were unable to resolve.
  • Shares knowledge gained in resolving issues, documents it, and passes it down to other engineers.
  • Communicates with other teams and clients to extend support. Acts as emergency support contact as needed for critical client and business-impacting issues.
  • Supports, tracks, and documents change implementation.
  • Provides timely escalation of all tickets to management with ensuing updates, where applicable.
  • Proactively identifies and implements opportunities to optimize effort and automate routine tasks, working with automation teams.
  • Systematically gathers relevant information and applies technical knowledge, using highly technical troubleshooting tools, content, and analytical practices.
  • Uses operational and diagnostic procedures to resolve escalated tickets in unique and complex client environments.
  • Coaches L1, L2, and L3 teams, offering technical expertise and pushing work down to other engineering teams.
  • Performs quality audits, covering process, service experience, ticket updates, etc. as required.
  • May manage and implement projects within the technology domain, delivering effectively and promptly per client-agreed requirements and timelines.
  • May work on implementing and delivering disaster recovery functions and tests.
  • Performs any other related task as required.

Knowledge and Attributes:

  • Ability to communicate and work across different cultures and social groups.
  • Ability to plan activities and projects well in advance and to take into account possible changing circumstances.
  • Ability to maintain a positive outlook at work.
  • Ability to work well in a pressurized environment.
  • Ability to work hard and put in longer hours when it is necessary.
  • Ability to apply active listening techniques such as paraphrasing the message to confirm understanding, probing for further relevant information, and refraining from interrupting.
  • Ability to place clients at the forefront of all interactions, understanding their requirements, and creating a positive client experience throughout the total client journey.
  • Excellent proficiency in the change management process, with the ability to plan, monitor and execute changes, and with risks and mitigation plans clearly identified and captured in the change record.
  • Deep technical skills in relevant functions.
  • Excellent client service orientation and passion for achieving or exceeding expectations.

Certifications and tools:

  • Certifications relevant to the services provided (certifications carry additional weightage on a candidate's qualification for the role).
  • Relevant certifications and tools include (but are not limited to):
  • VMware Certified Professional - Data Center Virtualization (VCP-DCV).
  • VMware Certified Specialist - Cloud Provider.
  • VMware Site Recovery Manager: Install, Configure, Manage.
  • Microsoft Certified: Azure Solutions Architect Expert.
  • AWS Certified Solutions Architect - Associate.
  • Veeam Certified Engineer (VMCE).
  • Rubrik Certified Systems Administrator.
  • Storage HDS
  • VMware VRA
  • Ansible
  • OPSWAT, Splunk, CyberArk
  • NVIDIA AI
  • SRE, DevOps, GitLab, Automation
  • Containers, etc.

Certification / Knowledge Item - Core Technology Focus / Relevance to Automation

Ansible - Primary Configuration Management tool for automating OS/application configuration and deployment.

SRE, DevOps Principles - Methodology for improving reliability, reducing toil, and driving the automation strategy.

GitLab / Git - Source Code Management (SCM) and CI/CD pipeline integration for all automation code.

Containers (Docker/K8s) - Expertise in containerization and orchestration for modern, portable application deployment.

Microsoft Certified: Azure Solutions Architect Expert - Designing and automating highly available, scalable infrastructure on the Azure public cloud.

AWS Certified Solutions Architect - Associate - Designing and automating robust, secure solutions on the AWS public cloud.

VMware Certified Professional - Data Center Virtualization (VCP-DCV) - Foundational knowledge of vSphere for automating virtual machine and resource management.

VMware Certified Specialist - Cloud Provider - Focus on automating IaaS offerings and provider-level VMware solutions.

VMware Site Recovery Manager (SRM) - Automating disaster recovery and business continuity processes.

VMware vRA (vRealize Automation) - Building self-service catalogs and provisioning workflows (cloud/on-prem orchestration).

Veeam Certified Engineer (VMCE) - Automating backup, replication, and recovery workflows, especially for virtual environments.

Rubrik Certified Systems Administrator - Administering and automating data resilience and security platform operations.

Storage HDS - Automating provisioning and management tasks on Hitachi enterprise storage systems.

Splunk - Automating actions based on operational intelligence and security monitoring data.

CyberArk - Securing automation pipelines by managing and integrating privileged access credentials.

OPSWAT - Integrating security and compliance checks into automated workflows.

NVIDIA AI - Skills related to automating the deployment and management of AI/ML infrastructure.

Infra VMware - vRealize Automation (vRA)

Infra NetApp - NetApp Storage

Infra Veeam
  • Veeam Backup
  • Veeam Backup & Replication
  • Veeam Backup Enterprise Manager

Infra VMware
  • VMware Horizon (VDI): server routine operation, virtual desktop pools, virtual application maintenance, DEM configuration
  • Server management: vCenter, ESXi, vSphere

Infra Microsoft
  • Windows Server Update Services (WSUS)
  • Active Directory (AD)

Infra OpenGear - Console

Infra Linux
  • Operating systems: Ubuntu, RHEL


Job ID: 145565409
