
ALTEN Group is a world leader in Engineering and Technology consulting services providing outsourced Engineering, R&D, and IT Services for different industries such as Transportation, Defence, Energy and Security with 55,000 engineers in nearly 35 countries. ANOTECH is the subsidiary of the Group delivering ALTEN's Engineering Services in Singapore.
As a Level 2 Engineer, you will take part in advanced troubleshooting, in-depth problem analysis, and the implementation of solutions that may require more specialized knowledge, in support of a mission-critical 24x7 system.
Responsibilities:
Operational Support
. Provide Level 2 support operations for mission-critical systems and infrastructure
. Provide troubleshooting and diagnostics for incidents escalated from Level 1
. Ensure adherence to SLA and system availability
. Perform OS patching for Windows Servers and Red Hat Enterprise Linux, manually or via RHEL Satellite or Microsoft WSUS
. Administer Microsoft Hyper-V and Azure Stack, including Virtual Machine provisioning, performance tuning and resource optimization
. Provide network configuration & support for Cisco switches and wireless LAN controllers
Application Support
. Investigate and resolve application incidents escalated from Level 1; perform root cause analysis and provide workarounds where possible
. Monitor application logs and integration points such as REST APIs, message queues and file-based transfers
. Perform application health checks to proactively identify issues and ensure uptime
. Liaise with Level 3 to resolve complex application issues and escalate bugs or enhancement requests
. Support and maintain job schedulers, interface configurations and integration points
. Maintain and support application configurations, environment-specific settings and integration parameters
. Document known issues, resolution procedures and rollback steps in the knowledge base
Incident & Problem Management
. Resolve P1/P2 issues within SLA
. Perform resolution and communications
. Perform root cause analysis and recommend permanent fixes
. Escalate unresolved issues that require software coding to Level 3 or engineering teams
. Ensure proper closure of incidents and problems
Change Management
. Perform operational impact assessment
. Present changes to the Change Advisory Board
. Pre-change preparation, such as reviewing the Change Request and Release Plan
. Documentation update in the knowledge base
. Post change review and feedback
Patch Management
. Perform patch management readiness
. Stakeholder coordination and team coordination
. System Readiness and Post-Patch Validation
. Documentation update and knowledge transfer
. Compliance and audit readiness
Documentation and Compliance
. Operational documentation: SOPs, Incident response checklist, RCA, PIR, monitoring and alert guidebook
. Configuration & Infrastructure Documentation: System configuration baseline, application dependency maps, environment inventories such as hosts, services, accounts
. Knowledge Base Articles for Level 2 enablement and faster resolution, e.g. Known Errors and Fixes, Frequent How-To Guides, Script Repositories, Lessons Learned
. Maintain application documentation
. Knowledge Management
Configuration Management
. Perform validation and accuracy of configurations
. Maintain readiness of operational documentation
. Perform audit to confirm compliance of configurations
. CMDB asset verification
. Change-linked configuration tracking
. Ensure environment consistency between DEV - IVVQ - ISO-PROD - UAT and PROD
Testing and Verification
. Ensure operational readiness testing before production deployment rollout
. Ensure post-change verification coordination
. Perform regression and sanity tests following patching or upgrades, in UAT and PROD
. Participation in user acceptance testing
Knowledge Management
. Documentation of resolution
. Knowledge Base Contribution
. Validation of knowledge
. Subject Matter Expertise Sharing
Root Cause Analysis
. Gather logs and system metrics at the time of failure
. Reproduction of issues in a controlled environment to understand the conditions under which they occur
. Determine the scope and severity in terms of the systems affected, downtime duration and business impact
. Narrow down the possible sources of the failure
. Use of diagnostic tools to analyse the application behaviour
. Correlation of events to sequence the chain of events leading up to the failure and identify the dependencies
Work Schedule
. Rotational on-call duty support is required
. Must be available for change request deployments during graveyard hours, as scheduled
Requirements:
. Bachelor's Degree in Information Technology, Computer Science, Engineering, or a closely related discipline
. At least 5 years in Level 2 support for mission-critical 24x7 production systems, preferably in the public sector
. At least 2 years in a team lead or supervisory role, coordinating tasks and mentoring junior engineers
. Proven experience in handling P1/P2 incidents, managing post-incident reviews (PIRs) and root cause analysis
. Certification in Red Hat Enterprise Linux or Kubernetes preferred
. Hands-on technical working experience in the following:
. Operating Systems: RHEL (90%) and Windows Server (10%)
. Networking Fundamentals
. Middleware & Infrastructure (Web Server - Nginx, App Servers - Kubernetes with containers (Docker + Spring Boot))
. Message Queues (IBM MQ, Kafka)
. Java, C#, MQTT, Golang
. Database (SQL Server, PostgreSQL)
. ITIL/ITSM Process Knowledge
. Security Awareness
. DR and HA concepts
. Strong Technical Skills
. Leadership & Coordination
. Communication & Collaboration
. Operational Governance
. Must be eligible to obtain G50 security clearance
Job ID: 146180343