Responsibilities
About the Team
The global data center operations team supports the company's fast growth by operating hyperscale data centers. The team manages the end-to-end lifecycle of the server fleet and provides cloud solutions and infrastructure services that are scalable and reliable.

Responsibilities:
1. Oversee the overall operation, maintenance, and emergency response for global data center infrastructure. Provide remote emergency support, coordinate with on-site facility managers and local contacts, and liaise with business teams when service disruptions occur.
2. Escalate and report major risks in a timely manner, follow up on rectification progress, and coordinate internal and external resources to ensure full closed-loop management of O&M risks.
3. Drive the implementation, rollout, and supervision of O&M management systems. Unify operational specifications across all data centers, and guide on-site teams in their daily routine work.
4. Regularly monitor and analyze regional O&M metrics, and conduct quality supervision and process audits. Identify potential operational risks, and continuously optimize workflows and service quality.
5. Support capability building for data center infrastructure platforms, including data ingestion, data governance, and alarm management.
Qualifications
Minimum Qualifications
1. More than 3 years of work experience in data center infrastructure O&M, with familiarity with the relevant emergency management procedures. Candidates with experience in large enterprises, global cross-team collaboration, and end-to-end fault coordination are preferred.
2. Strong risk identification skills. Able to accurately assess major risks and complete standardized escalation and reporting. A closed-loop management mindset, along with strong comprehensive analysis and problem-solving skills.
3. Familiar with management workflows and O&M systems for data center infrastructure. Experience in developing large-scale infrastructure O&M systems is a plus. Strong data awareness and O&M quality control skills, with proficiency in KPI tracking, data analysis, and fault troubleshooting.
4. Sound logical thinking and a strong sense of responsibility. Skilled in office and data tools, with standardized documentation habits. Able to compile and update workflows and technical documents in a timely manner as required.