Job Responsibilities
- Responsible for the design and development of Lenovo LHCP system modules, including but not limited to: service health assessment, anomaly detection, capacity estimation, traffic forecasting, risk prediction, and root cause analysis of failures.
- Participate in / lead the design and development of algorithms for IT application operations, including: time series anomaly detection, log anomaly detection, multi-metric correlation analysis, time series forecasting, and service capacity estimation.
- Participate in / lead the construction of a general intelligent operations platform for application systems.
- Leverage the company's infrastructure to provide foundational datasets, labelled data, data cleaning processes, model training, and runtime frameworks for the above use cases.
- Collaborate with product managers and Lenovo's internal IT operations teams to build automated operations capabilities applicable to multiple business scenarios.
- Research and apply technologies such as LLMs, rule engines, graph databases, and anomaly detection models to intelligent operations (AIOps) scenarios.
Job Requirements
Experience: 5+ years
Education: Bachelor's degree or above
- Proficient in Python. Knowledge of Java or Go is a plus.
- Familiar with common machine learning algorithms, including but not limited to regression, classification, clustering, anomaly detection, time series forecasting, natural language processing, and association rules, and with deep learning frameworks such as TensorFlow / PyTorch.
- Familiar with generative AI technologies; candidates with hands-on experience in generative AI application development or large language model fine-tuning are preferred.
- Experience with anomaly detection, risk prediction, and capacity / traffic forecasting algorithms, or experience in IT operations development, is preferred.
- Strong communication skills, good English reading comprehension, and an exploratory mindset; able to identify cutting-edge technologies from industry and academia and adapt them effectively to real-world scenarios.