Site Reliability Engineer - AI Application

5-7 Years
  • Posted 2 months ago

Job Description

About the team

We are an AI-driven search and recommendation team focused on building innovative, scalable products for global users.

Responsibilities

1. Ensure the reliability and normal operation of multiple core systems behind the Viking team's big data and online services, with a focus on system capacity planning and stability assurance.
2. Enhance system visibility by monitoring the availability and performance metrics of system components, helping development teams locate faults quickly, and in particular keeping critical links such as AI search and vector databases running.
3. Improve the reliability, scalability, and performance of services so that the core system SLA is met.
4. Participate in the design and implementation of the automation platform, ensuring rapid iteration and efficient operation and maintenance of large-scale online Viking clusters and AI search-related clusters.
5. Working from the usage scenarios of the AI Search/Viking business, optimize service governance practices in depth, including but not limited to analyzing performance bottlenecks in key AI Search/Viking links, locating and troubleshooting business problems, and driving the upgrade of the system's high-availability architecture. Candidates familiar with Viking-related technologies are preferred for core optimization work.

Qualifications

Minimum Qualifications

1. Bachelor's degree or above in a computer-related field, with more than five years of relevant work experience.
2. Solid foundation in computer software, with an understanding of the principles of Linux operating systems, storage, network I/O, etc.
3. Familiar with at least one programming language (such as Python/Go/Java/Shell/Ansible), with moderate development capability; greater emphasis is placed on operations and maintenance practice and problem-solving ability.
4. Understanding of at least one cloud infrastructure platform, such as AWS/Volcano Engine/Aliyun/GCP. Candidates with experience in computing/distributed systems (e.g., Nginx/Kubernetes/Docker/OpenStack/Hadoop/Spark/Flink) are preferred.

Preferred Qualifications

1. Familiar with algorithmic thinking, with good data structure and system design skills.
2. Some understanding of AI Cloud, large model-related search suggestion, and recommender systems.

About Company

ByteDance is a technology company operating a range of content platforms that inform, educate, entertain and inspire people across languages, cultures, and geographies.
Dedicated to building global platforms of creation and interaction, ByteDance now has a portfolio of applications available in over 150 markets and 75 languages, including TikTok, Helo, Vigo Video, Douyin, and Huoshan.

Job ID: 138340365
