Responsibilities
About the Team
The team builds and operates large-scale, massively distributed infrastructure, applying Site Reliability Engineering (SRE) principles of software and systems engineering to ensure our traffic services are reliable, fault-tolerant, efficiently scalable, and cost-effective. You will have the opportunity to manage a variety of complex systems at scale, including traffic systems serving hyperscale data centers and public cloud environments, and a global load balancer that handles Tbps of traffic.

We build and operate multi-cloud, large-scale network services around the world to accelerate and optimize network traffic for TikTok and a variety of application services for ByteDance internal customers. These services include, but are not limited to, Layer 4 load balancing, Layer 4/7 acceleration, global ingress, CMAF, FaaS, and WAF. By joining us, you will work with a brilliant team and learn how to build a TikTok-scale network traffic platform serving billions of users globally.

Responsibilities
- Build, expand, and operate ByteDance's global traffic platform, including large-scale systems in public and private clouds and edge data centers.
- Build tools, automation, visualizations, and monitors to facilitate the operation and optimization of the global traffic platform.
- Work in a fast-paced environment and participate in technical operations and on-call rotations in response to performance and reliability issues.
- Help improve the whole lifecycle of infrastructure services, from inception and design through development, deployment, user support, and refinement.
Qualifications
Minimum Qualifications
- Bachelor's or Master's degree in Computer Engineering, Electrical Engineering, Computer Science, or a related major.
- Years of proven experience working with Linux systems, from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.
- At least 3 years of experience in one or more programming languages such as Go, Python, and shell scripting.
- Familiarity with cloud and CI/CD frameworks and tools such as Git, Docker, and Kubernetes.

Preferred Qualifications
- Experience in designing, analyzing, and building automation and tools for large-scale systems.
- Experience in building solutions with AWS, Google Cloud, Azure, and other cloud services.
- Experience in networking technologies such as TCP/IP, HTTP, and DNS in a carrier-grade environment.
- Experience in developing and operating one or more of the following systems: Kubernetes, Nginx, IPVS, ELK stack, etc.
- Self-driven and capable of coping with ambiguity and moving projects from concept to delivery.
- Strong analytical skills and the ability to solve real-world problems in a fast-moving environment.