The Global E-commerce SRE team of US Tech and Product is looking to build and run large-scale, globally distributed, observable, fault-tolerant systems for TikTok's E-commerce platform.
Requirements
- 5-7 years of experience writing code in Java, Go, Python or a similar language.
- Understanding of Unix/Linux operating system internals and networking
- Experience with algorithms, data structures, complexity analysis and software design
- Experience developing tools and APIs to reduce manual interaction with systems and applications using a variety of coding and scripting standards
- Familiarity with running production-grade web services at scale and an understanding of cloud-native technologies and networking
Responsibilities
- Support the service level of a critical, revenue-generating E-commerce platform as well as related infrastructure and services.
- Implement SRE practices around incident management and post-mortems while participating in on-call rotations.
- Define service level indicators and data-driven objectives to uphold and improve uptime, latency, and system health of a core TikTok production platform.
- Collaborate across teams with engineering and product to ensure that key requirements (such as capacity planning and launch reviews) are met to enable transparent service delivery to customers.
- Build automation geared towards efficiency, scalability and service resiliency.
Other
- Bachelor's or master's degree in engineering, computer science, or similar
- Hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department
- Ability to work with and support systems designed to protect sensitive data and information
- Must be eligible for strict national security-related screening
- Must be willing to work in Los Angeles County (unincorporated) and comply with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act