


Tech Lead Site Reliability Engineer - Edge - USDS

TikTok

$187,040 - $359,720
Sep 20, 2025
San Jose, CA, USA

Ensure infrastructure services are reliable, fault-tolerant, efficiently scalable and cost-effective for TikTok's Edge SRE team.

Requirements

  • 3+ years of experience working with Unix/Linux systems from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.
  • 2+ years of experience in one or more programming languages such as Java, C++, or Go, or scripting experience in Shell and Python.
  • Experience in designing, analyzing and building automation and tools for large scale systems.
  • Experience with the Hadoop ecosystem - HDFS, Yarn, Spark, etc.
  • Experience in building solutions with AWS, Google, Azure and other cloud services.
  • Experience in networking technologies such as TCP/IP, BGP, DNS, etc. in a carrier-grade environment.
  • Experience in developing and operating one or more of the following systems: OpenStack, Kubernetes, Nginx, ipvs, ELK stack, Hadoop, etc.

Responsibilities

  • Build data pipelines, tools, automations, visualizations and monitors to facilitate the operation and optimization of edge services.
  • Own data monitoring and alerting, data quality assurance and anomaly detection.
  • Document team processes and policies, including methods of engagement and SLOs.
  • Analyze, design and implement solutions at the system level to remove bottlenecks and improve edge service performance.
  • Implement monitoring and alerting to improve issue detection and response.
  • Work in a fast-paced environment.
  • Participate in technical operations and rotations in response to performance and reliability issues.

Other

  • Master’s degree (or Bachelor's degree with 2+ years of experience) in Computer Engineering, Electrical Engineering, Computer Science or related major.
  • Strong analytical skills and the ability to solve real-world problems in a fast-moving environment.
  • Self-driven and capable of working with ambiguity and moving projects from concept to delivery.
  • Our organization follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department.
  • This role requires the ability to work with and support systems designed to protect sensitive data and information. As such, this role will be subject to strict national security-related screening.