Software Engineer Graduate - Cloud Native Infrastructure

ByteDance

Salary not specified
Aug 20, 2025
San Jose, CA, USA

ByteDance needs to improve resource cost efficiency at massive scale across its rapidly growing compute infrastructure, which powers global platforms such as TikTok and various AI/ML and LLM initiatives. The company seeks to optimize its infrastructure for AI and LLM models so that computing resources (CPU, GPU, power, etc.) are better utilized, which directly affects the performance of all AI services and shapes the future of computing infrastructure.

Requirements

  • Proficiency in at least one major programming language such as Python, Go, C++, Rust, or Java.
  • Solid understanding of at least one of the following fields: Unix/Linux environments, distributed and parallel systems, high-performance networking systems, or large-scale software system development.
  • Hands-on project experience with container and orchestration technologies such as Docker and Kubernetes through internships, coursework, or personal projects.
  • Experience in developing or contributing to cloud-native open-source projects.

Responsibilities

  • Assist in analyzing and supporting enhancements to Hyper-Scale AI Infrastructure platforms, focusing on improving performance, scalability, and resilience for both traditional workloads and large language model (LLM) applications.
  • Contribute to performance optimization efforts for Kubernetes-based infrastructure, including monitoring pod lifecycle, tracking resource utilization, and analyzing system behavior under varying load conditions, working closely with senior engineers to identify improvement opportunities.
  • Lead small-scale development tasks related to resource management and scheduling in Kubernetes clusters, such as testing configuration updates, automating routine resource allocation workflows, or contributing to tooling for efficiency tracking.
  • Engage actively in team discussions on AI infrastructure design and optimization strategies, leveraging academic knowledge and personal projects to propose fresh insights and potential solutions.
  • Develop and maintain clear technical documentation, including runbooks, architecture diagrams, and process guides, to strengthen knowledge sharing and operational efficiency across the team.

Other

  • Successful candidates must be able to commit to an onboarding date by the end of 2026.
  • Please state your availability and graduation date clearly in your resume.
  • Candidates can apply to a maximum of two positions and will be considered for them in the order in which they apply.
  • Applications will be reviewed on a rolling basis.
  • We encourage you to apply as early as possible.