Research Engineer Intern - Doubao - Seed - Machine Learning System - 2025 Summer - MS

ByteDance

Salary not specified
May 9, 2025
San Jose, CA, USA

This role focuses on developing and improving the company's machine learning systems and large-scale distributed training jobs.

Requirements

  • Familiar with machine learning algorithms, platforms, and frameworks such as PyTorch and JAX
  • Basic understanding of how GPUs and/or ASICs work
  • Expert in at least one or two programming languages in a Linux environment: C/C++, CUDA, Python
  • Preferred: GPU-based high-performance computing and RDMA high-performance networking (MPI, NCCL, ibverbs)
  • Preferred: Optimization of distributed training frameworks such as DeepSpeed, FSDP, Megatron, and GSPMD
  • Preferred: AI compiler stacks such as torch.fx, XLA and MLIR
  • Preferred: Large-scale data processing and parallel computing

Responsibilities

  • Research and develop our machine learning systems, including architecture, management, scheduling, and monitoring
  • Manage cross-layer optimization across systems, AI algorithms, and hardware for machine learning
  • Improve the efficiency and stability of extremely large-scale distributed training jobs

Other

  • Currently pursuing an MS in Software Development, Computer Science, Computer Engineering, or a related technical discipline
  • Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment