The company is developing and improving its machine learning systems and large-scale distributed training jobs.
Requirements
- Familiar with machine learning algorithms, platforms, and frameworks such as PyTorch and JAX
- Basic understanding of how GPUs and/or ASICs work
- Expert in at least one programming language in a Linux environment: C/C++, CUDA, or Python
- Preferred: GPU-based high-performance computing and RDMA high-performance networking (MPI, NCCL, ibverbs)
- Preferred: Distributed training framework optimizations such as DeepSpeed, FSDP, Megatron, and GSPMD (see the illustrative sketch after this list)
- Preferred: AI compiler stacks such as torch.fx, XLA, and MLIR
- Preferred: Large-scale data processing and parallel computing
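For illustration only (not part of the formal requirements): a minimal sketch of the kind of distributed-training code this role works with, assuming PyTorch FSDP and a `torchrun` launch. The model, sizes, and hyperparameters are placeholders.

```python
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    # Assumes a launch via `torchrun`, which sets RANK, WORLD_SIZE, and LOCAL_RANK.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model; FSDP shards its parameters, gradients, and optimizer
    # state across all ranks in the process group.
    model = FSDP(torch.nn.Linear(1024, 1024).cuda())
    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # One dummy training step on a synthetic batch.
    x = torch.randn(8, 1024, device="cuda")
    loss = model(x).pow(2).mean()
    loss.backward()
    optim.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

This could be launched on a single multi-GPU node with, for example, `torchrun --nproc_per_node=8 train_sketch.py` (the script name is hypothetical).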
Responsibilities
- Research and develop our machine learning systems, including architecture, management, scheduling, and monitoring
- Manage cross-layer optimization across systems, AI algorithms, and hardware for machine learning
- Improve the efficiency and stability of extremely large-scale distributed training jobs
Other
- Currently pursuing an MS in Software Development, Computer Science, Computer Engineering, or a related technical discipline
- Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment