The mission of our AML team is to advance the next-generation AI infrastructure and recommendation platform for ads ranking, search ranking, and live and e-commerce ranking at ByteDance.
Requirements
- Experience with distributed systems or other infrastructure systems
- Experience with CUDA, compilers, and/or C++
- Experience with computer architecture (CPUs, memory, storage, microarchitecture, etc.) or hardware infrastructure
- Experience with ML/deep learning frameworks (TensorFlow/PyTorch) on GPUs/TPUs
- Experience contributing to an open-source machine learning framework (TensorFlow/PyTorch); experience improving core machine learning infrastructure
- Strong background in one of the following fields: Hardware-Software Co-Design, High Performance Computing, ML Hardware Acceleration (e.g., GPU/TPU/RDMA) or ML for Systems.
- Experience developing and deploying large-scale systems (e.g., monitoring, analysis, troubleshooting, and notification systems)
Responsibilities
- Responsible for the design and implementation of a global-scale machine learning system for feeds, ads and search ranking models.
- Responsible for improving the usability and flexibility of the machine learning infrastructure.
- Responsible for improving the efficiency of model training and serving.
Other
- Currently pursuing an MS in Software Development, Computer Science, Computer Engineering, or a related technical discipline
- Commit to an onboarding date by the end of 2026
- State your availability and graduation date clearly in your resume
- Bachelor's/Master's graduates: 2026 start
- 10 paid holidays per year, 10 paid sick days per year and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure)