The ByteDance DPU team is building foundational computing infrastructure for ByteDance and Volcano Engine Public Cloud, aiming to advance the architecture, development, and research of next-generation software-hardware technologies across compute, networking, and storage for cloud and AI computing.
Requirements
- Proficiency in C/C++ development and debugging.
- Strong Linux systems development experience.
- Solid understanding of compute, network architecture, and operating systems.
- Background in at least one of: software-hardware co-design, distributed systems, high-performance networking, or AI/ML systems.
- Experience with software-hardware co-design (networking, storage, or distributed compute).
- Hands-on experience with network virtualization (OVS, SR-IOV, eBPF).
- Familiarity with DPDK and high-performance user-space networking.
Responsibilities
- Design and develop DPU network software with a focus on high performance, low latency, and reliability.
- Collaborate with hardware teams to build software-hardware co-design solutions for networking and storage acceleration.
- Explore AI/ML infrastructure acceleration, leveraging DPUs, GPUs, and custom hardware to optimize distributed training and inference.
- Drive end-to-end performance optimization, from OS kernels and drivers to user-space runtime systems.
- Contribute to architecture design, technical proposals, and long-term research directions.
Other
- B.S./M.S. in Computer Science, Computer Engineering, or related fields; or Ph.D. with strong research/publications.
- 2+ years of relevant industry experience (exceptions possible for Ph.D. holders with a strong background).
- Ph.D. in related fields with research training and publications.
- Bonus points for hardware acceleration experience (FPGA, ASIC, GPU, or CUDA).
- Bonus points for experience with NCCL collectives, AI communication patterns, and parallelization techniques.