The Vision Engineering Team at TikTok works on efficiently delivering GenAI technologies into TikTok products worldwide. The team streamlines the creation, integration, testing, and deployment of GenAI features, covering large-scale training stability, acceleration-oriented optimization, large-model inference, and multi-machine, multi-card (multi-GPU) deployment.
Requirements
- Proficient in C++/Python and high-performance coding.
- Expertise in diffusion models (Stable Diffusion/DiT) with deep understanding of computational bottlenecks and optimization methodologies.
- Proven experience in at least one of the following domains: model compression (quantization/knowledge distillation), efficient architectures (MoE/sparse attention), or generative alignment (RLHF/DPO).
- Kaggle competition achievements, publications at ICML/NeurIPS/CVPR, or open-source contributions (e.g., HuggingFace Diffusers optimization).
- Research experience in GenAI/MLSys areas.
- Familiarity with open-source deep learning frameworks such as PyTorch, DeepSpeed, and JAX.
Responsibilities
- Develop algorithm acceleration technologies for text-to-image/text-to-video models through knowledge distillation, model architecture redesign (dynamic MoE routing/sparse attention), and parameter-efficient design (low-bit quantization) to achieve order-of-magnitude efficiency gains.
- Lead generative model innovation with a focus on diffusion acceleration (sampling step reduction/latent optimization) and autoregressive model efficiency.
- Collaborate cross-functionally to identify performance bottlenecks, optimize vision models via algorithmic breakthroughs, and enhance ByteDance's product capabilities.
Other
- Final-year Ph.D. students or recent Ph.D. graduates in Computer Science, Engineering, or another quantitative field.
- Excellent communication and teamwork skills, capable of thriving in a fast-paced work environment.
- Successful candidates must be able to commit to an onboarding date by the end of 2026. Please state your availability and graduation date clearly in your resume.