ByteDance's Seed LLM Code Generation Team aims to comprehensively enhance the model's coding capabilities and build a bridge for AI to interact with the digital world. Specifically, the team is developing a Scientific Coding Agent that plans, writes, debugs, and executes scientific code to accelerate research in physics, chemistry, and biology.
Requirements
- Strong Python and software engineering: Git, testing (pytest), packaging, comfort with Linux/containers.
- Working knowledge of LLMs (prompting, fine‑tuning or adapters, evaluation) and at least one ML framework (e.g., PyTorch).
- Solid foundations in physics and/or chemistry (e.g., classical/quantum/thermo/stat mech; physical chemistry, molecular modeling) and numerical methods (ODE/PDE, optimization, linear algebra).
- Reinforcement learning (PPO/DPO, reward modeling, curriculum learning).
Responsibilities
- Benchmarking for scientific coding (core)
  - Design a principled benchmark suite spanning literature-grounded coding tasks and issue resolution in existing codebases.
  - Curate tasks and datasets, and write robust unit tests that are consistent with established scientific knowledge.
  - Implement evaluation metrics and build a clean, extensible evaluation pipeline with baselines.
- Language evolution for new scientific discovery (core)
  - Prototype loops in which agents propose and refine scientific language, and tie these to executable code and simulations.
  - Build workflows (multi-agent, planning, tool use, self‑critique) and integrate code execution sandboxes, retrieval, and experiment runners.
  - Compare LLMs and inference strategies; run ablations and produce clean research artifacts (plots, tables, write-ups).
- Explore agent evolution mechanisms that enable new scientific discoveries.
Other
- Currently pursuing a PhD in Computer Science, Machine Learning, Programming Systems, Physics, Chemistry, Biology, Applied Math, or a related field.
- State your availability clearly in your resume (start date and end date).
- Collaboration with industry experts
- Hands-on learning, enriching community-building and development events
- Apply early as applications will be reviewed on a rolling basis