OpenAI's Safety Systems team is responsible for ensuring its AI models can be safely deployed in the real world. The lead researcher for cybersecurity risks will design, implement, and oversee an end-to-end mitigation stack to prevent severe cyber misuse across OpenAI's products, ensuring safeguards are enforceable, scalable, and effective.
Requirements
- Bring demonstrated experience in deep learning and transformer models.
- Are proficient with frameworks such as PyTorch or TensorFlow.
- Possess a strong foundation in data structures, algorithms, and software engineering principles.
- Are familiar with methods for training and fine-tuning large language models, including distillation, supervised fine-tuning, and policy optimization.
- Have significant experience designing and deploying technical safeguards for abuse prevention, detection, and enforcement at scale.
- Have background knowledge in cybersecurity or adjacent fields.
Responsibilities
- Lead the full-stack mitigation strategy for model-enabled cybersecurity misuse and implement solutions spanning prevention, monitoring, detection, and enforcement.
- Integrate safeguards across products so that protections remain consistent and low-latency, and scale with usage and new model surfaces.
- Make decisive technical trade-offs within the cybersecurity risk domain, balancing coverage, latency, model utility, and user privacy.
- Partner with risk/threat modeling leadership to align mitigation design with anticipated attacker behaviors and high-impact scenarios.
- Drive rigorous testing and red-teaming, stress-testing the mitigation stack against evolving threats (e.g., novel exploits, tool-use chains, automated attack workflows) and across product surfaces.
Other
- Have a passion for AI safety and are motivated to make cutting-edge AI models safer for real-world use.
- Excel at working collaboratively with cross-functional teams across research, security, policy, product, and engineering.
- Show decisive leadership in high-stakes, ambiguous environments.