OpenAI's Intelligence & Investigations (I2) team needs to establish a user-risk measurement function to understand how users interact with frontier AI and the impact of safety mitigations. This involves creating policy-grounded baselines, confidence intervals, and attribution to inform executive decisions and product direction.
Requirements
- Have 7+ years in data science, measurement/causal inference, forecasting, modeling, or risk analytics in high-stakes domains
- Have deep strength in probability sampling and inference, time-series data, backtesting, parametric/non-parametric modeling, imputation, and uncertainty quantification for rare-event estimation; comfort with time-varying metrics and survival analysis
- Write strong Python and SQL; are fluent with modern data warehouses and notebook-to-production workflows; communicate crisply to executives and engineers
- Bring experience in integrity/fraud/safety or adjacent high-stakes analytics; have led multiple complex workstreams and mentored senior peers
- Have experience managing data scientists, quantitative analysts, or similar roles
- Balance near-term execution with long-term vision; raise analytical standards and create clarity across teams
- Bonus: Familiarity with Airflow
Responsibilities
- Define the measurement and forecasting strategy and operating model; align policy-grounded definitions, governance, and quality bars across partners
- Build user-level baselines and confidence intervals for rare-event harms using principled sampling and inference; institute stability and drift checks
- Ship executive-grade reporting: dashboard tiles, weekly 1-pagers, monthly deep dives, and launch/post-launch readouts that drive action
- Implement mitigation attribution and change-tracking; back-test launches and connect outcomes to specific interventions and external events
- Own data interfaces and SLOs across DS schemas; ensure privacy-by-design data paths and auditable method notes
- Build automated systems and pipelines to clean and organize unstructured data from disparate sources
- Act as the single analytics entry point with cross-functional partners across our Safety Systems, Data Science, Integrity, Product, and Policy teams; resolve definitions, standards, and timelines
Other
- Build and lead the team; mentor experienced ICs; foster an inclusive, principled, high-standards culture
- This role is based in San Francisco, CA (hybrid, 3 days/week).
- Relocation support is available.