OpenAI is building a first-of-its-kind user-risk measurement function to detect and disrupt abuse and strategic risk, so that people can use its products safely
Requirements
- Strong background in sampling, inference, uncertainty quantification, and rare-event estimation; comfortable with time-varying metrics (see the prevalence sketch after this list)
- Write solid Python and SQL; fluent with data warehouses and with productionizing notebooks and pipelines
- Experience with Airflow DAGs or other ETL pipelines (see the DAG sketch after this list)
- Experience with Databricks
- Experience with survival analysis (see the Kaplan-Meier sketch after this list)
- Experience with streaming/online detection (see the CUSUM sketch after this list)
- Experience with classifier evaluation/QA (see the evaluation sketch after this list)
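To give a concrete flavor of the rare-event estimation and uncertainty quantification listed above, here is a minimal sketch (illustrative only, not OpenAI's actual stack) that computes a Wilson score interval for the prevalence of a rare flagged behavior; the counts are hypothetical:

```python
import math

def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Better behaved than the normal approximation when the event
    is rare (k << n), which is typical for abuse prevalence.
    """
    if n == 0:
        return (0.0, 1.0)
    p_hat = k / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return (max(0.0, center - half), min(1.0, center + half))

# Hypothetical example: 37 policy-violating users in a sample of 120,000.
lo, hi = wilson_interval(37, 120_000)
print(f"prevalence ~ {37 / 120_000:.2e}, 95% CI [{lo:.2e}, {hi:.2e}]")
```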
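For the Airflow requirement, a minimal DAG sketch follows, assuming Airflow 2.4+; the DAG id, task names, and step contents are hypothetical placeholders:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_risk_events(**context):
    # Hypothetical step: pull yesterday's flagged events from the warehouse.
    ...

def compute_prevalence(**context):
    # Hypothetical step: aggregate events into user-level prevalence metrics.
    ...

with DAG(
    dag_id="user_risk_daily_metrics",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    extract = PythonOperator(
        task_id="extract_risk_events", python_callable=extract_risk_events
    )
    aggregate = PythonOperator(
        task_id="compute_prevalence", python_callable=compute_prevalence
    )
    extract >> aggregate
```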
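For the survival-analysis requirement, this sketch uses the open-source lifelines package (an assumption; the posting names no library) to fit a Kaplan-Meier curve on synthetic time-to-recidivism data:

```python
import numpy as np
from lifelines import KaplanMeierFitter

rng = np.random.default_rng(0)

# Synthetic cohort: days until a user re-offends after a warning.
# `observed` is False where the user was censored (no re-offense seen).
durations = rng.exponential(scale=30.0, size=500)
observed = rng.random(500) < 0.6

kmf = KaplanMeierFitter()
kmf.fit(durations, event_observed=observed, label="post-warning recidivism")

# Survival probability at 30 days; full curve is in kmf.survival_function_.
print(kmf.predict(30.0))
```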
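For streaming/online detection, one standard technique is a one-sided CUSUM detector; this is a self-contained sketch with hypothetical tuning constants and data:

```python
def cusum(stream, target_mean, k=0.5, h=5.0):
    """One-sided CUSUM detector for an upward shift in a metric stream.

    k is the allowance (roughly half the shift worth detecting, in metric
    units) and h is the decision threshold; both are tuning assumptions.
    Yields the index at which an alarm fires, then resets.
    """
    s = 0.0
    for i, x in enumerate(stream):
        s = max(0.0, s + (x - target_mean - k))
        if s > h:
            yield i
            s = 0.0

# Hypothetical hourly violation counts, with a shift midway through.
stream = [2, 3, 2, 2, 3, 2, 6, 7, 6, 8, 7, 6]
print(list(cusum(stream, target_mean=2.5, k=0.5, h=4.0)))
```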
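For classifier evaluation/QA, a common workflow is precision-recall analysis against a target precision bar; this sketch uses scikit-learn (an assumption) on synthetic scores for a rare positive class:

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(1)

# Synthetic labels and classifier scores for a rare positive class (~2%).
y_true = (rng.random(10_000) < 0.02).astype(int)
y_score = np.clip(0.6 * y_true + rng.normal(0.2, 0.15, size=10_000), 0, 1)

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
print(f"average precision: {average_precision_score(y_true, y_score):.3f}")

# Lowest threshold that keeps precision >= 0.9 (a hypothetical QA bar).
ok = precision[:-1] >= 0.9
if ok.any():
    print("threshold for 90% precision:", thresholds[ok][0])
```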
Responsibilities
- Define the measurement framework for user-level risk across products and cohorts: scope the questions that matter and align on clear, policy-grounded definitions
- Establish baselines and statistical confidence for core metrics: prevalence, intensity, trends, and cohort dynamics
- Build decision-ready reporting surfaces: executive dashboards, weekly briefs, and launch readouts that translate insights into action
- Clean and organize ambiguous data from disparate sources, with an eye toward building automated pipelines and systems
- Create attribution and change-tracking: connect shifts in user behavior to mitigations, product changes, and external events (see the interrupted time-series sketch after this list)
- Partner across Safety Systems, Data Science, Integrity, Product, and Policy: ensure one coherent analytics entry point and consistent standards
- Uphold quality, privacy, and governance: document methods, ensure auditability, and maintain durable measurement hygiene
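One common way to connect a metric shift to a specific mitigation, in the spirit of the attribution responsibility above, is an interrupted time-series regression. The sketch below uses statsmodels (an assumption; the posting names no tooling) on synthetic daily data with a level drop at launch:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)

# Synthetic daily violation rate: 60 days before a mitigation, 60 after,
# with a level drop at launch plus a slow background trend.
t = np.arange(120)
post = (t >= 60).astype(float)
rate = 5.0 + 0.01 * t - 1.5 * post + rng.normal(0, 0.4, size=120)

# Interrupted time series: intercept + trend + post-launch level shift.
X = sm.add_constant(np.column_stack([t, post]))
fit = sm.OLS(rate, X).fit()

# The coefficient on `post` estimates the mitigation's level effect.
print(fit.params)        # [intercept, trend, post effect]
print(fit.conf_int()[2]) # 95% CI for the post-launch shift
```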
Other
- 3–6+ years in data science, measurement/causal inference, or risk analytics in high-stakes domains
- Communicate crisply, translating complex estimators into clear actions for executives and cross-functional partners
- Relocation support is available for this role, which is based in San Francisco, CA (hybrid, 3 days/week in office)
- Must be able to work in a hybrid environment
- Must be able to maintain the confidentiality of proprietary, confidential, and non-public information