ML Evals Engineer

OpenAI

Salary not specified
Oct 14, 2025
San Francisco, CA, US

OpenAI aims to design deeply personal, multimodal experiences that make advanced AI feel natural, useful, and human, and to build reliable, insightful metrics for measuring model and product quality across the full stack.

Requirements

  • Hands-on experience building tools or pipelines around LLMs or multimodal models
  • Proficient in Python for backend/data workflows
  • Familiar with TypeScript/React or similar frameworks for visualization
  • Experience with evaluation or visualization of multimodal models (speech, vision, or sensors)
  • Familiarity with hardware prototyping or embedded ML
  • Background in human-in-the-loop evaluation or UX research tooling

Responsibilities

  • Design and implement extensible evaluation harnesses for multimodal tasks spanning speech, vision, and text
  • Build interactive visualization and analysis tools that help engineers, designers, and researchers inspect model and UX performance
  • Empower product and design teams to define and extend evaluation suites aligned with real-world usage and product vision
  • Automate continuous evaluation and regression tracking to ensure each model and hardware iteration improves the experience
  • Collaborate with hardware, software, research, and design teams to turn qualitative goals into quantitative evaluation metrics

Other

  • 4 days per week onsite in San Francisco, CA
  • Relocation assistance for new employees
  • Equal opportunity employer, with no discrimination on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or any other applicable legally protected characteristic
  • Committed to providing reasonable accommodations to applicants with disabilities
  • Must be able to protect computer hardware entrusted to you from theft, loss, or damage and maintain the confidentiality of proprietary, confidential, and non-public information