Get Jobs Tailored to Your Resume

Filtr uses AI to scan 1000+ jobs and find postings that match your resume

Senior Data Scientist

Microsoft

$119,800 - $234,700
Oct 30, 2025
Redmond, WA, United States of America

Microsoft 365 Copilot quality improvement through customer feedback and evaluation datasets

Requirements

  • Experience with building data pipelines, performing large-scale analysis, and implementing ML workflows using Python and SQL.
  • Experience in developing models or designing evaluation frameworks, including A/B testing or prompt-based assessments for LLMs.
  • LLM fundamentals: prompt engineering, few‑shot design, retrieval metrics, multi‑turn/agent trace evaluation.
  • Data quality mindset: trace hygiene, metadata design, policy/PII awareness, and principled guardrails.
  • Experience building graders that score persona/tone, contract/formatting (e.g., JSON validity, schema), and tool‑use correctness.
  • Background with structured synthetic data generation and vendor annotation programs; familiarity with judge mutation/optimization loops.
  • AI & Technical Fluency: You don't need to train models, but you know how they work, how to test them, and how to build great products on top of them.
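As an illustration of the contract/formatting graders mentioned in the requirements, here is a minimal sketch of a binary grader that checks JSON validity and a simple key/type schema. The function name `grade_json_contract`, the schema shape, and the sample outputs are hypothetical and chosen only for this example; a production grader would likely use a full schema validator.

```python
import json

def grade_json_contract(output, required):
    """Return (passed, reason) for a model output string.

    `required` maps each mandatory key to its expected Python type.
    This is an illustrative check, not a full JSON Schema validator.
    """
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False, "invalid JSON"
    if not isinstance(data, dict):
        return False, "top-level value is not an object"
    for key, expected in required.items():
        if key not in data:
            return False, f"missing key: {key}"
        if not isinstance(data[key], expected):
            return False, f"wrong type for key: {key}"
    return True, "ok"

# Hypothetical contract: an answer string plus a list of sources.
schema = {"answer": str, "sources": list}
good = grade_json_contract('{"answer": "42", "sources": []}', schema)
bad = grade_json_contract('{"answer": 42}', schema)
broken = grade_json_contract('{"answer": ', schema)
```

Returning a reason alongside the pass/fail verdict keeps the grader binary while still feeding a failure taxonomy.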

Responsibilities

  • Evaluation & Feedback Analysis
      • Convert multi‑source feedback (dogfood, VIP customers, production traces) into a prioritized dataset of 10–100 tasks per scenario, each with prompts and golden outputs; maintain a living failure taxonomy prioritized by volume × impact × fixability.
  • Rubrics & LLM‑as‑Judge
      • Author crisp, binary‑first rubrics across 7–30 dimensions (e.g., correctness/completeness, refusal calibration, tool‑use quality, formatting/contract, persona/tone, trace hygiene).
      • Build grader prompts (with few‑shots and counter‑examples) that achieve ≥80% human‑match rate, track TPR/TNR on held‑out sets, and prevent reward hacking.
  • Synthetic & Human‑Labeled Data
      • Design structured tuples to scale high‑signal synthetic data; orchestrate vendor/partner annotation sprints and live calibrations to align shared judgment.
      • Ensure datasets are reproducible with linked artifacts and robust metadata/trace hygiene.
  • Customer‑Grounded Scenarios
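
For illustration, the grader metrics named above (human‑match rate, TPR, TNR) can be computed from paired binary labels as in the sketch below. The function name `judge_agreement` and the sample labels are hypothetical, and the sketch assumes `True` means "pass" for both the human and the LLM judge.

```python
def judge_agreement(human, judge):
    """Compare binary judge verdicts against human labels (True = pass)."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    pairs = list(zip(human, judge))
    tp = sum(1 for h, j in pairs if h and j)          # both say pass
    tn = sum(1 for h, j in pairs if not h and not j)  # both say fail
    pos = sum(1 for h in human if h)                  # human-labeled passes
    neg = len(human) - pos                            # human-labeled fails
    return {
        "match_rate": (tp + tn) / len(human),
        "tpr": tp / pos if pos else None,  # undefined without positives
        "tnr": tn / neg if neg else None,  # undefined without negatives
    }

# Made-up held-out labels, for demonstration only.
human = [True, True, True, False, False]
judge = [True, True, False, False, True]
metrics = judge_agreement(human, judge)
meets_bar = metrics["match_rate"] >= 0.80  # the posting's 80% threshold
```

Tracking TPR and TNR separately matters because a judge that passes everything can still post a high match rate on a set dominated by passing examples.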

Other

  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 1+ year(s) of data-science experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.
  • Ability to work in a fast-paced, ambiguous environment and deliver results under tight deadlines.
  • 2+ years of customer-facing project-delivery, professional services, and/or consulting experience.
  • Strong communication and stakeholder management skills.