Draftwise is building AI that makes contract work faster and more precise for the world's leading law firms. It executes complex drafting, review, and search workflows directly within Microsoft Word, drawing on each firm's best guidance, precedent, and language.
Requirements
- 1+ years in applied NLP/ML or data science (or equivalent), including experience with LLM‑driven agents and traditional ML.
- Demonstrated ability to design experiments, define metrics, and make statistically sound decisions (A/B testing, power analysis, regression testing, error analysis).
- Hands‑on experience with evaluation dataset design and labeling workflows; strong instincts for dataset quality and drift.
- Fluency with Python, modern ML/LLM tooling, data pipelines, and production metrics analysis.
- Background in information retrieval or ranking for RAG.
- Experience training and shipping targeted fine‑tunes (SFT, RFT) and measuring their latency/cost/quality trade‑offs.
- Experience with function calling, structured tool‑use, routing, and guardrails.
Responsibilities
- Run rigorous, data‑driven experiments on complex agentic tasks (tool‑use, retrieval, drafting, review, redlining, summarization, citation).
- Work with legal subject-matter experts (SMEs) to turn success into measurable terms: task‑level objectives, guardrails, and failure taxonomies.
- Build and maintain evaluation suites (labeled datasets, prompts, harnesses, and regression tests) that demonstrate consistent, statistically significant improvement over time.
- Instrument and analyze production metrics to diagnose agent behavior (e.g., tool‑selection errors, hallucination modes, slow paths, non‑determinism) and triage issues by impact.
- Propose and test mitigations (workflow redesign, routing, guardrails, retrieval changes, prompt refactors, function‑calling updates).
- Identify high‑value candidate components for fine‑tuning (e.g., routing models, classification/refusal heads, RAG re‑rankers, drafting subtasks).
- Train and evaluate fine‑tuned models (e.g., SFT/RFT/DPO) to improve task quality and reduce latency/cost.
Other
- Availability to work in the US Eastern time zone.
- Clear written and verbal communication with both technical and non‑technical partners in an open environment.
- Ability to work independently and make informed decisions with minimal supervision.
- Interest in working in a fast-moving environment with evolving objectives.