Developing techniques to minimize hallucinations and enhance truthfulness in language models.
Requirements
- Strong programming skills in Python
- Industry experience with language model finetuning and classifier training
- Proficiency in experimental design and statistical analysis for measuring improvements in calibration and accuracy
- Experience in data science or in creating and curating datasets for finetuning LLMs
- Understanding of metrics for uncertainty, calibration, and truthfulness in model outputs
- Published work on hallucination prevention, factual grounding, or knowledge integration in language models
- Experience with retrieval-augmented generation (RAG) or similar fact-grounding techniques
Responsibilities
- Design and implement novel data curation pipelines to identify, verify, and filter training data for accuracy relative to the model’s knowledge
- Develop specialized classifiers to detect potential hallucinations or miscalibrated claims made by the model
- Create and maintain comprehensive honesty benchmarks and evaluation frameworks
- Implement search and RAG systems to ground model outputs in verified information
- Design and deploy human feedback collection pipelines specifically for identifying and correcting miscalibrated responses
- Design and implement prompting pipelines to generate data that improves model accuracy and honesty
- Develop and test novel RL environments that reward truthful outputs and penalize fabricated claims
Other
- MS/PhD in Computer Science, ML, or a related field preferred; at minimum, a Bachelor's degree in a related field or equivalent experience
- Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time
- Visa sponsorship: We do sponsor visas
- Strong written and verbal communication skills