Mercor is looking for contributors to a high-impact audio AI research project who will author prompt–golden answer pairs for training and evaluating language models.
Requirements
- Significant experience using ChatGPT or similar tools for personal decision-making, hobbies, or general interests
Responsibilities
- Design and Optimize Prompts: Create detailed audio prompts with multiple constraints and instructions.
- Define and Document Evaluation Standards: Establish high-level expectations for correct responses in general audio consumer contexts, and develop a comprehensive rubric.
- Conduct Model Testing and Grading: Run prompts through models and assess preliminary outputs against expectations.
- Support Benchmarking and Quality Assurance: Collaborate in QA review processes to ensure prompt tasks and rubrics meet standards of rigor, consistency, and reliability before integration into official benchmarks.
Other
- BS or BA from a reputable institution, completed or in progress
- Strong writing and critical thinking skills
- Ability to work independently and meet deadlines
- 2+ years of experience in teaching or research (Preferred Qualification)