MagicSchool is looking to expand its Generative AI Evaluations capabilities to deliver the safest possible product for its users, with a focus on safety, quality, speed, and user impact.
Requirements
- Working knowledge of SQL (PostgreSQL) and database design / data modeling
- Expert knowledge of Python
- Modern Python features like type hints and asyncio
- Python multiprocessing to support evaluation jobs in the tens to hundreds of thousands per run
- Familiarity with Pydantic
- Experience working with and deploying Docker applications
- Experience evaluating generative AI systems
Responsibilities
- Ensure we're building the safest product MagicSchool can build.
- Build tools and processes for evaluating MagicSchool output at scale
- Design, architect, and write high quality code to expand our Evaluations framework's capabilities
- Debug complex code and applications in a cloud environment
- Model data across relational databases, data warehouses, and object stores
- Assess and help the team choose appropriate third-party solutions to problems when building doesn't make sense
- Work cross-functionally to integrate MagicSchool's evaluations platform with the larger application
Other
- 7+ years as an engineer, 2 of which were at a senior level or above
- Startup experience
- Experience with LLMs and Generative AI APIs and systems
- Strong communication skills: team-first mindset, highly collaborative, can articulate decisions within the team's context
- Builds relationships easily: emotionally intelligent, communicative, warm