Mercor is looking for experienced Python engineers to improve AI systems by developing and validating coding benchmarks that mirror real-world development challenges across various languages and domains.
Requirements
- Strong proficiency in Python
- Experience with debugging, testing, and validating code
Responsibilities
- Develop and validate coding benchmarks in Python by curating issues, solutions, and test suites from real-world repositories
- Ensure benchmark tasks include comprehensive unit and integration tests for solution verification
- Keep the distribution of benchmark tasks consistent and scalable
- Provide structured feedback on solution quality and clarity
- Debug, optimize, and document benchmark code for reliability and reproducibility
Other
- 3–10 years of experience as a backend software engineer, ML engineer, or applied data scientist
- Degree in Software Engineering, Computer Science, or a related field
- Comfortable with technical writing, with strong attention to detail
- Independent contractor
- Part-time (15–20 hours/week)