The company is looking to build and evolve internal evaluation frameworks for Generative AI systems, particularly Large Language Models, to help users make sense of complex observability data through AI-driven features.
Requirements
- Experience designing and implementing evaluation frameworks for AI/ML systems
- Familiarity with prompt engineering, structured output evaluation, and context-window management in LLM systems
- Ability to work with high autonomy, collaborating with teams to translate their goals into clear, testable criteria backed by effective tooling
- Experience working in environments with rapid iteration and experimental development
- Familiarity with CI/CD workflows and automated testing
Responsibilities
- Design and implement robust evaluation frameworks for GenAI and LLM-based systems
- Develop tooling to enable automated, low-friction evaluation of model outputs, prompts, and agent behaviors
- Define and refine metrics for both the structural and semantic quality of model outputs, ensuring alignment with realistic use cases and operational constraints (a minimal sketch of such a check follows this list)
- Lead the development of dataset management processes and guide teams across Grafana in best practices for GenAI evaluation
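For illustration only, here is a minimal sketch of the kind of structural-plus-semantic evaluation such tooling might perform. The function names (`evaluate_output`, `structural_score`, `semantic_score`), the required keys, and the token-overlap metric are hypothetical placeholders, not part of the role description; a production framework would likely swap in schema validation and an embedding-based or LLM-as-judge semantic metric.

```python
# Hypothetical sketch: a structural + semantic check for a single LLM output.
# All names and metrics here are illustrative assumptions, not a prescribed design.
import json
from typing import Any


def structural_score(raw_output: str, required_keys: set[str]) -> float:
    """Return 1.0 if the output parses as a JSON object containing every required key, else 0.0."""
    try:
        parsed: Any = json.loads(raw_output)
    except json.JSONDecodeError:
        return 0.0
    return 1.0 if isinstance(parsed, dict) and required_keys <= parsed.keys() else 0.0


def semantic_score(candidate: str, reference: str) -> float:
    """Cheap semantic proxy: Jaccard overlap of lowercased tokens."""
    cand, ref = set(candidate.lower().split()), set(reference.lower().split())
    return len(cand & ref) / len(cand | ref) if cand | ref else 0.0


def evaluate_output(raw_output: str, reference: str, required_keys: set[str]) -> dict[str, float]:
    """Combine structural and semantic metrics for one model response."""
    return {
        "structure": structural_score(raw_output, required_keys),
        "semantics": semantic_score(raw_output, reference),
    }


if __name__ == "__main__":
    sample = '{"summary": "High error rate on the checkout service", "severity": "high"}'
    print(evaluate_output(sample, "checkout service error rate spike", {"summary", "severity"}))
```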
Other
- Passion for minimizing human toil and building AI systems that actively support engineers
- Pragmatic mindset that values reproducibility, developer experience, and thoughtful trade-offs when scaling GenAI systems
- Experience working in a remote environment (USA time zones only)