AI Agent Evaluation Analyst

Swooped

Salary not specified
Oct 16, 2025
Remote, US

Our client is seeking QA analysts for autonomous AI agents. The role involves validating and improving complex task structures, policy logic, and agent evaluation frameworks, with the aim of unlocking the potential of generative AI through global, real-world insights.

Requirements

  • Familiarity with structured data formats: able to read, though not necessarily write, JSON/YAML.
  • Ability to assess scenarios holistically: What's missing, what's unrealistic, what might break?
  • Experience with policy evaluation, logic puzzles, case studies, or structured scenario design.
  • Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research.
  • Exposure to LLMs, prompt engineering, or AI-generated content.
  • Familiarity with QA or test-case thinking (edge cases, failure modes, “what could go wrong”).
  • Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.).

Responsibilities

  • Reviewing evaluation tasks and scenarios for logic, completeness, and realism.
  • Identifying inconsistencies, missing assumptions, or unclear decision points.
  • Helping define clear expected behaviors (gold standards) for AI agents.
  • Annotating cause-effect relationships, reasoning paths, and plausible alternatives.
  • Thinking through complex systems and policies as a human would to ensure agents are tested properly.
  • Working closely with QA, writers, or developers to suggest refinements or additional edge-case coverage.

Other

  • Curious, analytical, and proactive contributors who are comfortable with ambiguity and eager to learn how modern AI systems are evaluated.
  • Excellent analytical thinking: Can reason about complex systems, scenarios, and logical implications.
  • Strong attention to detail: Can spot contradictions, ambiguities, and vague requirements.
  • Good communication and clear writing (in English) to document your findings.
  • This flexible, remote, project-based opportunity is ideal for analysts, researchers, consultants, or advanced students looking for intellectually engaging, part-time, and non-permanent work.