AI Evaluation Engineer, Device Intelligence

RemoteHunter

$150,000 - $170,000
Dec 1, 2025
Remote, US

The organization is seeking an engineer to develop and implement evaluation strategies for AI systems that support device intelligence in the life sciences, diagnostics, and biotechnology sectors.

Requirements

  • Experience designing and implementing evaluation methodologies for AI systems, including LLMs and computer vision.
  • Knowledge of metrics for AI performance, robustness, and fairness, especially in regulated domains.
  • Expertise in at least three of the following: benchmarking frameworks, statistical validation, synthetic data generation, adversarial testing, explainability techniques.
  • Proficiency in Python and ML libraries (such as PyTorch or TensorFlow) and familiarity with evaluation tools (such as OpenAI Evals, Dynabench, or Promptfoo).
  • Experience with regulatory processes for medical devices and AI/ML-based software as a medical device (SaMD) is a plus.
  • Familiarity with quality management systems and standards relevant to life sciences and diagnostics is a plus.
  • Knowledge of instrument control mechanisms and integration with AI systems is a plus.

Responsibilities

  • Define and execute evaluation strategies for AI products in life sciences, diagnostics, and biotechnology.
  • Design and implement evaluation frameworks for agentic workflows, LLMs, NLP, computer vision, and multimodal models.
  • Develop and conduct evaluation plans to assess performance, reliability, and safety across multimodal datasets.
  • Analyze evaluation results, identify weaknesses, and recommend improvements for AI models and workflows.
  • Build automated pipelines for continuous evaluation and monitoring of AI systems in production.

Other

  • Collaborate with senior leaders and product teams to align evaluation criteria with KPIs and regulatory needs.
  • Communicate complex evaluation results to technical and non-technical stakeholders.
  • This is a remote position, available in Europe or the Eastern US.