Staff Software Engineer - ML Observability

Datadog

Salary not specified
Aug 29, 2025
New York, NY, US

Datadog's ML Observability team is building tools to monitor, explain, and improve AI systems in production, specifically focusing on Large Language Models (LLMs) and generative AI. The goal is to provide robust, scalable observability for AI workloads, enabling customers to deploy AI with confidence.

Requirements

  • Deep understanding of distributed systems and scalable backend architectures
  • Hands-on experience building and shipping LLM-powered or GenAI applications
  • Understanding of model internals, inference pipelines, evaluation techniques, and prompt engineering
  • Experience with observability tools/platforms

Responsibilities

  • Drive design and implementation of LLM observability features
  • Ideate, prototype, and scale new product features to provide insights and drive improvements for generative AI systems
  • Develop and extend tools for tracing, evaluating, and debugging LLMs
  • Influence architecture decisions and mentor engineers to build resilient, high-performance systems
  • Stay current with industry trends and advancements in machine learning and observability, driving innovation within the team

Other

  • Work cross-functionally with other engineering teams, product, UX, and applied science to iterate fast and find product-market fit
  • Stay close to customer pain points and use those insights to guide product and engineering priorities
  • BS/MS/PhD in Computer Science, Engineering, or a related scientific field, or equivalent experience
  • Ability to thrive in ambiguous, fast-changing spaces, with a product-oriented mindset
  • Communicate clearly, think rigorously, and take pride in clean, maintainable code