Senior MLOps Engineer – LLMOps

TRM Labs

$215,000 - $230,000

Nov 24, 2025

Remote, US

TRM Labs is looking to build and scale the technical infrastructure for AI/ML systems, with a specific focus on enabling next-generation AI applications, Large Language Models (LLMs), and agentic systems. The goal is to build robust pipelines, high-performance infrastructure, and operational tooling for fast, safe, and scalable AI deployment.

Requirements

Write high-quality, maintainable software — primarily in Python, but we value engineering ability over language familiarity.
Have a strong background in scalable infrastructure, including:
  Containerization and orchestration (e.g. Docker, Kubernetes)
  Infrastructure-as-code and deployment (e.g. Terraform, CI/CD pipelines)
  Monitoring and logging frameworks (e.g. Datadog, Prometheus, OpenTelemetry)
Understand and implement MLOps best practices, including:
  Model versioning and rollback strategies
  Automated evaluation and drift detection
  Scalable model and agent serving infrastructure (e.g. vLLM, Triton, BentoML)

Responsibilities

Build reusable CI/CD workflows for model training, evaluation, and deployment, integrating tools such as Langfuse, GitHub Actions, and experiment tracking.
Automate model versioning, approval workflows, and compliance checks across environments.
Build out a modular and scalable AI infrastructure stack — including vector databases, feature stores, model registries, and observability tooling.
Partner with engineering and data science to embed AI models and agents into real-time applications and workflows.
Continuously evaluate and integrate state-of-the-art AI tools (e.g. LangChain, LlamaIndex, vLLM, MLflow, BentoML).
Drive AI reliability and governance, enabling experimentation while ensuring compliance, security, and uptime.
Deploy infrastructure to support offline and online evaluation of LLMs and agents — including regression testing, cost monitoring, and human-in-the-loop workflows.

Other

Demonstrate strong ownership and pragmatism, balancing infrastructure elegance with iterative delivery and measurable impact.
Rapid Issue Resolution. TRM Engineers identify and resolve critical onsite issues in minutes to hours, not weeks.
Navigating Bureaucracy. We anticipate and address procedural hurdles, build trust with key stakeholders, and find alternative pathways to approvals.
Efficient Knowledge Transfer. Engineers document and share updates in real time, ensuring the entire team—onsite and remote—has full visibility into plans, blockers, and resolutions.
TRM may not be the right fit for everyone. If you're optimizing for work-life balance, we encourage you to ask your interviewers how they personally approach balance within their teams, and to reflect on whether this is the right season in your life to join a high-velocity environment.