TrueFoundry is scaling its Enterprise Outcomes motion and is hiring a senior leader to build and lead its engineering arm. The role focuses on delivering domain-specific solutions that drive business transformation and shape the product roadmap.
Requirements
- 5–8 years of software engineering experience, including substantial work building distributed systems, infrastructure, or ML platforms.
- Deep practical experience integrating and deploying LLMs in production (RAG, retrieval, embeddings pipelines).
- Hands-on experience with agent orchestration frameworks (LangGraph / LangChain or custom agent runtimes) and stateful workflow design.
- Strong systems knowledge: Kubernetes, container orchestration, service meshes, and performance tuning.
- Proven track record building observability, cost controls, and policy enforcement for production services.
- Experience building or contributing to open-source LLM orchestration tools (LangGraph, LangChain, or similar).
- Familiarity with enterprise constraints: on-prem/cloud hybrid deployments, data residency, compliance requirements.
Responsibilities
- Architect and implement scalable agent orchestration patterns (graph-based executors, state management, multi-agent coordination) for production workloads.
- Own critical integrations: model adapters, LLM gateway hooks, vector DBs, tools and external APIs, and the platform’s LLMops flows.
- Build and improve tracing, benchmarking, and observability for LLMs and agents, covering token/cost accounting, p95 latency, throughput, and correctness checks.
- Drive the design of safety guardrails: moderation hooks, human-in-the-loop checkpoints, replayable audit trails, and policy enforcement.
- Mentor junior engineers, run design reviews, and improve engineering practices (testing, CI/CD, chaos testing for agents).
- Work directly with strategic customers to prototype complex agentic solutions and translate them into product features.
Other
- Demonstrated leadership in cross-functional projects and direct customer engagement.
- BS/MS/PhD in CS or a related field (or equivalent).
- Open-source contributions, architecture blogs, or public talks on agentic LLMs or LLMops.
- Examples of productizing research or shipping complex infra features.