TrueFoundry aims to accelerate the development, deployment, and scaling of GenAI and ML applications, with security, cost efficiency, and cross-cloud flexibility, by embedding modern LLM applications into customer workloads and platform features.
Requirements
- 2–3 years of software engineering experience building backend services or ML infrastructure; comfortable with Python and at least one other language.
- Practical experience using LLMs (OpenAI/Anthropic/other) and building prompt + retrieval workflows.
- Familiarity with at least one vector DB (e.g., Chroma, Pinecone, Weaviate) and embedding pipelines.
- Experience with REST/gRPC APIs, containers (Docker), and basic Kubernetes concepts.
- Strong debugging skills and ability to write clean, testable code.
- Hands-on experience with LangChain or LangGraph and agent architectures.
- Experience with RAG evaluation, prompt engineering best practices, or prompt-testing frameworks.
Responsibilities
- Implement, test, and maintain LLM-powered features and AI Agent / RAG pipelines (prompting, retrieval, vector DB + embeddings).
- Build and extend agent workflows using LangGraph / LangChain or equivalent frameworks; help harden state persistence and retry logic.
- Integrate models and runtimes via the platform’s API (deploy/serve/instrument LLMs, configure token/cost guards).
- Write end-to-end tests, small services, and automation to reproduce customer issues and demo solutions.
- Instrument observability for LLM workloads: logs, traces, latency/cost dashboards, and basic alerting.
- Collaborate with product, support, and customers to convert POCs into documented, repeatable patterns.
Other
- BS/MS in CS or related field (or equivalent industry experience).
- Public repo or demo showing an LLM project, small agent, or RAG pipeline.
- Curiosity about LLM safety, reliability, and cost-efficient deployment.