Onebrief is looking to solve complex military requirements for planning critical operations by transforming them into robust, production-ready systems using AI and ML.
Requirements
- 10+ years of experience with large-scale distributed systems, ideally supporting more than 100k concurrent users
- Proven experience designing and deploying AI systems at scale in distributed, production environments
- Strong background in integrating LLMs with retrieval systems for real-world use cases
- Understanding of data governance, model safety evaluations, red-teaming, and secure ML practices for regulated domains
- Experience building systems that use information retrieval (relevance and ranking) and natural language processing techniques such as Named Entity Recognition (NER)
- Experience orchestrating systems that combine text, structured data, and domain-specific signals
Responsibilities
- Define system-level standards for model development, evaluation, and deployment, while mentoring senior and staff-level engineers through collaboration.
- Drive long-term AI/ML architecture at Onebrief.
- Agent Orchestration: Design systems that coordinate multiple AI agents, tools, and workflows to accomplish complex operational tasks.
- Design and implement enterprise-scale AI infrastructure supporting retrieval, generation, and multi-modal reasoning.
- Build and scale graph-based systems for structured representation, reasoning, and integration with generative models.
- Apply LLM practices such as RAG and prompt engineering to deliver grounded, reliable outputs for mission planning.
- Establish evaluation frameworks and SLOs for RAG quality, agent reliability, and system performance in production.
Other
- Partner with product, domain experts, and leadership to translate operational needs into technical roadmaps.
- M.S. in Computer Science, Engineering, or equivalent practical experience
- Background in defense, national security, or other mission-critical domains