Box needs to build and evaluate the foundational AI agents that power the Box AI ecosystem. This work enables teams to design, deploy, and operate AI agents for real-world enterprise workflows, and transforms how organizations manage content and automate processes in an AI-first era.
Requirements
- Strong background in machine learning, information retrieval, or natural language processing.
- Proficiency with at least one programming language such as Python, Java, or Scala.
- Experience designing, training, and evaluating ML models in production.
- Familiarity with retrieval systems, ranking models, RAG pipelines, or intent classification.
- Hands-on experience with LangChain, LangGraph, or other agent frameworks.
- Familiarity with LLMs, embeddings, semantic search, indexing, and relevance optimization.
- Experience with cloud-based ML platforms such as Vertex AI, AWS Bedrock, or SageMaker.
Responsibilities
- Build, evaluate, and evolve foundational agents such as DeepSearch, DeepResearch, Extract, and Compose.
- Develop techniques for intent detection, query understanding, ranking, and RAG to improve accuracy and relevance.
- Define metrics, evaluation pipelines, and benchmarks for agent quality, including precision/recall, factual grounding, and latency trade-offs.
- Research and implement best practices in retrieval, orchestration, and evaluation of multi-agent workflows.
- Collaborate with platform engineers to design core components that enable secure, reliable, and scalable deployment of agents.
- Partner with product teams to translate enterprise use cases into agentic solutions, ensuring measurable improvements in user experience.
- Contribute to technical discussions, share research insights, and help define the roadmap for Box’s agent ecosystem.
Other
- You are passionate about building and evaluating AI agents that solve enterprise problems.
- You enjoy working at the intersection of machine learning and distributed systems, bridging research with production.
- You like to be an owner and strive to do work you’re proud of, both technically and in your team interactions.
- You are collaborative, curious, and comfortable mentoring or learning from other engineers and ML practitioners.
- Boxers are expected to work from their assigned office a minimum of 3 days per week.