Engineering Manager II, AI Model Foundations

Box

Salary not specified

Sep 5, 2025

Redwood City, CA, US

Box needs to rigorously evaluate Large Language Models (LLMs) and Box AI Agents for enterprise-grade quality, reliability, and trust, so that it can change how organizations work with content and empower customers to transform their workflows.

Requirements

7+ years in machine learning or applied AI, including 2+ years managing engineers with a track record of coaching, hiring, and performance development.
Practical experience evaluating and/or deploying ML systems at scale; you've designed metrics, datasets, or pipelines that informed product decisions.
Strong analytical problem solver who works confidently with large, complex datasets and ambiguous problem spaces.
Proficient in at least one programming language (e.g., Python, C++, Java, or R) and familiar with modern ML frameworks (e.g., PyTorch, TensorFlow, scikit-learn, NumPy, pandas).
Experience with LLMs and retrieval-augmented generation (RAG).
Depth in information retrieval (IR), NLP, and query understanding.
Familiarity with Vertex AI, AWS Bedrock/SageMaker; exposure to Kubernetes-based systems.

Responsibilities

Lead and mentor a team of ML engineers to design, build, and operationalize an evaluation framework for Box AI Agents and foundational LLMs.
Define representative enterprise datasets and metrics; develop grading approaches that assess accuracy, safety, grounding, and usability.
Pioneer and evangelize LLM evaluation methodology tailored to enterprise content management use cases.
Collaborate with AI Platform teams to translate evaluation results into roadmap decisions and measurable agent improvements.
Partner with model providers (e.g., OpenAI, Google, Anthropic) to share findings and influence model capabilities for enterprise needs.
Track AI research and industry trends to continuously evolve our evaluation strategy and tooling.
Manage and coordinate the team's on-call rotation to ensure timely and effective incident response; participate actively in escalated on-call incidents to provide leadership and support; and address recurring issues to minimize disruptions.

Other

We are an AI-first company. This means you approach your work with a growth mindset and find ways to leverage AI to help make faster, smarter decisions that will 10X your impact at Box.
Excellent communicator who collaborates effectively across product, research, and platform teams and with external partners.
Work with senior leadership (CEO, CTO) to set priorities and a clear one-year roadmap for Model Foundations.
Preferred: Proven roadmap planning where short-term wins ladder into a long-term vision.
Boxers are expected to work from their assigned office a minimum of 3 days per week.