Rubrik is building the Rubrik Agent Cloud platform to monitor, govern, and remediate AI agents at scale.
Requirements
- Experience building large-scale distributed systems — especially those that process real-time data, enforce policies, or govern access at scale.
- Background in AI or agentic systems — familiarity with frameworks like LangChain, AutoGen, or CrewAI, and a deep understanding of how agents orchestrate tools and APIs.
- Proficiency in cloud-native architectures — including container orchestration (Kubernetes), microservices, and modern observability stacks.
- Knowledge of enterprise identity, access, and security systems — e.g., Okta, Entra, OAuth, MDMs, EDRs, and data governance platforms.
- Experience with modern AI and LLM infrastructure — including model gateways (such as LiteLLM), protocols like MCP, fine-tuning, inference optimization, or policy enforcement in AI workloads.
- Strong programming skills in languages like Go, Python, or Java, and familiarity with designing APIs and distributed data pipelines.
- Interest in AI governance, risk, and compliance — and curiosity about how to make AI agents safe, observable, and trustworthy in real-world enterprise environments.
Responsibilities
- Design and implement distributed systems that connect the dots between AI agents, identities, and data — giving enterprises real-time visibility into what agents can access and what they’re doing.
- Build governance and security frameworks that detect and prevent risky agent behaviors.
- Develop the infrastructure that powers “agent observability” — streaming telemetry, audit trails, and behavioral analytics across thousands of agents.
- Architect and scale systems that handle millions of agent decisions per day with low latency and high reliability.
- Design and operate real-time enforcement systems that make it possible to safely deploy AI agents in enterprise environments.
- Create intelligent feedback and evaluation systems to continuously assess and improve agent trustworthiness.
- Design scalable mechanisms for agent discovery and classification.
Other
- BS/MS/PhD in Computer Science or a related field — with a strong foundation in systems, distributed computing, or security.
- Desire to push boundaries — staying current with the latest research in agent safety, access control, and AI infrastructure design.
- Ability to work in a collaborative team environment.
- Commitment to fostering a culture of inclusion and diversity.
- Equal Opportunity Employer/Veterans/Disabled