XBOW is building an AI-powered platform to autonomously discover, validate, and exploit vulnerabilities, aiming to put security ahead in the arms race against attackers who are already using AI. The company needs to design and implement complex distributed infrastructure to power its core AI engine and distributed analysis systems, enabling seamless operation across multiple cloud providers and contexts.
Requirements
- Strong experience building and operating scalable, distributed systems on cloud infrastructure such as AWS or similar.
- Comfortable working with infrastructure as code (e.g., Terraform, CDK), treating infra as part of the engineering system rather than something separate from it.
- A track record of performance tuning across cloud services, databases, and compute layers
- Eager to learn new tools, languages, and technologies as needed
- Strong problem-solving skills and the ability to work with incomplete information
- Curious, practical, and eager to work across layers of the stack when needed
- You think proactively about failure modes and bring experience implementing disaster recovery and business continuity plans that keep critical systems running.
Responsibilities
- Design and implement infrastructure systems that scale reliably and securely and can be deployed across multiple cloud environments (AWS, Azure, OCI, etc.) and contexts (SaaS, on-prem).
- Tune and optimize cloud services across compute, storage, networking, and observability to drive performance, reliability and maintainability of core services.
- Develop our core services, written in TypeScript, Kotlin, and Go, to support our unique deployment and infrastructure requirements, picking up these languages quickly if you haven’t used them before.
- Support large-scale systems with event-driven architectures.
- Own problems end-to-end—from design through deployment to production support
- Navigate ambiguity and help define how we build as much as what we build
- Design for resilience by implementing disaster recovery and business continuity strategies that ensure uptime, even when things break
Other
- A thoughtful communicator who values clarity and simplicity, and who is comfortable working in a fast-paced startup and navigating ambiguity
- Partner closely with other engineers, AI researchers, and security researchers to enable high-quality, high-velocity product development
- Experience working in an early-stage startup
- Remote (all team members are remote, but we meet regularly and support travel to collaborate with colleagues in person)
- We believe in people who are driven by curiosity and a willingness to learn.