xAI is looking to build a supercomputing platform to power its ambitious goals in large-scale AI infrastructure.
Requirements
- Proficiency in Golang, Rust, Python, or similar languages.
- Familiarity with modern developer tools (Bazel, Buildkite, Argo, Kubernetes).
- Familiarity with service meshes (Envoy).
- Familiarity with observability frameworks.
- Experience with large-scale storage systems (KV stores, RDBMS, object stores).
- Experience with Kubernetes ecosystem development (controllers, plugins).
- Tech stack: Golang, Python, Rust, gRPC, Kubernetes, Bazel, Buildkite, Argo, Envoy.
Responsibilities
- Enhance developer experience by modernizing build systems, reducing build times, and implementing tools like Bazel in a monorepo environment.
- Scale compute infrastructure on Kubernetes by building controllers, admission plugins, and supporting systems that empower teams to leverage Kubernetes effectively.
- Design and maintain one of the largest traffic shaping and load balancing deployments using Envoy, while building service meshes and service discovery systems to handle massive scale.
- Scale data platforms and observability systems (logging, tracing, metrics) to support exabyte-scale data processing and provide deep insights into system performance.
- Drive reliability, standardization, and performance by building and refining systems with a pedantic focus on quality and scalability.
- Manage and optimize large-scale storage systems, including key-value stores, relational databases, and network file systems or object stores (open-source, cloud-managed, and in-house solutions).
- Contribute to miscellaneous quality-of-life improvements that empower developers and streamline workflows.
Other
- 2+ years of industry experience working with large-scale, high-throughput distributed systems, compute platforms, or data infrastructure.
- Passionate about reliability, performance optimization, and building systems that scale seamlessly.
- Strong communication skills.
- Strong work ethic and prioritization skills.
- Located near the Bay Area or open to relocation.