Software Engineer - Infrastructure - X

xAI

$180,000 - $440,000
Oct 9, 2025
Palo Alto, CA, US

xAI is looking to build a supercomputing platform to power its ambitious goals in large-scale AI infrastructure.

Requirements

  • Proficient in Golang, Rust, Python, or similar languages.
  • Familiarity with modern developer tools (Bazel, Buildkite, Argo, Kubernetes).
  • Familiarity with service meshes (Envoy).
  • Familiarity with observability frameworks.
  • Experience with large-scale storage systems (KV stores, RDBMS, object stores).
  • Experience with Kubernetes ecosystem development (controllers, plugins).
  • Key technologies: Golang, Python, Rust, gRPC, Kubernetes, Bazel, Buildkite, Argo, Envoy.

Responsibilities

  • Enhance developer experience by modernizing build systems, reducing build times, and implementing tools like Bazel in a monorepo environment.
  • Scale compute infrastructure on Kubernetes by building controllers, admission plugins, and supporting systems that empower teams to leverage Kubernetes effectively.
  • Design and maintain one of the largest traffic shaping and load balancing deployments using Envoy, while building service meshes and service discovery systems to handle massive scale.
  • Scale data platforms and observability systems (logging, tracing, metrics) to support exabyte-scale data processing and provide deep insights into system performance.
  • Drive reliability, standardization, and performance by building and refining systems with a pedantic focus on quality and scalability.
  • Manage and optimize large-scale storage systems, including key-value stores, relational databases, and network file systems or object stores (open-source, cloud-managed, and in-house solutions).
  • Contribute to miscellaneous quality-of-life improvements that empower developers and streamline workflows.

Other Qualifications

  • 2+ years of industry experience working with large-scale, high-throughput distributed systems, compute platforms, or data infrastructure.
  • Passionate about reliability, performance optimization, and building systems that scale seamlessly.
  • Strong communication skills.
  • Strong work ethic and prioritization skills.
  • Located near the Bay Area or open to relocation.