Staff Software Engineer, Compute

Fal

$180,000 - $250,000
Sep 3, 2025
San Francisco, CA, US

The company is looking for an experienced software engineer to build and maintain large-scale computation platforms, focusing on backend systems that efficiently orchestrate workloads, route requests, and manage resources while ensuring reliability and scalability with minimal operational load.

Requirements

  • Deep experience building distributed compute platforms, preferably with Python
  • Strong foundation in managing both cloud and bare metal infrastructure
  • Solid understanding of Kubernetes (K8s) and of running CI/CD on it
  • Deep expertise in backend systems that orchestrate workloads and route requests efficiently, while taking care of capacity and resource constraints
  • Strong understanding of foundational cloud infrastructure and Linux provisioning and management tools
  • Knowledge of how to achieve reliability and scale with minimal operational load

Responsibilities

  • Develop and maintain our core Python platform, which handles request routing, orchestration of AI workloads, GPU server capacity management, observability, authentication, rate limiting, and more
  • Develop and maintain our infrastructure layer where we use Terraform, Ansible, and provider APIs to manage our fleet of GPU workers
  • Own K8s, FluxCD, Nomad, Prometheus, Thanos, Grafana, Loki, distributed networking storage, and other technologies that underpin our platform
  • Create the vision and lay the foundation for where our infrastructure should go over the next 1, 2, and 5 years

Other

  • Excellent communication skills
  • Self-starter who executes quickly, takes ownership, and constantly seeks improvement
  • We offer visa sponsorship and will help you relocate to San Francisco.