Staff Software Engineer, Inference (Bay Area / Paris / Remote)

Genesis AI

Salary not specified
Sep 9, 2025
San Carlos, CA, US

What You’ll Do

  • Build low-latency inference pipelines for on-device deployment, enabling real-time next-token and diffusion-based control loops in robotics
  • Design and optimize distributed inference systems on GPU clusters, pushing throughput with large-batch serving and efficient resource utilization
  • Implement efficient low-level code (CUDA, Triton, custom kernels) and integrate it seamlessly into high-level frameworks
  • Optimize workloads for both throughput (batching, scheduling, quantization) and latency (caching, memory management, graph compilation)
  • Develop monitoring and debugging tools to guarantee reliability, determinism, and rapid diagnosis of regressions across both the on-device and cluster stacks

What You’ll Bring

  • 8+ years of deep experience in distributed systems, ML infrastructure, or high-performance serving
  • Production-grade expertise in Python, with a strong background in systems languages (C++/Rust/Go)
  • Low-level performance mastery: CUDA, Triton, kernel optimization, quantization, memory and compute scheduling
  • Proven track record of scaling inference workloads in both throughput-oriented cluster environments and latency-critical on-device deployments
  • System-level mindset with a history of tuning hardware–software interactions for maximum efficiency, throughput, and responsiveness