Senior Software Engineer II, AI/ML

CoreWeave

$165,000 - $242,000
Sep 26, 2025
Sunnyvale, CA, US

CoreWeave aims to deliver a cloud platform of cutting-edge services that power the next wave of AI, providing enterprises and leading AI labs with the most performant, efficient, and resilient solutions for accelerated computing.

Requirements

  • Strong coding skills in Python or Go (C++ a plus) and deep familiarity with networked systems and performance.
  • Hands-on experience with Kubernetes at production scale, CI/CD, and observability stacks (Prometheus, Grafana, OpenTelemetry).
  • Practical knowledge of inference internals: batching, caching, mixed precision (BF16/FP8), streaming token delivery.
  • Proven track record of improving tail latency (P95/P99) and service reliability through metrics-driven work.
  • Contributions to inference frameworks (vLLM, Triton, TensorRT-LLM, Ray Serve, TorchServe).
  • Experience with CUDA kernels, NCCL/SHARP, RDMA/NUMA, or GPU interconnect topologies.

Responsibilities

  • Lead design reviews and drive architecture within the team; decompose multi-service work into clear milestones.
  • Define and own SLIs/SLOs; ensure post-incident actions land and reliability improves release-over-release.
  • Implement advanced optimizations (e.g., micro-batch schedulers, speculative decoding, KV-cache reuse) and quantify impact.
  • Strengthen incident posture: capacity planning, autoscaling policy, graceful degradation, rollback/traffic-shift strategies.
  • Mentor IC1/IC2 engineers; review cross-team designs and elevate coding/testing standards.
  • Own an area spanning multiple services and teams (e.g., request routing & adaptive scheduling, cost-per-token analytics, GPU resource isolation).
  • Partner with product, orchestration, and hardware teams to evolve our Kubernetes-native inference platform and meet strict P99 SLAs at scale.

Other

  • Approximately 5–8 years of industry experience building distributed systems or cloud services.
  • Experience leading multi-team initiatives or partnering with customers on mission-critical launches.
  • CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace.
  • This position requires access to export-controlled information.
  • The base salary range for this role is $165,000 to $242,000.