(Senior) Software Engineer, Infrastructure (Kubernetes Platform)

pony.ai

$120,000 - $240,000
Sep 15, 2025
Fremont, CA, US

Pony.ai is hiring a (Senior) Kubernetes Engineer to design, operate, and optimize Kubernetes clusters across hybrid cloud environments. These clusters support diverse workloads, including large-scale model training and low-latency inference services.

Requirements

  • 3+ years of hands-on experience managing Kubernetes clusters in production (EKS/GKE/AKS and/or bare-metal).
  • Strong Linux systems background and distributed systems fundamentals (scheduling, reliability, scaling).
  • Proven experience with hybrid cloud environments (AWS, GCP, Azure, and on-prem).
  • Expertise in containerization (Docker) and Infrastructure-as-Code tools (Terraform, Helm, Ansible, or similar).
  • Experience developing and maintaining Kubernetes platform features (operators, CRDs, APIs).
  • Solid knowledge of Kubernetes networking (CNI, ingress, service discovery), storage, and compute integrations.
  • Strong understanding of security best practices (RBAC, network policies, secrets).

Responsibilities

  • Design, operate, and optimize Kubernetes clusters across hybrid cloud environments (public cloud and on-prem datacenter).
  • Support diverse workloads including large-scale model training and low-latency inference services.
  • Develop, maintain, and extend Kubernetes platform features (operators, CRDs, APIs) to automate and productize internal use cases.
  • Own cluster lifecycle management including upgrades, patching, configuration, and governance.
  • Define and enforce best practices for service deployments, security policies, and operational guidelines.
  • Contribute to observability and SRE practices to ensure reliability at scale (SLOs, incident reviews, metrics-driven improvements).
  • Collaborate with storage, compute, and networking teams (CNI, ingress, service discovery) to enhance automation, availability, and performance.

Other

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • Effective communication skills and the ability to work cross-functionally in a fast-paced environment.
  • Provide technical mentorship, documentation, and on-call support for cluster-related incidents.