Principal AI Infrastructure Abstraction Engineer

Cisco

Salary not specified
Aug 30, 2025
Milpitas, CA, USA • San Jose, CA, USA

Transform how enterprises harness AI by rethinking systems from the ground up and delivering breakthrough solutions that redefine what's possible — faster, leaner, and smarter.

Requirements

  • Experience building scalable, production-grade infrastructure components or control planes using Go, Python, and C++.
  • Experience with Kubernetes, Docker, or KubeVirt for virtualization, containerization, and orchestration.
  • Experience designing or implementing logical resource abstractions for compute, storage, or networking, with a focus on multi-tenant environments.
  • Experience integrating with AI/ML platforms or pipelines (e.g., PyTorch, TensorFlow, Triton Inference Server, MLflow).
  • Experience with GPU sharing, scheduling, or isolation techniques (e.g., MPS, MIG, time-slicing, device plugin frameworks, or vGPU technologies).
  • Solid grasp of resource management concepts including quotas, fairness, prioritization, and elasticity.

Responsibilities

  • Design and implement infrastructure abstractions that cleanly separate logical compute units (vGPUs, GPU pods, AI queues) from physical hardware (nodes, devices, interconnects).
  • Develop runtime services, APIs, and control planes to expose GPU and accelerator resources to users and frameworks with multi-tenant isolation and QoS guarantees.
  • Architect systems for secure GPU sharing, including time-slicing, memory partitioning, and namespace isolation across tenants or jobs.
  • Collaborate with platform, orchestration, and scheduling teams to map logical resources to physical devices based on utilization, priority, and topology.
  • Define and enforce resource usage policies, including fair sharing, quota management, and oversubscription strategies.
  • Integrate with model training and serving frameworks (e.g., PyTorch, TensorFlow, Triton) to ensure smooth and predictable resource consumption.
  • Build observability and telemetry pipelines to trace logical-to-physical mappings, usage patterns, and performance anomalies.

Other

  • This position requires a hybrid working schedule in the San Jose or Milpitas office.
  • Bachelor's + 15 years of related experience, or Master's + 12 years of related experience, or PhD + 8 years of related experience.