Software Engineer 5 - Offline Inference - Machine Learning Platform

Netflix

$100,000 - $720,000

Sep 22, 2025

Remote, US

Netflix is looking to build a scalable and reliable machine learning platform that accelerates every ML practitioner at the company, with a focus on the batch-prediction layer and large-scale batch inference workloads.

Requirements

Hands-on experience with ML engineering or production systems involving training or inference of deep-learning models.
Proven track record of operating scalable infrastructure for ML workloads (batch or online).
Proficiency in one or more modern backend languages (e.g. Python, Java, Scala).
Production experience with containerization & orchestration (Docker, Kubernetes, ECS, etc.) and at least one major cloud provider (AWS preferred).
Deep understanding of real-world ML development workflows and close partnership with ML researchers or modeling engineers.
Familiarity with cloud-based AI/ML services (e.g., SageMaker, Bedrock, Databricks, OpenAI, Vertex) or open-source stacks (Ray, Kubeflow, MLflow).
Experience optimizing inference for large language models, computer-vision pipelines, or other foundation models (e.g., FSDP, tensor/pipeline parallelism, quantization, distillation).

Responsibilities

Build developer-friendly APIs, SDKs, and CLIs that let researchers and engineers (experts and non-experts alike) submit and manage batch inference jobs with minimal effort, particularly in the domain of content and media.
Design, implement, and operate distributed services that package, schedule, execute, and monitor batch inference workflows at massive scale.
Instrument the platform for reliability, debuggability, observability, and cost control; define SLOs and share an equitable on-call rotation.
Foster a culture of engineering excellence through design reviews, mentorship, and candid, constructive feedback.

Other

Excellent written and verbal communication skills; effective collaboration across distributed teams and time zones.
Comfortable working in a team with peers and partners distributed across US geographies and time zones.
Commitment to operational best practices—observability, logging, incident response, and on-call excellence.
Bachelor's, Master's, or Ph.D. degree in Computer Science or a related field (implied; not explicitly stated).
Full-time hourly employees accrue 35 days of paid time off annually, to be used for vacation, holidays, and sick time.