
Senior AI / MLOps Engineer – e-Commerce Search & Information Retrieval

Algolia

$200,000 - $350,000
Dec 10, 2025
Seattle, WA, US

At Algolia, the business problem is to build the next generation of AI-powered search products: making AI explainable, helping customers make data-driven decisions, and turning prototypes into robust, scalable, and observable AI services.

Requirements

  • Strong coding skills in Python (preferred) and at least one statically typed language (Go preferred).
  • Hands-on expertise with containerization (Docker), orchestration (Kubernetes/EKS/GKE/AKS), and cloud platforms (AWS, GCP, or Azure).
  • Proven record of building CI/CD pipelines and automated testing frameworks for data or ML workloads.
  • Deep understanding of REST/gRPC APIs, message queues (Kafka, Kinesis, Pub/Sub), and stream/batch data processing frameworks (Spark, Flink, Beam).
  • Experience implementing monitoring, alerting, and logging for mission-critical services.
  • Familiarity with common ML lifecycle tools (MLflow, Kubeflow, SageMaker, Vertex AI, Feature Store, etc.).
  • Working knowledge of ML concepts such as feature engineering, model evaluation, A/B testing, and drift detection.
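As an illustration of the A/B-testing knowledge called for above, here is a minimal sketch (illustrative only, not part of Algolia's stack) of a two-proportion z-test comparing conversion rates between two search-ranking variants:

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference in conversion rates.

    conv_a / conv_b: number of conversions in each variant.
    n_a / n_b: number of users exposed to each variant.
    Returns (z statistic, two-sided p-value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF via math.erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical experiment: variant B lifts conversion from 2.0% to 2.6%.
z, p = two_proportion_z(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

In practice a production experimentation platform would add guardrail metrics, sequential-testing corrections, and minimum sample-size checks, but the core comparison is this simple.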

Responsibilities

  • Productionization & Packaging: Convert notebooks and research codebases into production-ready Python and Go microservices, libraries, or Kubeflow pipelines; design reproducible build pipelines (Docker, Conda, Poetry) and manage artifacts in centralized registries.
  • Scalable Deployment: Orchestrate real-time and batch inference workloads on Kubernetes, AWS/GCP managed services, or similar platforms, ensuring low latency and high throughput; implement blue-green/canary rollouts, automatic rollback, and model versioning strategies (SageMaker, Vertex AI, KServe, MLflow, BentoML, etc.).
  • MLOps & CI/CD: Build and maintain CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, Argo) covering unit, integration, data-quality, and performance tests; automate feature store updates, model retraining triggers, and scheduled batch jobs using Airflow, Dagster, or similar orchestration tools.
  • Observability & Reliability: Define and monitor SLIs/SLOs for model latency, throughput, accuracy, drift, and cost; integrate logging, tracing, and metrics (Datadog, etc.) and establish alerting and on-call practices.
  • Data & Feature Engineering: Collaborate with data engineers to create scalable pipelines that ingest clickstream logs, catalog metadata, images, and user signals; implement real-time and offline feature extraction, validation, and lineage tracking.
  • Performance & Cost Optimization: Profile models and services; leverage hardware acceleration (GPU, TPU), runtime libraries (ONNX, OpenVINO), and caching strategies (Redis, Faiss) to meet aggressive latency targets; right-size clusters and workloads to balance performance with cloud spend.
  • Governance & Compliance: Embed security, privacy, and responsible-AI checks in pipelines; manage secrets, IAM roles, and data-access controls via Terraform or CloudFormation; ensure auditability and reproducibility through comprehensive documentation and artifact tracking.
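The drift monitoring described above can be sketched with a common metric, the Population Stability Index (PSI), which compares a model's live feature or score distribution against a training-time baseline. This is a minimal, illustrative implementation (the bin counts and the 0.2 threshold are conventional heuristics, not Algolia specifics):

```python
import math

def population_stability_index(expected, actual):
    """PSI between two binned distributions (each a list of proportions summing to 1).

    Rule of thumb: PSI < 0.1 is stable, 0.1-0.2 warrants monitoring,
    and PSI > 0.2 is often treated as significant drift.
    """
    eps = 1e-6  # guard against log(0) for empty bins
    psi = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi

# Hypothetical baseline vs. live distribution over four score buckets.
baseline = [0.25, 0.25, 0.25, 0.25]
shifted = [0.10, 0.20, 0.30, 0.40]
print(population_stability_index(baseline, baseline))  # identical -> 0.0
print(population_stability_index(baseline, shifted))   # above the 0.2 drift threshold
```

A production setup would compute this per feature on a schedule (e.g. via Airflow) and wire the result into the alerting stack rather than printing it.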

Other

  • Spend 1-2 days per week in a local coworking space to collaborate with your teammates in person.
  • 5+ years of experience in software engineering with 2+ years focused on deploying ML/AI systems at scale.
  • Ability to receive and give constructive feedback.
  • Genuine care for other team members, our clients, and the decisions we make as a company.
  • Aptitude for learning from others, putting ego aside.