Mirage is looking for a Software Engineer to build and scale the data systems that power our machine learning products. The role centers on data engineering and ML infrastructure: operating large-scale streaming pipelines and ensuring feature data is reliable, discoverable, and performant.
Requirements
- 4+ years building distributed data systems, feature platforms, or ML infrastructure at scale.
- Strong experience with streaming and batch pipelines (e.g. Pub/Sub, Kafka, Dataflow, Beam, Flink, Spark).
- Deep knowledge of cloud-native data stores (e.g. Bigtable, BigQuery, DynamoDB, Snowflake) and schema/versioning best practices.
- Proficiency in Python and experience building developer-facing libraries or SDKs.
- Experience with Kubernetes, containerized data infrastructure, and workflow orchestration tools (e.g. Airflow, Temporal).
- Familiarity with ML workflows and feature store design — enough to partner closely with ML teams.
- Bonus: Experience working with video, audio, or other unstructured media data in a production environment.
Responsibilities
- Design and scale feature pipelines: Build distributed data processing systems for feature extraction, orchestration, and serving — including real-time streaming, batch ingestion, and CDC workflows.
- Implement feature extraction: Design and build reliable, reusable feature pipelines for ML models, ensuring features are accurate, scalable, and production-ready through well-designed SDKs and orchestration tools.
- Build and evolve storage infrastructure: Manage multi-tier data systems (e.g. Bigtable for online features/state, BigQuery for analytics and offline training), including schema evolution, versioning, and compatibility.
- Own orchestration and reliability: Lead workflow orchestration design (e.g. Pub/Sub, Busboy, Airflow/Temporal), monitoring, and alerting to ensure reliability at 100M+ video scale.
- Collaborate with ML teams: Partner with ML engineers on feature availability, dataset curation, and streaming pipelines for training and inference.
- Optimize for performance and cost: Tune GPU utilization, resource allocation, and data processing efficiency to maximize system throughput and minimize cost.
- Enable analytics and insights: Support downstream analytics and data science workflows by ensuring data accessibility, discoverability, and performance at scale.
Other
- All of our roles require you to work in person at our NYC HQ (located in Union Square)
- We do not work with third-party recruiting agencies; please do not contact us
- Comprehensive medical, dental, and vision plans
- 401(k) with employer match
- Generous PTO policy