Design, build, deploy, and maintain robust, scalable infrastructure that powers cutting-edge artificial intelligence (AI) and machine learning (ML) initiatives.
Requirements
- Strong programming skills in Python, C++, Go, or Rust for systems development and automation.
- Ability to design end-to-end systems that balance performance, reliability, security, and cost.
- Hands-on experience with ML training frameworks (PyTorch, TensorFlow, JAX) at scale.
- Knowledge of hardware-level optimization: CUDA, ROCm, kernel bypass, FPGA/ASIC integration.
- Experience with heterogeneous computing for AI, big data, and HPC workloads.
- Open-source contributions or patents in the ML systems space.
Responsibilities
- Lead end-to-end design of scalable, reliable AI infrastructure (AI accelerators, compute clusters, storage, networking) for training and serving large ML workloads.
- Define and implement service-oriented, containerized architectures (Kubernetes, VM frameworks, unikernels) optimized for ML performance and security.
- Profile and optimize every layer of the ML stack: ML compilers, GPU/TPU scheduling, NCCL/RDMA networking, data preprocessing, and training/inference frameworks.
- Develop low-overhead telemetry and benchmarking frameworks to identify and eliminate bottlenecks in distributed training and serving.
- Build and operate large-scale deployment and orchestration systems that auto-scale across multiple data centers (on-premises and cloud).
- Architect and implement robust ETL and data ingestion pipelines (Spark/Beam/Dask/Flume) tailored for petabyte-scale ML datasets.
- Integrate experiment management and workflow orchestration tools (Airflow, Kubeflow, Metaflow) to streamline the path from research to production.
Other
- Master's degree (PhD preferred) in Computer Science, Engineering, or a related technical field.
- 5+ years in infrastructure or systems engineering roles, with at least 2 years focused on ML/AI infrastructure.
- Excellent communicator able to bridge research and production teams.
- Strong problem-solving aptitude and a drive to push the state of the art in ML infrastructure.
- Publications in top-tier ML or systems conferences such as MLSys, ICML, ICLR, KDD, or NeurIPS (preferred).