AWS Neuron develops, enables and performance-tunes building blocks for all key ML model families, including Llama3, GPT OSS, Qwen3, DeepSeek and beyond, delivering high-performance distributed inference solutions for the latest generation of Trainium accelerators.
Requirements
- Experience optimizing LLM inference performance with kernels, Python, PyTorch or JAX
- Experience programming with at least one programming language
- Experience with design or architecture (design patterns, reliability and scaling) of new and existing systems
- Experience with full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Experience with compiler and runtime engineering
- Experience with distributed inference solutions
- Experience with machine learning accelerators
Responsibilities
- Develop optimized building blocks for the Neuron distributed inference library, tuning them for maximum performance and efficiency on Trn2 and Trn3 servers
- Create metrics, implement automation and other improvements, and identify and resolve the root causes of software defects
- Participate in design discussions and code reviews, and communicate with internal and external stakeholders
- Work cross-functionally with teams across Neuron in a fast-paced, startup-like development environment
- Develop new technology components and optimize LLM inference performance with kernels, Python, PyTorch or JAX
- Build and tune high-performance distributed inference solutions for the latest-generation Trainium accelerators
Other
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture experience
- Bachelor's degree in computer science or equivalent
- Ability to work in a fast-paced, startup-like development environment
- Ability to communicate with internal and external stakeholders