Senior Software Development Engineer - AI/ML, AWS Neuron, Multimodal Inference

Amazon.com

$151,300 - $261,500
Dec 9, 2025
Seattle, WA, US

Amazon Web Services (AWS) is looking to accelerate deep learning and GenAI workloads on its custom machine learning accelerators, Inferentia and Trainium, by optimizing the AWS Neuron SDK. This involves improving performance, enabling a wide range of models, and supporting novel architectures.

Requirements

  • Fundamentals of machine learning and LLMs, including their architectures and their training and inference lifecycles, along with work experience optimizing model execution.
  • Software development experience in C++ or Python (experience in at least one of these languages is required).
  • Strong understanding of system performance, memory management, and parallel computing principles.
  • Proficiency in debugging, profiling, and implementing best software engineering practices in large-scale systems.
  • Familiarity with PyTorch, JIT compilation, and AOT tracing.
  • Familiarity with CUDA kernels or equivalent ML or low-level kernels.
  • Candidates with experience developing performant kernels, for example with CUTLASS or FlashInfer, would be well suited.

Responsibilities

  • Design, develop, and optimize machine learning models and frameworks for deployment on custom ML hardware accelerators.
  • Participate in all stages of the ML system development lifecycle, including distributed-computing-based architecture design, implementation, performance profiling, hardware-specific optimizations, testing, and production deployment.
  • Build infrastructure to systematically analyze and onboard multiple models with diverse architectures.
  • Design and implement high-performance kernels and features for ML operations, leveraging the Neuron architecture and programming models.
  • Analyze and optimize system-level performance across multiple generations of Neuron hardware.
  • Conduct detailed performance analysis using profiling tools to identify and resolve bottlenecks.
  • Implement optimizations such as fusion, sharding, tiling, and scheduling.

Other

  • 5+ years of non-internship professional software development experience.
  • 5+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems.
  • Work safely and cooperatively with other employees, supervisors, and staff.
  • Adhere to standards of excellence despite stressful conditions.
  • Communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service.