Get Jobs Tailored to Your Resume

Filtr uses AI to scan 1,000+ jobs and find postings that closely match your resume

Senior Engineer, AI/ML Software

Analog Devices

$108,800 - $149,600
Nov 6, 2025
Wilmington, MA, US

Analog Devices (ADI) aims to deliver a world-class AI/ML developer experience for its software engineers and data scientists by establishing a global XOps team. This role is crucial for designing and optimizing complete systems, resolving technical issues, and leading the development of major ML/AI operational features to enhance the developer experience across infrastructure, pipelines, deployment, monitoring, governance, and cost/risk optimization.

Requirements

  • Expert in infrastructure-as-code and GitOps practices, with demonstrable skills in Terraform, AWS CDK (Python), Argo CD, and/or other IaC and CI/CD systems (a minimal CDK sketch follows this list).
  • Hands-on experience managing Kubernetes clusters (for ML workloads) and designing/implementing ML workflow orchestration solutions and data pipelines (e.g., Argo, Kubeflow, Airflow).
  • Solid understanding of foundation models (LLMs) and their applications in enterprise ML/AI solutions.
  • Strong background in AWS DevOps practices and cloud architecture, including services such as Bedrock, SageMaker, EC2, S3, RDS, Lambda, and managed MLflow. Hands-on design and implementation of microservices architectures, APIs, and database management (SQL/NoSQL).
  • Proven track record of monitoring and optimizing cloud/ML infrastructure for scalability and cost-efficiency.
  • Deep understanding of the Data Science Lifecycle (DSLC) and proven ability to shepherd data science or ML/AI projects from inception through production within a platform architecture.
  • Expertise in feature stores, model registries, model governance and compliance frameworks specific to ML/AI (e.g. explainability, audit trails).
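
As a rough illustration of the infrastructure-as-code expectation above, the sketch below shows a minimal AWS CDK (Python) stack that provisions a model-artifact bucket and a SageMaker execution role. The stack, bucket, and role names are illustrative assumptions, not details from the posting.

    from aws_cdk import App, RemovalPolicy, Stack
    from aws_cdk import aws_iam as iam
    from aws_cdk import aws_s3 as s3
    from constructs import Construct


    class MlArtifactStack(Stack):
        """Illustrative stack: an artifact bucket plus a role SageMaker jobs can assume."""

        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)

            # Versioned, encrypted bucket for model artifacts and experiment outputs.
            artifacts = s3.Bucket(
                self,
                "ModelArtifacts",
                versioned=True,
                encryption=s3.BucketEncryption.S3_MANAGED,
                removal_policy=RemovalPolicy.RETAIN,
            )

            # Execution role that SageMaker training/inference jobs can assume,
            # granted read/write access to the artifact bucket only.
            exec_role = iam.Role(
                self,
                "SageMakerExecRole",
                assumed_by=iam.ServicePrincipal("sagemaker.amazonaws.com"),
            )
            artifacts.grant_read_write(exec_role)


    app = App()
    MlArtifactStack(app, "MlArtifactStack")
    app.synth()

In a GitOps setup, a stack like this would be synthesized and deployed from version control through the CI/CD system rather than applied by hand.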

Responsibilities

  • Design and implement resilient cloud-based ML/AI operational capabilities that advance our system attributes: learnability, flexibility, extensibility, interoperability, and scalability.
  • Architect and implement scalable AWS ML/AI cloud infrastructure to support end-to-end lifecycle of models, agents, and services.
  • Establish governance frameworks for ML/AI infrastructure management (e.g., provisioning, monitoring, drift detection, lifecycle management) and ensure compliance with industry-standard processes.
  • Define and ensure principled validation pathways (testing, QA, evaluation) for early-stage GenAI/LLM/Agent-based proofs of concept across the organization.
  • Lead and provide guidance on Kubernetes (k8s) cluster management for ML workflows, including choosing/implementing workflow orchestration solutions (e.g., Argo vs. Kubeflow) and data-pipeline creation, management, and governance using tools such as Airflow (see the pipeline sketch after this list).
  • Design and develop infrastructure-as-code (IaC) in AWS CDK (in Python) and/or Terraform along with GitOps to automate infrastructure deployment and management.
  • Monitor, analyze, and optimize cloud infrastructure and ML/AI model workloads for scalability, cost-efficiency, reliability, and performance.
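
To give a concrete feel for the data-pipeline responsibility above, here is a minimal Airflow DAG sketch (assuming Airflow 2.4+). The DAG id, task names, and placeholder callables are hypothetical, not taken from the posting.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract_features(**context):
        # Placeholder: pull raw data and compute features.
        ...


    def train_model(**context):
        # Placeholder: launch a training job (e.g., via the SageMaker SDK).
        ...


    def register_model(**context):
        # Placeholder: log the trained model to a registry such as MLflow.
        ...


    with DAG(
        dag_id="ml_training_pipeline",   # hypothetical pipeline name
        start_date=datetime(2025, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        features = PythonOperator(task_id="extract_features", python_callable=extract_features)
        train = PythonOperator(task_id="train_model", python_callable=train_model)
        register = PythonOperator(task_id="register_model", python_callable=register_model)

        # Run feature extraction, then training, then model registration.
        features >> train >> register

In practice each placeholder would call out to the relevant service (for example, the SageMaker SDK for training and MLflow for registration), keeping the DAG thin so the orchestration layer stays decoupled from the ML code.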

Other

  • Foster and contribute to a culture of operational excellence: high-performance, mission-focused, interdisciplinary collaboration, trust, and shared growth.
  • Drive proactive capability and process enhancements to ensure enduring value creation, compounding analytic value, and operational maturity of the ML/AI platform.
  • Excellent verbal and written communication skills — able to report findings, document designs, articulate trade-offs and influence cross-functional stakeholders.
  • Demonstrated ability to manage large-scale, complex projects across an organization, and lead development of major features with broad impact.
  • Customer-obsessed mindset and a passion for building products that solve real-world problems, combined with strong organization, diligence, and the ability to juggle multiple initiatives and deadlines.