Analog Devices (ADI) aims to deliver a world-class AI/ML developer experience for its software engineers and data scientists by establishing a global XOps team. This role designs and optimizes complete systems, resolves technical issues, and leads development of major ML/AI operational features that improve the developer experience across infrastructure, pipelines, deployment, monitoring, governance, and cost/risk optimization.
Requirements
- Expert in infrastructure-as-code and GitOps practices, with demonstrable skills in Terraform, AWS CDK (Python), Argo CD and/or other IaC and CI/CD systems.
- Hands-on experience managing Kubernetes clusters (for ML workloads) and designing/implementing ML workflow orchestration solutions and data pipelines (e.g., Argo, Kubeflow, Airflow).
- Solid understanding of foundation models (LLMs) and their applications in enterprise ML/AI solutions.
- Strong background in AWS DevOps practices and cloud architecture, including AWS services such as Bedrock, SageMaker, EC2, S3, RDS, Lambda, and managed MLflow. Hands-on design and implementation of microservices architectures, APIs, and database management (SQL/NoSQL).
- Proven track record of monitoring and optimizing cloud/ML infrastructure for scalability and cost-efficiency.
- Deep understanding of the Data Science Lifecycle (DSLC) and proven ability to shepherd data science or ML/AI projects from inception through production within a platform architecture.
- Expertise in feature stores, model registries, and model governance and compliance frameworks specific to ML/AI (e.g., explainability, audit trails).
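To illustrate the governance expectations above (model registries, audit trails), here is a minimal sketch of an append-only registry entry in plain Python. The class, field names, and promotion stages are hypothetical illustrations, not an ADI, MLflow, or SageMaker API:

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

@dataclass
class ModelRecord:
    """One registry entry; every state change is appended to an audit trail."""
    name: str
    version: str
    stage: str = "staging"  # illustrative lifecycle: staging -> production -> archived
    audit_trail: list = field(default_factory=list)

    def _log(self, actor: str, action: str) -> None:
        # Append-only audit entry: who did what, and when (UTC).
        self.audit_trail.append({
            "actor": actor,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def promote(self, actor: str, stage: str) -> None:
        self._log(actor, f"promote {self.stage} -> {stage}")
        self.stage = stage

    def fingerprint(self) -> str:
        # Stable hash of the record (minus the trail) for tamper checks.
        payload = {k: v for k, v in asdict(self).items() if k != "audit_trail"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

record = ModelRecord(name="demand-forecast", version="1.4.0")
record.promote(actor="alice", stage="production")
```

The design point this sketch makes is that governance artifacts (who promoted what, and when) should be recorded as immutable data alongside the model, so compliance reviews and rollbacks do not depend on tribal knowledge.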
Responsibilities
- Design and implement resilient cloud-based ML/AI operational capabilities that advance our system attributes: learnability, flexibility, extensibility, interoperability, and scalability.
- Architect and implement scalable AWS ML/AI cloud infrastructure to support end-to-end lifecycle of models, agents, and services.
- Establish governance frameworks for ML/AI infrastructure management (e.g., provisioning, monitoring, drift detection, lifecycle management) and ensure compliance with industry-standard processes.
- Define and ensure principled validation pathways (testing, QA, evaluation) for early-stage GenAI/LLM/Agent-based proofs of concept across the organization.
- Lead and provide guidance on Kubernetes (k8s) cluster management for ML workflows, including choosing/implementing workflow orchestration solutions (e.g., Argo vs Kubeflow) and data-pipeline creation, management, and governance using tools such as Airflow.
- Design and develop infrastructure-as-code (IaC) using AWS CDK (Python) and/or Terraform, combined with GitOps, to automate infrastructure deployment and management.
- Monitor, analyze, and optimize cloud infrastructure and ML/AI model workloads for scalability, cost-efficiency, reliability, and performance.
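As one concrete instance of the monitoring and drift-detection responsibilities above, here is a minimal sketch of a Population Stability Index (PSI) check comparing a training-time feature distribution against live traffic. The bin edges, sample data, and the common 0.2 alert threshold are illustrative assumptions, not a prescribed ADI implementation:

```python
import math
from bisect import bisect_right

def histogram(values, edges):
    """Fraction of values per bin defined by sorted interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    total = len(values)
    # Small floor avoids log(0) / division by zero for empty bins.
    return [max(c / total, 1e-6) for c in counts]

def psi(expected, observed, edges):
    """Population Stability Index; > 0.2 is a common 'significant drift' rule of thumb."""
    e, o = histogram(expected, edges), histogram(observed, edges)
    return sum((oi - ei) * math.log(oi / ei) for ei, oi in zip(e, o))

baseline = [0.1 * i for i in range(100)]        # training-time distribution
shifted = [0.1 * i + 4.0 for i in range(100)]   # live traffic, shifted upward
edges = [2.5, 5.0, 7.5]                         # illustrative interior bin edges

score = psi(baseline, shifted, edges)           # well above 0.2 for this shifted sample
```

In practice a check like this would run on a schedule (e.g., as an Airflow or Argo task) and page the owning team when the score crosses the agreed threshold, rather than being invoked by hand.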
Other
- Foster and contribute to a culture of operational excellence: high-performance, mission-focused, interdisciplinary collaboration, trust, and shared growth.
- Drive proactive capability and process enhancements that create enduring value, compounding analytic returns, and operational maturity for the ML/AI platform.
- Excellent verbal and written communication skills — able to report findings, document designs, articulate trade-offs and influence cross-functional stakeholders.
- Demonstrated ability to manage large-scale, complex projects across an organization, and lead development of major features with broad impact.
- Customer-obsessed mindset and a passion for building products that solve real-world problems, combined with strong organization, diligence, and the ability to juggle multiple initiatives and deadlines.