Job Board

Get Jobs Tailored to Your Resume

Filtr uses AI to scan 1000+ jobs and find postings that match your resume

Software Engineer - AI/ML Infra

GEICO

$90,000 - $185,000
Nov 14, 2025
Chevy Chase, MD, US

GEICO's AI/ML Infra team is seeking an exceptional Senior ML Platform Engineer to build and scale its machine learning infrastructure, with a focus on Large Language Models (LLMs) and AI applications.

Requirements

  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open-source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes, including custom operators, Helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
  • Hands-on experience with inference optimization using vLLM, TensorRT-LLM, Triton Inference Server, or similar
  • Advanced experience with Azure DevOps, GitHub Actions, Jenkins, or similar CI/CD platforms

Responsibilities

  • Design and implement scalable infrastructure for training, fine-tuning, and serving open-source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps

Other

  • Excellent verbal and written communication skills, with a proven ability to work both independently and in a team environment
  • Mentor junior engineers and data scientists on platform best practices, infrastructure design, and ML operations
  • Lead comprehensive code reviews focusing on scalability, reliability, security, and maintainability
  • Work closely with data scientists to understand requirements and optimize workflows for model development and deployment
  • At this time, GEICO will not sponsor a new applicant for employment authorization for this position.