ML Infrastructure Engineer

Gridmatic

Salary not specified
Oct 8, 2025
Cupertino, CA, US

Gridmatic Inc. is working to accelerate the decarbonization of the electricity system and is looking for an engineer to build and optimize the backbone of its ML platform.

Requirements

  • Solid expertise in machine learning, distributed systems, and GPU-based training.
  • Strong deep learning fundamentals in addition to strong software engineering skills.
  • Experienced in researching and implementing deep learning models.
  • Experienced in distributed training and inference of large models on GPU clusters, utilizing core libraries and frameworks such as PyTorch, PyTorch Lightning, and Ray.
  • Comfortable with large-scale data storage infrastructure and formats, e.g. Zarr, SQL, and feature stores.
  • End-to-end proficiency in building, maintaining, and debugging cluster infrastructure, utilizing Kubernetes and Terraform.
  • Expertise in identifying performance bottlenecks and designing and writing high-performance code for large-scale ML workloads.

Responsibilities

  • Own a significant piece of our ML platform while rapidly building and iterating scalable, robust distributed infrastructure for ML training, inference, and evaluation on large-scale time-series and weather datasets.
  • Optimize throughput and cost by supporting model training and deployment across multiple clusters and clouds.
  • Improve the efficiency of machine learning models and other workloads by optimizing latency, throughput, and memory consumption.
  • Push the boundaries of current hardware capabilities through techniques like GPU performance engineering.
  • Help define the long-term vision for Gridmatic’s ML platform.
  • Play a key role in mentoring junior engineers and interns, contributing to a collaborative, innovative, and growth-oriented team culture.

Other

  • 3+ years of experience and a commitment to technical excellence.
  • A self-starter with a strong sense of independence and ownership, and the capability to engineer large, robust systems from the initial design and conceptualization to productionization.
  • A mission-driven individual who is enthusiastic about working toward a renewable grid and diving into the intersection of ML and energy.
  • Curiosity and a willingness to learn are must-haves!
  • No prior energy experience required.