


Research Engineer - CUDA kernel engineering

Voltai

Salary not specified
Nov 6, 2025
Palo Alto, CA, USA

Voltai is developing world models and embodied agents that learn, evaluate, plan, experiment, and interact with the physical world, starting with understanding and building hardware, electronic systems, and semiconductors, where AI can design and create beyond human cognitive limits.

Requirements

  • Writing and optimizing CUDA kernels for large-scale AI workloads (attention, routing, graph-based operations, physics-inspired operators, etc.)
  • Profiling and optimizing GPU performance for custom compute or memory-bound workloads
  • Integrating custom kernels into cutting-edge training and inference frameworks (e.g., PyTorch, Megatron, vLLM, TorchTitan)
  • Working with the latest NVIDIA hardware and software stacks (Hopper, Blackwell, NVLink, NCCL, Triton)
  • Building GPU-accelerated primitives for graph reasoning, symbolic computation, or hardware simulation tasks
  • Collaborating with AI researchers and semiconductor experts to translate domain-specific workloads into high-performance GPU code

Responsibilities

  • Develop, integrate, and optimize state-of-the-art CUDA kernels to power AI models that accelerate semiconductor design and verification
  • Enable large-scale model training, inference, and reinforcement learning systems that reason about circuit layouts, generate and validate RTL, and optimize chip architectures — running efficiently across thousands of GPUs
  • Build tools, performance benchmarks, and integration layers that push the limits of GPU utilization for compute-intensive workloads in AI-driven hardware design
  • Work closely with researchers and engineers to help make Voltai the world’s leading AI + semiconductor research organization
  • Release kernels and tooling as contributions to the open-source AI and HPC ecosystems

Other

  • Being part of a team with diverse backgrounds, including former Stanford professors, SAIL researchers, Olympiad medalists, CTOs, and a former US Secretary of Defense
  • Being backed by Silicon Valley’s top investors, Stanford University, and the CEOs/Presidents of Google, AMD, Broadcom, Marvell, and others