Build the future of inference, GPU optimization, and AI infrastructure.
Requirements
- GPU Fundamentals: Deep understanding of GPU architectures, CUDA programming, and parallel computing patterns.
- Deep Learning Frameworks: Proficiency in PyTorch, TensorFlow, or JAX, particularly for GPU-accelerated workloads.
- LLM/AI Knowledge: Strong grounding in large language models (training, fine-tuning, prompting, evaluation).
- Systems Engineering: Proficiency in C++ and Python, and optionally Rust or Go, for building tooling around CUDA.
- Torch/PyTorch
- C++
- Python
Responsibilities
- Build scalable infrastructure for AI model training and inference.
- Lead technical decisions and architecture choices.
Other
- Internship
- Will sponsor