General Motors is seeking a Principal AI Engineer to lead the design and advancement of its AI platform, focusing on accelerating training throughput, scaling multi-modal models, and enabling the next generation of AI-driven driving systems.
Requirements
- Strong programming skills in Python, with proficiency in frameworks such as PyTorch (preferred) or TensorFlow.
- Experience with distributed systems, GPU computing, and cloud environments (AWS, GCP, or Azure).
- Deep expertise with PyTorch 2.x or later and distributed training frameworks.
- Strong skills in profiling, analyzing, debugging, and optimizing training performance (e.g., fusing operations, avoiding memory fragmentation).
- Proficiency in C++ for performance-critical components.
Responsibilities
- Architect, build, and optimize core AI/ML platform infrastructure to support massive-scale model training.
- Collaborate with data scientists, ML engineers, and software developers to enable seamless workflows from research to production.
- Drive efficiency in large-scale distributed training and data processing pipelines.
- Establish best practices for reliability, scalability, and performance across the AI/ML platform.
- Provide technical leadership and mentorship, guiding teams on platform design, architecture decisions, and emerging technologies.
- Partner with cross-functional stakeholders to align platform capabilities with business needs and strategic AI initiatives.
Other
- Bachelor’s degree or higher in Computer Science or a related field, or equivalent experience.
- 8+ years of professional software engineering experience.
- 4+ years of specialized experience in the AI/ML domain (e.g., enabling distributed training for large-scale models).
- Willingness to travel to Sunnyvale, CA as needed.
- If you live within a 50-mile radius of Atlanta, Austin, Detroit, Warren, Milford, or Mountain View, you are expected to report to that location at least three times a week.