Director, AI Model Deployment and Optimization

Lenovo

Salary not specified

Jan 1, 2026

NC, US

Lenovo seeks to drive the development, optimization, and large-scale deployment of cutting-edge AI capabilities across its devices and platforms, ensuring they run seamlessly across a range of computing environments and hardware architectures.

Requirements

Experience: 10+ years in production software development, including AI/ML engineering, with 5+ years in leadership roles. Proven track record in model deployment, optimization, and benchmarking at scale. Demonstrated ability to deliver production-grade AI models optimized for both on-device and cloud environments.
Optimization Techniques: Strong expertise in quantization, pruning, distillation, graph optimization (ONNX, TensorRT), mixed precision, and hardware-specific tuning (GPUs, TPUs, custom accelerators).
Inference Systems: Experience with low-latency serving, batching strategies, caching, and dynamic scaling across clusters.
Cloud & Edge Deployment: Deep knowledge of end-to-end deployment of ML/LLM models. Proven ability to deliver across cloud (AWS/GCP/Azure), hybrid, and edge environments.
Tooling & Frameworks: Familiarity with PyTorch, TensorFlow, JAX, ONNX Runtime, TensorRT, TVM, and model compilation stacks.
Data & Telemetry: Experience building feedback loops from runtime telemetry to guide retraining, routing, and optimization.
Excellent leadership, communication, and cross-functional collaboration skills.

Responsibilities

Lead and scale Lenovo’s AI model deployment and optimization strategy across devices, laptops, and cloud environments.
Adapt, fine-tune, and optimize open-source foundation models (e.g., from OpenAI or Google's Gemma family) for Lenovo’s product portfolio.
Drive initiatives in model compression, quantization, pruning, and distillation to achieve maximum efficiency on constrained devices while preserving model quality.
Oversee performance evaluation, benchmarking, and iterative improvement cycles for large language models, vision models, and multimodal AI.
Collaborate closely with hardware architecture teams to align AI model efficiency with device and accelerator capabilities.
Develop hardware-aware optimization algorithms and integrate them into model deployment pipelines.
Partner with global engineering, research, and product teams to bring optimized AI-powered features (e.g., “Catch Me Up”) to market.

Other

Establish and maintain reproducible workflows, automation pipelines, and release-readiness criteria for AI models.
Represent Lenovo in AI model optimization research communities, technical working groups, and industry consortiums.
Build, mentor, and inspire a high-performance applied AI engineering team.
Graduate degree (MS or PhD) in Computer Science, AI/ML, Computational Engineering, or related field.
Experience delivering AI features in consumer electronics or embedded platforms.