Lenovo is seeking a technical leader to head its AI Model Deployment & Optimization team and drive the development and large-scale deployment of cutting-edge AI capabilities across Lenovo devices and platforms. The role involves adapting, fine-tuning, and optimizing foundation models for performance, efficiency, and user impact across various computing environments and hardware architectures.
Requirements
- Strong expertise in quantization, pruning, distillation, graph optimization, mixed precision, and hardware-specific tuning (NPUs, GPUs, TPUs, custom accelerators).
- Familiarity with model inference frameworks such as ONNX Runtime, TensorRT, TVM, OpenVINO, RadeonML, QNN, and NeuroPilot.
- Experience building feedback loops from runtime telemetry to guide retraining, routing, and optimization.
- Experience delivering production-grade model inference solutions for both cloud and edge environments.
- Familiarity with edge device constraints related to Windows or Android, preferably both.
- Track record of collaboration with research institutions and contributions to open-source AI optimization libraries.
- Experience with security and compliance: ensuring secure deployments, model integrity verification, and adherence to privacy regulations.
Responsibilities
- Lead Lenovo’s AI model deployment and optimization across devices, laptops, and cloud environments.
- Adapt, fine-tune, and optimize open-source foundation models (e.g., from OpenAI, Google, Microsoft, Meta) for Lenovo’s product portfolio.
- Drive initiatives in model compression, quantization, pruning, and distillation to achieve maximum efficiency on constrained devices while preserving model quality.
- Collaborate closely with hardware architecture teams to align AI model efficiency with device and accelerator capabilities.
- Develop hardware-aware optimization algorithms and integrate them into model deployment pipelines.
- Use the latest industry AI frameworks and libraries to get the best inference performance out of each model and hardware target.
- Establish and maintain reproducible workflows, automation pipelines, and release-readiness criteria for AI models.
Other
- 10+ years in production software development, including AI/ML engineering, with 3+ years in leadership roles.
- Proven track record in model deployment and optimization at scale.
- Demonstrated ability to deliver production-grade AI models optimized for on-device and/or cloud environments.
- Graduate degree (MS or PhD) in Computer Science, AI/ML, Computational Engineering, or a related field.
- Excellent leadership, communication, and cross-functional collaboration skills.