The Cisco UCS Compute BU is looking to optimize and benchmark the performance of UCS products, including AI servers and X-Series and C-Series servers, with a focus on AI server performance, liquid cooling platform testing, and industry-standard AI benchmarking.
Requirements
- Experience evaluating the performance of AI platforms with profiling and benchmarking tools, including but not limited to analyzing computational efficiency, latency, throughput, and resource utilization.
- 2+ years of experience with multiple GPU technologies and architectures, such as NVIDIA HGX/DGX and AMD Instinct accelerators.
- Experience working with cooling technologies, particularly liquid cooling such as direct-to-chip and immersion cooling.
- Experience with cluster-scale testing, ideally on AI infrastructure, or with optimizing UCS compute and AI infrastructure to support high-performance AI/ML training and inference.
- Understanding of high-performance computing and AI workloads, and of other AI infrastructure components.
- Familiarity with deep learning frameworks (TensorFlow, PyTorch, etc.) and performance optimization.
- Machine Learning/AI Knowledge: Understanding of machine learning models, neural networks, and deep learning architectures.
Responsibilities
- Conduct performance tests on UCS AI platforms.
- Use industry-standard benchmarking tools (e.g., MLPerf, GenAI-Perf) to evaluate system performance.
- Perform in-depth analysis and evaluation of AI servers with liquid cooling technologies.
- Provide technical guidance to improve UCS performance specific to liquid cooling environments.
- Build performance benchmarks, analyze results, and develop technical marketing materials (white papers, standard process guides, presentations).
- Resolve performance issues related to Cisco UCS AI servers.
- Collaborate with engineering, product management, and sales teams to ensure industry-leading UCS performance on AI servers.
Other
- Bachelor’s or master’s degree with 7+ years’ experience in Hardware Engineering, Performance Engineering or similar engineering roles.
- Must be a self-starter and a teammate.
- Ability to set and meet timelines.
- Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams.
- Onsite role in San Jose.