SK hynix America seeks to develop and operate high-performance computing clusters that support AI/ML workloads, ensuring the scalability, performance, reliability, and cost-effectiveness of its AI data center IT environments.
Requirements
- 2+ years of experience in AI cluster engineering, MLOps, and benchmark testing, including GPU performance analysis, memory usage, and energy/power monitoring tools.
- Strong familiarity with AI computing architectures, AI/ML infrastructure requirements, memory architecture and usage in AI/ML, and AI algorithm trends and best practices.
- Expertise in optimizing resource utilization, improving system throughput, and reducing latency in both training and inference.
Responsibilities
- Design and implement distributed computing cluster infrastructure to support large-scale AI/ML model training and inference jobs with a focus on transformer-based AI models.
- Build and maintain distributed systems to ensure scalability, efficient resource allocation, and high throughput.
- Optimize cluster performance through hardware selection, equipment configuration, network engineering, and performance analysis.
- Deploy and operate data center networking infrastructure, using software systems for automation, design validation, deployment, and operational support.
- Implement tools and processes to maintain high uptime and ensure infrastructure reliability during both model training and inference phases.
- Identify and resolve performance bottlenecks, improving overall system throughput and response times.
- Collaborate with cross-functional teams, including research, security, and benchmark test engineering teams, to integrate infrastructure with AI workflows, ensuring seamless deployment and operation.
Other
- Master’s degree or above in Computer Science, Electrical Engineering, or a related field.
- Engage with technology vendors and partners to evaluate new solutions and drive innovation in AI computing infrastructure.
- Work Model: Onsite
- Office Location: San Jose, CA