AI/ML Computing Cluster Engineer

SK hynix America

$100,000 - $150,000
Nov 5, 2025
San Jose, CA, US

SK hynix America is seeking an engineer to develop and operate high-performance computing clusters that support AI/ML workloads, ensuring the scalability, performance, reliability, and cost-effectiveness of its AI data center IT environments.

Requirements

  • 2+ years of experience in AI cluster engineering, MLOps, and benchmark testing, including GPU performance analysis, memory usage analysis, and energy/power monitoring tools.
  • Strong familiarity with AI computing architecture, AI/ML infrastructure requirements, memory architecture and usage in AI/ML, and AI algorithm trends and best practices.
  • Expertise in optimizing resource utilization, improving system throughput, and reducing latency in both training and inference.

Responsibilities

  • Design and implement distributed computing cluster infrastructure to support large-scale AI/ML model training and inference jobs with a focus on transformer-based AI models.
  • Build and maintain distributed systems to ensure scalability, efficient resource allocation, and high throughput.
  • Optimize cluster performance through hardware selection, equipment configuration, network engineering, and performance analysis.
  • Deploy and operate data center networking infrastructure using software systems for automation, design validation, deployment, and operational support.
  • Implement tools and processes to maintain high uptime and ensure infrastructure reliability during both model training and inference phases.
  • Identify and resolve performance bottlenecks, improving overall system throughput and response times.
  • Collaborate with cross-functional teams, including research, security, and benchmark test engineering teams, to integrate infrastructure with AI workflows, ensuring seamless deployment and operation.

Other

  • Master’s degree or higher in Computer Science, Electrical Engineering, or a related field.
  • Engage with technology vendors and partners to evaluate new solutions to drive innovation in AI computing infrastructure.
  • Work Model: Onsite
  • Office Location: San Jose, CA