AI & HPC Infrastructure Engineer

Accenture

$68,300 - $218,800

Oct 17, 2025

Overland Park, KS, US

The Global Infrastructure Engineering AI & HPC team at Accenture is working to enable infrastructure reinvention for the next era of digital solutions powered by AI and High-Performance Computing (HPC) for its strategic and mission-critical clients.

Requirements

Minimum of 4 years of hands-on experience designing, deploying, and managing HPC and AI infrastructure across on-premises, cloud, and hybrid environments
Minimum of 4 years of experience with accelerated computing architectures (GPUs, XPUs, DPUs), high-performance fabrics (InfiniBand, Ethernet), SONiC, networking, and modern storage/data platforms
Minimum of 4 years of experience with cluster management and orchestration (e.g., Slurm, Run:ai, Kubernetes, Docker), real-time performance monitoring, and observability frameworks
Minimum of 4 years of experience with cloud and virtualization platforms (e.g., AWS, Azure, GCP, VMware, Nutanix) and expertise in automation and optimization using scripting (Python, AI tools) with foundational Infrastructure-as-Code tools such as Terraform and Ansible
Minimum of 4 years of experience implementing MLOps and DevSecOps frameworks to enable secure, automated, and reproducible workflows
Experience managing the deployment of 1,000+ GPU clusters for HPC and AI workloads with various infrastructure services enabled
Experience with GPU computing libraries and accelerators (e.g., NVIDIA CUDA, Dynamo, AMD ROCm)

Responsibilities

Design and implement HPC and AI infrastructure solutions, aligning system architecture and deployment roadmaps to industry-specific performance and scalability needs
Deploy, configure, and manage XPU-based clusters (CPU/GPU/accelerators) using schedulers, VM/K8s orchestration platforms, Slurm, and containerized platforms in scalable designs to provide Metal as a Service (MaaS), GPUaaS, AIaaS, and other offerings
Optimize cluster performance, scalability, energy, and cost efficiency across on-premises, cloud, and hybrid environments
Integrate AI and HPC platforms with existing IT systems, data pipelines, and security frameworks
Monitor, troubleshoot, and tune infrastructure to ensure high availability, low-latency networking, and workload resiliency
Develop and maintain documentation including architecture diagrams, configuration baselines, and operational runbooks
Provide technical guidance and support to users, enabling efficient execution of HPC/AI workloads, large-scale models, and simulations

Other

Travel may be required for this role, with the amount of travel varying from 25% to 100% depending on business need and client requirements
Bachelor's degree or equivalent work experience (minimum 12 years)
Applicants for employment in the US must have work authorization that does not now or in the future require sponsorship of a visa for employment authorization in the United States
Candidates who are currently employed by a client of Accenture or an affiliated Accenture business may not be eligible for consideration
Job candidates will not be obligated to disclose sealed or expunged records of conviction or arrest as part of the hiring process