Quantiphi is seeking an Infrastructure Architect to design, implement, and manage robust hybrid environments capable of supporting high-compute AI and GenAI workloads for a key enterprise client. The role aims to bridge infrastructure, DevOps, and AI solution delivery to provide the necessary foundational stack for scaling advanced AI workloads.
Requirements
- Proven track record in designing and deploying AI/ML or GenAI-supporting infrastructure (e.g., GPU clusters, Kubernetes for ML workloads, hybrid vector databases).
- Deep knowledge of cloud services (GCP preferred; AWS or Azure acceptable), on-prem virtualization, storage, networking, and container orchestration.
- Experience supporting multi-agent GenAI frameworks, including task orchestration, distributed agents, and workflow automation.
- Hands-on experience with DevOps and IaC tools (Terraform, Helm, Ansible, CI/CD).
- Deep hands-on expertise in architecting and managing solutions on Google Cloud Platform, including VPC design, subnetting, firewall rules, Private Service Connect, and Cloud Interconnect for secure hybrid networking.
- Strong understanding of compute services tailored to GenAI: Compute Engine for custom VM/GPU provisioning (A100, H100, T4) and Google Kubernetes Engine (GKE) for containerized model deployments, including GPU workloads and node auto-provisioning.
- Experience deploying and optimizing infrastructure for LLM hosting using Triton Inference Server, vLLM, or Text Generation Inference on GKE or Compute Engine.
Responsibilities
- Architect and implement secure, scalable, and cost-effective infrastructure solutions across on-prem and cloud (GCP, AWS, Azure) environments.
- Evaluate existing systems and define migration or integration strategies for deploying AI/GenAI workloads in hybrid setups.
- Design infrastructure supporting GPU-intensive workloads, distributed training, inferencing, and vector database storage.
- Manage provisioning, automation, and orchestration across virtual machines, containers, and Kubernetes clusters.
- Implement and monitor high-availability, low-latency, and disaster recovery strategies.
- Optimize infrastructure for latency-sensitive applications, including real-time GenAI agentic workflows.
- Work closely with AI/ML engineers, data scientists, solution architects, and DevOps teams to ensure smooth deployment and scaling of models and GenAI agents.
Other
- Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field.
- 8–15 years of experience in enterprise infrastructure architecture, with significant experience in both on-prem and cloud-native environments.
- Strong problem-solving and debugging skills.
- Ability to communicate technical concepts clearly to non-technical stakeholders.
- Collaborative mindset with the ability to work cross-functionally across AI, DevOps, and business teams.