DDN's A3I solutions are transforming the landscape of AI infrastructure. DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. This role is the final escalation point for the most complex and critical issues affecting enterprise and hyperscale environments, aiming to reduce time-to-resolution, streamline diagnostics, and elevate the support experience for strategic customers.
Requirements
- Deep understanding of file systems (POSIX, NFS, S3), storage performance, and Linux kernel internals.
- Proven debugging skills at system/protocol/app levels (e.g., strace, tcpdump, perf).
- Hands-on experience with AI/ML data pipelines, container orchestration (Kubernetes), and GPU-based architectures.
- Expert-level knowledge of TCP/IP and networking.
- Exposure to RDMA, NVMe-oF, or high-performance networking stacks.
- Experience using AI tools (e.g., log pattern analysis, LLM-based summarization, automated RCA tooling) to accelerate diagnostics and reduce MTTR.
- Expert scripting/coding ability in Python, Bash, or Go.
Responsibilities
- Serve as technical leader of the In-Market Engineering team, driving its technical decisions.
- Build and implement a tools platform strategy for Infinia.
- Build a reporting platform for Infinia using dial-home data.
- Deliver a log analytics strategy and design framework for Infinia.
- Own critical customer case escalations end-to-end, including deep root cause analysis and mitigation strategies.
- Utilize AI-powered debugging, log analysis, and system pattern recognition tools to accelerate resolution.
- Feed real-world support insights back into the development cycle to improve reliability and diagnostics.
Other
- 12+ years of experience in enterprise storage, distributed systems, or cloud infrastructure.
- Exceptional communication and executive reporting skills.
- This position requires participation in an on-call rotation to provide after-hours support as needed.
- Strong prioritization skills are essential.
- We value strong communication skills in all our engineers and researchers.