OCI is looking to solve the problem of scaling and optimizing AI infrastructure for customers' AI workloads by creating a GPU-focused cloud and integrating with frontier model providers such as OpenAI, xAI and Gemini.
Requirements
- Proficiency in one or more programming languages, including but not limited to C, C++, C#, Java, Go, and Rust
- Experience designing and developing large-scale distributed systems, services, and infrastructure
- Experience in managing cloud infrastructure with hundreds of thousands of servers
- Experience with containerization and orchestration technologies such as Docker and Kubernetes
Responsibilities
- Own and build solutions to scale and optimize partner model integration, with the goal of improving customer experience and customer workload performance.
- Set and communicate individual expectations and team goals so that they align with the broader organization's goals.
- Model and coach team members, and drive modern software engineering practices such as data- and telemetry-driven decision making, well-defined interfaces across components, design reviews, coding standards, code reviews, and comprehensive coverage through unit tests, integration tests, and active production monitoring.
- Prioritize the team’s work with a focus on customer issues and requirements.
- Ensure that team solutions are well-defined, modular, secure, reliable, diagnosable, actively monitored, compliant, and reusable.
- Create a roadmap, define SMART goals, and track team progress against committed OKRs.
Other
- BS (or equivalent experience) in Computer Science, Engineering, or related field
- 6 years of experience in software development
- 2 years of experience in a people management or leadership role while working on cross-functional projects
- Strong communication, collaboration, and project management skills
- Ability to adapt to a fast-paced, dynamic environment and manage multiple tasks and priorities effectively