OCI AI Infrastructure is building cutting-edge, ultra-high-performance data centers based on GPU clusters and designed to support AI/ML/HPC workloads. The OCI Infrastructure Delivery Engineering team will be responsible for building software services and tools that accelerate the process by which OCI grows its compute and network capacity, addressing all aspects of the Data Center and its Physical Assets lifecycle to meet high capacity demands from large and hyperscale customers.
Requirements
- Strong experience in distributed systems with a keen eye for reliability, scalability, and performance
- Experience developing and operating high-scale services
- Understanding of how to make cloud-scale services resilient
- Proficiency in high-level programming languages such as Java or Python
- Ability to deliver products from the ground up, through the full product life cycle
- Experienced with microservice design patterns and service-to-service communication protocols, along with developing highly reliable services
- Prior hands-on experience developing cloud services and tools on Oracle Cloud/AWS/GCP/Azure
Responsibilities
- Assist in defining and developing software for tasks associated with developing, debugging, or designing software applications
- Provide technical leadership to other software developers
- Specify, design and implement modest changes to existing software architecture to meet changing needs
- Build software services and tools to accelerate the process by which OCI grows its compute and network capacity
- Address all aspects of the Data Center and its Physical Assets lifecycle, from planning and design to delivery and, eventually, decommissioning
- Develop and operate high-scale services
- Instill a culture of proactivity within your team
Other
- 5+ years of experience
- Work independently
- Provide technical leadership to the organization
- Recommend and justify major changes to existing products, establishing consensus through data-driven approaches
- Balance speed and quality through iterative and incremental improvements