Oracle's Cloud Infrastructure (OCI) needs to automate the full server lifecycle at hyperscale, from hardware integration through customer-ready instance provisioning and firmware management. The role involves architecting and delivering highly available services and automation pipelines that manage server provisioning, enable firmware pinning, and deliver fleet-wide firmware updates and telemetry-based observability. It also involves supporting new silicon and evolving OCI's infrastructure into next-gen clusters and composable hardware environments.
Requirements
- Rock-solid coder and distributed systems generalist, able to dive deep into any part of the stack, including low-level systems, and to design broad distributed system interactions.
- 6-10 years of software development experience with distributed systems within large-scale environments.
- Proficiency in Java, C, or C++, and experience with a scripting language such as Python or Perl.
- Experience working on large-scale, highly distributed services infrastructure.
- Deep understanding of operating systems, hardware-software integration, distributed services, and cloud-scale automation.
- Experience designing architectures that demonstrate technical depth in one area, or span many products, to enable high availability, scalability, market-leading features, and the flexibility to meet future business demands.
Responsibilities
- Own the software design and development for major components of Oracle's Cloud Infrastructure.
- Architect and deliver highly available services and automation pipelines that manage server provisioning at hyperscale, enable firmware pinning for deterministic customer environments, and deliver fleet-wide firmware updates and telemetry-based observability.
- Drive solutions to support new silicon (e.g., NVIDIA, AMD, Intel platforms), SmartNIC/HostNIC convergence, RoT security integration, and the evolution of OCI’s infrastructure into next-gen clusters and composable hardware environments.
- Design, develop, troubleshoot, and debug software for databases, applications, tools, and networks.
- Build microservices and tooling that provision, configure, secure, and validate server platforms across OCI’s global fleet.
- Interface directly with components like BMCs, NICs, SmartNICs, ILOMs, GPUs, and custom firmware stacks.
- Design architectures that demonstrate technical depth in one area, or span many products, to enable high availability, scalability, market-leading features, and the flexibility to meet future business demands.
Other
- Value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.
- Systematic problem-solving approach, strong communication skills, a sense of ownership, and drive.
- Able to effectively communicate technical ideas verbally and in writing (technical proposals, design specs, architecture diagrams and presentations).
- Partner closely with teams across Compute, Networking, Security, Datacenter Engineering, and Hardware Development to ensure OCI can launch, scale, and maintain new server platforms with minimal operational overhead and high reliability.