The company is seeking to improve the operational efficiency and effectiveness of its enterprise engineering cloud services, and thereby customer experience and service resiliency, by building AI Platform and AI Ops capabilities.
Requirements
- Strong background in software development
- Experience in MLOps, DevOps, and ML platforms
- Knowledge of cloud infrastructure and networking
- Experience developing and operating high-scale services
- Understanding of how to make cloud-scale services resilient
- Experience with iterative and incremental improvements
- Familiarity with internal tooling at OCI
Responsibilities
- Build cloud services on top of modern Infrastructure as a Service (IaaS) building blocks at OCI
- Design and build distributed, scalable, fault-tolerant software systems
- Participate in the entire software lifecycle – development, testing, CI and production operations
- Lead software projects with minimal guidance, and mentor and coach junior engineers
- Design software architecture for mission-critical components and secure buy-in from stakeholders, including senior team members, software architects in the organization, and management
- Balance product feature development with production operational concerns such as writing runbooks, ops automation, structured logging, and instrumentation for metrics and events
- Use OCI's extensive internal tooling to develop, build, deploy, and troubleshoot software
Other
- Collaborate with cross-functional teams
- Provide technical leadership to the broader organization
- Work seamlessly in a collaborative, agile environment
- Be enthusiastic and passionate, with a willingness to learn new technologies
- Participate in the team's on-call rotation for the service