Oracle Cloud Infrastructure (OCI) is looking to shape the future of OCI Networking by using AI to automate, optimize, and secure networks.
Requirements
- Strong proficiency in Python and ML frameworks (PyTorch, TensorFlow)
- Experience with LLMs, embeddings, vector search, RAG pipelines, and fine-tuning
- Data engineering: Spark, Kafka, Flink, OCI Streaming/Data Flow
- Distributed systems and large-scale training/inference
- Containerization, model serving, GPU workflows, CI/CD, and MLOps tools
Responsibilities
- Design and implement scalable orchestration for serving and training AI/ML models
- Explore and incorporate contemporary research on AI, agents, and inference systems into the software stack for designing, monitoring, troubleshooting, and deploying networks
- Evaluate, integrate, and optimize technologies across the stack for latency, throughput, and resource utilization in training and inference workloads
- Lead initiatives in AI systems design, including Retrieval-Augmented Generation (RAG) and LLM fine-tuning
- Design and develop scalable services and tools in Python/Go to support GPU-accelerated AI pipelines and observability frameworks
Other
- BSEE, BSCS, BSCE, or equivalent
- At least 3-5 years of experience building software systems, including building AI applications and training models
- Strong problem-solving skills, attention to detail, and excellent communication skills
- US hiring range (USD): $79,200 - $178,100 per year
- Comprehensive benefits package including medical, dental, and vision insurance, 401(k) Savings and Investment Plan, and paid time off