Ensuring the reliability and robustness of Rubrik's products by focusing on stress, scale, longevity, and resilience.
Requirements
- Strong knowledge of data structures, algorithms, and software design
- Solid programming skills in one or more programming languages (Python preferred)
- Building AI-based applications/workflows using LLMs
- Working knowledge of virtualization, container technologies, storage, databases, and networking
- Experience with Google Cloud Platform/AWS/Azure or other public cloud technologies
- Building high-scale, performant products
- Knowledge of CI/CD and related tooling such as Jenkins, Ansible, and ELK
Responsibilities
- Drive release reliability certification for Rubrik Cloud Data Management and Rubrik Security Cloud-Private products.
- Architect and build scalable infrastructure and efficient pipelines for stress, scale and resilience/chaos testing.
- Develop and deploy simulators to optimize cost efficiency and accelerate testing.
- Enable Product teams with self-service tools and infrastructure for their validation needs.
- Maintain and evolve long-running, customer-like environments to proactively identify potential issues.
- Design and build infrastructure automation for on-demand creation of complex, customer-like product deployments and system stress/performance pipelines.
- Develop and enhance tools for monitoring, alerting and telemetry of customer-like deployments.
Other
- BS or MS in Computer Science or a related field, with a minimum of 2 years of relevant work experience
- Ability to work collaboratively in a team environment and to get up to speed quickly with new technologies
- US pay range: $152,400 to $228,700 USD