Robinhood is looking to solve the world's biggest financial problems by building an elite team applying frontier technologies to democratize finance for all. The Reliability Engineering team specifically aims to ensure the reliability, scalability, performance, and security of systems powering millions of users.
Requirements
- 8+ years of experience designing, building, and maintaining large-scale distributed systems
- Proficiency in a programming language such as Python, Go, or C++
- Expertise in operating systems (Linux/Unix), networking, and troubleshooting sophisticated production issues in high-availability environments
- Experience building and owning pre-production and staging environments for internal software engineers
- Experience running workloads on Kubernetes, such as Elastic Kubernetes Service (EKS) on AWS or a managed equivalent on another cloud provider
- Experience working with observability systems, with the goal of reducing incident metrics such as Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR)
- Experience working with large infrastructure components such as compute, storage, networking, and/or developer infrastructure
Responsibilities
- Design, build, and maintain large-scale systems that power Robinhood’s platform, infrastructure, and core services
- Write and review high-quality code, create capacity and scaling plans, and debug complex, real-time issues in mission-critical systems used by millions of customers.
- Take ownership of system reliability by participating in on-call rotations, proactively addressing potential issues, and driving long-term improvements to reduce downtime.
- Collaborate with industry-leading engineers to develop scalable tools and infrastructure that meet Robinhood’s growing demands.
- Drive innovation by optimizing infrastructure for reliability and cost-efficiency, supporting Robinhood’s mission to democratize finance for all at a global scale.
- Support applications including brokerage, crypto, and money
- Define and uphold Service Level Agreements (SLAs) and Service Level Objectives (SLOs)
Other
- Lead by example, mentoring teammates, promoting best practices, and fostering a culture focused on operational excellence and system resilience.
- A track record of mentoring team members, fostering collaboration, and contributing to a culture of continuous improvement.
- This role is based in our Menlo Park office, with in-person attendance expected at least 3 days per week.
- If our mission energizes you and you’re ready to build the future of finance, we look forward to seeing your application.