Dropbox's large-scale storage systems need to be designed, built, and operated to provide high durability and scalability for millions of users across all of Dropbox's products. The Storage team owns the distributed storage infrastructure at the heart of Dropbox: the systems responsible for storing exabytes of user data across multiple data centers worldwide.
Requirements
- 8+ years of experience with, and a strong understanding of, distributed systems principles, including replication, consistency, and fault tolerance.
- Experience developing and debugging production services in C++, Go, or Rust.
- Familiarity with distributed storage systems, file systems, or data infrastructure at scale.
- Demonstrated ability to write efficient, reliable, and maintainable code in mission-critical environments.
- Experience troubleshooting complex systems and participating in on-call or operational rotations.
- Experience building and operating large-scale object storage or distributed storage systems (e.g. S3, Ceph, GFS/Colossus).
- Familiarity with replication protocols, erasure coding, and data placement algorithms.
Responsibilities
- Design, implement, and maintain large-scale distributed storage systems that ensure data durability, availability, and performance.
- Collaborate with peers to evolve the architecture of Dropbox’s core storage infrastructure for improved scalability and efficiency.
- Contribute to the design of replication, erasure coding, and lifecycle management systems that balance cost, reliability, and performance.
- Write high-quality, performant, and maintainable code in Go and Rust.
- Participate in the on-call rotation, gaining firsthand experience operating Dropbox’s production storage systems.
- Investigate and resolve complex production issues, performing root cause analysis and driving continuous reliability improvements.
- Partner with cross-functional teams (Networking, Hardware, Capacity Planning) to deliver reliable, cost-efficient, end-to-end storage solutions.
Other
- Take ownership of scoped projects and demonstrate growth toward leading larger, cross-team technical initiatives.
- Solid communication and collaboration skills, with the ability to work across infrastructure and product teams.
- Eagerness to learn, grow, and contribute to multi-year infrastructure evolution initiatives.
- Many teams at Dropbox run services with on-call rotations, which entails being available for calls during both core and non-core business hours.
- If a team has an on-call rotation, all engineers on the team are expected to participate in it as part of their employment.