Senior Software Engineer, Reliability

Salary not specified

Aug 14, 2025

Redwood City, CA, US

Box needs to enhance the availability, reliability, and resilience of its systems to drive customer experience and operational excellence.

Experience coding in higher-level languages (e.g., Java, Scala, Go, Python)
Experience designing complex systems and frameworks using proven system design principles, such as NALSD (Non-Abstract Large System Design) methodologies
Experience troubleshooting issues across distributed Linux environments, with comfort tracing problems across applications, systems, and networks
Proficiency with modern cloud technologies such as GCP, AWS, and Kubernetes
Experience with service observability practices and tools (e.g., Prometheus, OpenTelemetry, SignalFx, or similar)
Comfortable learning new software, frameworks, and APIs quickly and effectively
Familiarity with PHP/JavaScript/NodeJS (bonus)

Build software, frameworks, and tools required for reliable operations of Box's services across multiple cloud environments
Manage the stability and operation of several of Box's most critical production applications through application reviews, capacity planning, and performance tuning
Develop automation, frameworks, and tools that improve platform reliability, resilience, and availability
Participate in product design reviews and architectural discussions to ensure reliability is considered early in the development lifecycle of products and services
Participate in a team on-call rotation
Improve our observability, both as a developer and maintainer of systems and frameworks and as a mentor to our product development teams
Work with modern cloud-native technologies including container orchestration (Kubernetes, Docker), service mesh solutions (Istio, Linkerd), and cloud platforms (AWS, GCP)

5+ years of working experience designing, developing, and operating large-scale, customer-facing products or services
A strong interest in solving challenging problems using innovative and data-driven approaches
An SRE-centric mindset — you build and manage systems with reliability, scalability, availability, and security as core principles
Natural collaborator who inspires others, mentors junior engineers, and drives technical excellence
Work from your assigned office at least 2 days per week, with a focus on Tuesdays and Thursdays