Senior Software Engineer, Reliability

Box

Salary not specified
Aug 14, 2025
Redwood City, CA, US

Box needs to enhance the availability, reliability, and resilience of its systems to drive customer experience and operational excellence.

Requirements

  • Experience coding in higher-level languages (e.g., Java, Scala, Go, Python)
  • Experience designing complex systems and frameworks using proven system design principles, such as NALSD (Non-Abstract Large System Design) methodologies
  • Experience troubleshooting issues across distributed Linux environments, with comfort tracing problems across applications, systems, and networks
  • Proficient with modern cloud technologies such as GCP, AWS, and Kubernetes
  • Experienced in service observability practices and tools (e.g., Prometheus, OpenTelemetry, SignalFx, or similar)
  • Comfortable learning new software, frameworks, and APIs quickly and effectively
  • Familiarity with PHP/JavaScript/NodeJS (bonus)

Responsibilities

  • Build software, frameworks, and tools required for reliable operations of Box's services across multiple cloud environments
  • Manage the stability and operation of several of Box's most critical production applications through application reviews, capacity planning, and performance tuning
  • Develop automation, frameworks, and tools that improve platform reliability, resilience, and availability
  • Participate in product design reviews and architectural discussions to ensure reliability is considered early in the development lifecycle of products and services
  • Participate in a team on-call rotation
  • Improve our observability, both as a developer and maintainer of systems and frameworks and as a mentor to our product development teams
  • Work with modern cloud-native technologies including container orchestration (Kubernetes, Docker), service mesh solutions (Istio, Linkerd), and cloud platforms (AWS, GCP)

Other

  • 5+ years of working experience designing, developing, and operating large-scale, customer-facing products or services
  • A strong interest in solving challenging problems using innovative and data-driven approaches
  • An SRE-centric mindset — you build and manage systems with reliability, scalability, availability, and security as core principles
  • Natural collaborator who inspires others, mentors junior engineers, and drives technical excellence
  • Work from your assigned office a minimum of 2 days per week, with a focus on Tuesdays and Thursdays