Braze needs to build, maintain, and evolve its internal Infrastructure as a Service (IaaS) platform to support rapid global growth and empower other engineering teams to build and deploy more quickly, while ensuring high reliability and meeting enterprise-grade SLAs.
Requirements
- 5+ years of full-stack development experience
- Experience working on large-scale, API-driven systems
- Experience with application and systems observability
- Experience building and automating Kubernetes-based operators and custom resources
Responsibilities
- Partner with Braze’s engineering teams to define and implement IaaS products that help them build and deploy more quickly
- Make monitoring and alerting fire on symptoms rather than on outages
- Ensure that Braze meets our strict enterprise-grade SLAs with customers
- Build Braze’s internal Infrastructure as a Service (IaaS) platform: develop, implement, and maintain the software services that provide custom infrastructure services
- Provide centralized, common tooling, services, and automation frameworks that are critical for scaling operations and capacity management
- Reduce operational pain and improve the day-to-day workflow of Braze’s engineering teams by building automation into our IaaS platform
- Be on a PagerDuty rotation to respond to availability incidents and provide support for other engineers
Other
- You think about systems - interfaces, boundaries, edge cases, failure modes, behaviors, and specific implementations
- You have an urge to collaborate, document, and deliver quickly
- Collaborate across global remote teams, often working asynchronously
- Document everything so you don't need to learn the same thing (or plan the same work) twice
- Deliver fast to delight our customers, even internal ones