Metronome is the leading usage-based billing platform built for modern software companies. Our platform computes millions of invoices per billing period and is scaling rapidly to accommodate new customers. It saves them hours of development time and manual invoicing, and lets them use consumption data to better serve their own customers.
Requirements
- Hands-on experience with distributed systems, cloud infrastructure, container orchestration, data pipelines, observability, CI/CD, or other foundational platforms.
- Track record of operating mission-critical infrastructure with a strong focus on reliability, scalability, and performance.
- Cloud Infrastructure (AWS: EKS, ECS, S3, RDS, Lambda, and more)
- Container Orchestration (Kubernetes, Docker)
- Infrastructure as Code (Terraform)
- Streaming & Batch Processing (Kafka, Kafka Streams, Spark)
- Languages (Python, TypeScript, Java, Go)
Responsibilities
- Design and operate foundational infrastructure (Kubernetes clusters, Kafka streaming platforms, Spark batch processing, observability systems) that handles billions of events and enables Metronome to grow with minimal friction.
- Create golden paths, abstractions, and tooling that let engineers ship faster and more reliably without becoming infrastructure experts themselves.
- Take accountability for system uptime, performance, and correctness. Build monitoring, alerting, and incident response systems that enable the entire team to catch problems before customers notice.
- Shape Metronome's infrastructure strategy, make platform-level architectural decisions, and mentor engineers across the organization.
- Build platforms that scale
- Enable product velocity
- Treat reliability as part of the product
Other
- 5+ years building infrastructure systems
- Ownership of production systems
- Force multiplier mindset
- Cross-functional collaboration: you partner effectively with product teams, communicate technical decisions clearly, and mentor engineers across experience levels.