Senior Staff Software Engineer - Site Reliability Engineering

Ridgeline

$200,000 - $250,000

Sep 18, 2025

San Ramon, CA, US

Ridgeline aims to scale reliability across its cloud-native platform by improving systems such as Health Manager, Incident Command, and its observability infrastructure, while also driving FinOps tooling and AI-assisted automation to reduce operational burden and surface critical insights.

Requirements

10+ years in a software engineering position or similar function, with experience operating large-scale, mission-critical systems
Proficiency in one or more of: Kotlin, Java, JavaScript, Python
Experience with observability platforms (e.g., Datadog, Prometheus) and monitoring best practices
Strong familiarity with infrastructure-as-code tools (e.g., Terraform, CDKTF) and CI/CD systems
Experience leading or participating in incident response and service ownership
Experience deploying, monitoring, and maintaining multi-tenant architectures
Familiarity with AI-assisted tooling or workflows is a plus, but not required

Responsibilities

Build and evolve systems like Health Manager, Incident Command, and observability platforms that support zero-downtime deployments and operational readiness
Partner with development and infrastructure teams to embed reliability into services and processes
Participate in the SRE on-call rotation and lead incident response as needed
Design metrics, tooling, and workflows that enable zero-downtime deployments, fast detection, and proactive issue resolution
Develop and maintain FinOps tooling to drive cost visibility, usage transparency, and financially informed engineering decisions
Lead incident triage and retrospectives with a blameless, data-driven approach
Define observability signals that make system health visible, actionable, and reliable

Other

You must be authorized to work in the United States without the need for employer sponsorship.
Foster an outcomes-focused team culture through honest communication, clarity, and accountability
Think creatively, own problems, seek solutions, and communicate clearly along the way
Contribute to a collaborative environment rooted in learning, teaching, and transparency
Ability to work effectively across teams and communicate technical concepts with clarity
Strong written and verbal communication skills, especially in facilitating incident response and working sessions with service teams
Comfortable navigating ambiguity and working toward measurable outcomes
Proven ability to balance individual contribution with cross-functional impact
Experience or interest in FinOps, cost-aware system design, or cloud usage optimization is a plus
Willingness to learn about cutting-edge technologies while cultivating expertise in a business domain or problem space
An aptitude for problem solving
Ability to communicate effectively
A systems thinker who brings clarity and direction to complex, ambiguous environments
A strong communicator who can model transparency, collaboration, and constructive disagreement
An engineer who delivers—not just ideas, but real improvements that teams rely on
Passionate about outcomes, not just effort—you prioritize what matters and follow through
Committed to enabling others by reducing friction, building shared tooling, and simplifying operations
Comfortable offering candid feedback and engaging in disagreement with respect and clarity—then committing fully once a decision is made, aligning with the team to drive results
And finally—you have a serious interest in having fun at work