Site Reliability Engineer Pipeline - Tech - Production Services

The Bank of New York Mellon

$127,000 - $195,000
Oct 3, 2025
New York, NY, USA

BNY is hiring a Site Reliability Engineer to improve the reliability and performance of its Wealth Services Platform by automating infrastructure and leading incident management.

Requirements

  • Strong expertise in cloud infrastructure (Azure, AWS, or GCP), containerization (Docker, Kubernetes), and Infrastructure as Code (Terraform, Helm).
  • Proficiency with observability and monitoring tools such as Prometheus, Grafana, AppDynamics, Datadog, and Splunk, plus experience with incident response and on-call support.
  • Solid programming and scripting skills in languages like Python, Go, or Java, with a focus on automation, tooling, and system integration.
  • Deep understanding of SRE principles, including SLAs, SLOs, error budgets, postmortems, and reliability-focused system design.

Responsibilities

  • Drive reliability and performance by defining SLOs/SLIs, improving observability, and proactively identifying and addressing system bottlenecks across cloud environments.
  • Automate infrastructure and operations using Terraform, Kubernetes, and CI/CD tools to eliminate toil and enable scalable, fault-tolerant deployments.
  • Collaborate cross-functionally with product, infrastructure, and DevOps teams to reduce incidents, build resilient services, and ensure architectural clarity.
  • Lead incident management by participating in on-call rotations, conducting postmortems, and implementing automated recovery to minimize downtime.
  • Build and maintain monitoring systems with tools like Prometheus, Grafana, AppDynamics, and Splunk to support real-time alerting and root cause analysis.
  • Develop platform tooling and pipelines for container orchestration, third-party integrations, and cloud-native operations to improve system efficiency and reliability.
  • Mentor engineers and champion SRE best practices, embedding a reliability-first culture and ensuring technical excellence across engineering teams.

Other

  • Strong collaboration and communication skills, with experience working in Agile environments and partnering with cross-functional engineering, product, and operations teams.