Mastercard is seeking a Senior BizOps Engineer to ensure the reliability, scalability, and performance of their applications, supporting essential services that power Mastercard's global operations.
Requirements
- Ability to read, write, and understand code in at least one programming language.
- Strong understanding of DevOps principles and practices, along with configuration management.
- Experience designing for operability and resilience, and building and operating large-scale distributed systems.
- Appetite for change and pushing the boundaries of what can be done with automation.
- Experience with algorithms, data structures, scripting, pipeline management, and software design.
- Systematic, analytical approach to problem solving, coupled with strong communication skills and a sense of ownership and drive.
- Interest in designing, analyzing, and troubleshooting large-scale distributed systems.
Responsibilities
- Serve as the primary contact responsible for overall application health, performance, and capacity.
- Support services before they go live through activities such as system design consulting, capacity planning, and launch reviews.
- Partner with the development and product team of a new application to establish the right monitoring and alerting strategy and create the framework to achieve zero downtime during deployment.
- Perform operability and resilience design, and implement and maintain highly reliable and scalable infrastructure.
- Perform root cause analysis of incidents and collaborate with development teams to resolve issues.
- Stay up to date with the latest technologies and trends in SRE and cloud computing.
- Take complete end-to-end run ownership of the product.
- Practice sustainable incident response and blameless post-mortems while taking a holistic approach to problem solving and optimizing time to recover.
Other
- BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience.
- Strong leadership and mentoring skills.
- Ability to balance doing things right with fixing things quickly.
- Flexible and pragmatic, while working towards improving the long-term health of the system.
- Comfortable collaborating with cross-functional teams to ensure that expected system behavior is understood and that monitoring exists to detect anomalies.