Ensure the reliability, scalability, and performance of Point72’s Desktop Platform and critical Equities Applications.
Requirements
- Strong proficiency in Node.js and Angular, with a deep understanding of their ecosystems.
- Experience with cloud platforms such as OpenShift.
- Familiarity with containerization technologies like Docker and orchestration tools such as Kubernetes.
- Knowledge of CI/CD pipelines and tools like Jenkins, GitLab CI, or similar.
- Experience with monitoring and logging tools such as Datadog.
- Strong problem-solving skills and ability to troubleshoot complex systems.
Responsibilities
- Analyze system performance and implement improvements to optimize equities application operations.
- Implement monitoring and alerting solutions to proactively identify and resolve issues.
- Develop software services, libraries, and tools to automate processes, reduce toil, and improve reliability.
- Develop and maintain documentation for systems architecture and operational procedures.
- Conduct root cause analysis of incidents and implement solutions to prevent recurrence.
- Continuously improve system performance, reliability, and scalability through innovative solutions.
Other
- 5+ years of proven experience as a Site Reliability Engineer or in a similar role.
- Excellent communication skills and ability to work collaboratively in a team environment.
- Commitment to the highest ethical standards.