Salesforce is looking to solve the problem of innovating on and maintaining a large-scale distributed systems engineering platform that ships hundreds of features to production every day for tens of millions of users across all industries, with a focus on reliability, speed, security, and the preservation of customizations.
Requirements
- Deep knowledge of programming in Java, Golang, or Python
- Experience owning and operating multiple instances of a critical service
- Experience with Agile development methodology and Test Driven Development
- Experience with critical infrastructure services, including monitoring, alerting, logging, and reporting applications
- 4+ years of backend software development experience
- Deep experience with concurrency and large-scale systems, and proficiency in solving real-world data management challenges
- Strong understanding of how to craft solutions that are highly available
Responsibilities
- Delivering cloud infrastructure automation tools, frameworks, workflows, and validation platforms on public cloud platforms such as AWS, GCP, Azure, or Alibaba
- Designing, developing, debugging, and operating resilient distributed systems that run across thousands of compute nodes in multiple data centers
- Using and contributing to open source technology (Spinnaker, ZooKeeper, etc.)
- Developing Infrastructure-as-Code using Terraform
- Writing microservices on container platforms such as Kubernetes, Docker, or Mesos
- Resolving complex technical issues and driving innovations that improve system availability, resilience, and performance
- Participating in the team’s on-call rotation to address complex problems in real time and keep services operational and highly available
Other
- A related technical degree is required
- 4+ years of experience
- Ability to work in an Agile development environment
- Strong communication and collaboration skills
- Ability to participate in an on-call rotation