Reddit is looking for a Software Engineer III to work on their Compute team, focusing on platform engineering and cluster engineering to support multi-cloud, multi-region deployments and optimize intra-cluster performance, efficiency, and stability. The goal is to build and maintain the foundational platform for running Reddit's infrastructure, impacting hundreds of millions of users.
Requirements
- 3+ years of experience developing internet-scale software, preferably in the context of infrastructure.
- Language proficiency in Go.
- Experience developing on top of Kubernetes or similar distributed systems.
- Kubernetes controller or operator development experience is a huge plus.
- Proficiency operating Linux with a solid understanding of cgroups, namespaces, and other multi-tenancy primitives.
- Strong troubleshooting capabilities surrounding both systems and software.
- Prior experience engineering large-scale distributed systems with a focus on automation and platform engineering is a plus, especially in a role that required being on call.
Responsibilities
- Software automation that creates, manages, and destroys clusters in our fleet.
- APIs and controllers that support multi-cluster deployment and scheduling mechanics.
- Core SDKs that enable controller development in the larger organization.
- Software that codifies out-of-cluster ancillary concerns such as network configurations and managed services.
- Detection of node-level performance characteristics, and availability decisions based on that data.
- Schedulers that support more efficient packing of resources along with reactive rescheduling on the basis of changing compute availability.
- Kubernetes controllers that offer APIs in the cluster and perform reconciliation to reach a desired state.
Other
- Work collaboratively with a team of software engineers to create and maintain the foundational platform for running Reddit’s infrastructure.
- Contribute feedback to the technical and strategic direction of the compute platform.
- Share on-call responsibilities with the Compute team.
- Excellent communication skills to collaborate with a service-oriented team and company.
- Reddit is proud to be an equal opportunity employer, and is committed to building a workforce representative of the diverse communities we serve.