Sierra is creating a platform to help businesses build better, more human customer experiences with AI. The company needs to ensure its infrastructure is secure, reliable, and scalable to support the rapid growth of its AI platform and LLM inference serving.
Requirements
- Proven experience with cloud platforms (AWS, GCP, or Azure) and infrastructure as code (Terraform preferred).
- Hands-on expertise in CI/CD systems, release management, and container orchestration (e.g., Docker, Kubernetes).
- Experience with observability tools (Prometheus, Grafana, Datadog, OpenTelemetry, etc.).
- Experience in incident response and operating distributed systems in production.
- Production experience working with LLMs and machine learning models.
- Background in distributed systems, running SaaS services at scale, and agentic architecture.
- Familiarity with security and authentication protocols (OAuth, SSO, mTLS).
Responsibilities
- Ensure the reliability, scalability, and performance of our platform and LLM inference serving as we rapidly grow traffic.
- Build and maintain cloud infrastructure using Terraform to ensure scalable, secure, and reproducible environments.
- Create and maintain a self-serve infrastructure platform that enables the rest of engineering to deploy and operate services.
- Own and evolve CI/CD pipelines and release management, enabling fast, reliable deployments for Sierra’s platform.
- Architect and operate distributed systems that leverage distributed databases, retrieval systems, and ML models.
- Develop and maintain core data serving abstractions along with authentication and security features (SSO, RBAC, authentication controls).
- Enhance observability tooling (metrics, logging, tracing) to provide deep visibility into platform health and performance.
Other
- Strong software engineering background with 5–7+ years of hands-on development experience on highly technical products.
- A strong inclination toward building automation, tooling, and platforms, along with designing maintainable systems.
- Degree in Computer Science or a related field, or equivalent professional experience.
- Previous experience in a fast-paced startup environment or on a platform/infra-focused team.
- Values: Trust, Customer Obsession, Craftsmanship, Intensity, and Family.