Crusoe's mission is to accelerate the abundance of energy and intelligence by crafting the engine that powers a world where people can create ambitiously with AI without sacrificing scale, speed, or sustainability. The company is seeking an engineer to architect, design, and develop cloud infrastructure management systems and platforms that deliver end-to-end (E2E) use cases and workflows for a vertically integrated, AI-first Crusoe Cloud. The role is crucial to building the systems and platforms that efficiently plan, monitor, deploy, and operate Crusoe Cloud and deliver on key business revenue metrics.
Requirements
- 10+ years of experience building and operating distributed systems at scale.
- Proven experience with building reliable, scalable, efficient, and secure cloud platforms and systems and effectively running them in production environments.
- Fluency in programming languages such as Go, Rust, Java or C++.
- Understanding of cloud security best practices and the ability to implement secure configurations.
- Excellent troubleshooting and problem-solving skills to tackle complex infrastructure issues.
- Hands-on experience deploying, managing, and troubleshooting Kubernetes clusters.
- Experience working in a fast-paced startup environment.
Responsibilities
- Collaborate extensively across teams to architect, design, and implement physical infrastructure management software systems, availability platforms, and frameworks that meet the E2E use cases of the customers hosted on our AI infrastructure and provide the best customer experience.
- Champion the reliability, scalability, and security of our systems and platforms – you’ll be the guardian of our infrastructure!
- Develop workflows to drive efficiency and meet key business objectives and metrics.
- Design and implement high-performing, highly available cloud architectures that optimize for both performance and cost-effectiveness.
- Streamline cloud deployment, configuration management, and operations by developing and maintaining effective platforms, interfaces, and automation tooling.
- Actively contribute to the evolution of our platform, collaborating closely with cross-functional development teams to ensure smooth integration and deployment.
- Evaluate, implement, and build platforms, tools, and frameworks hands-on, focusing on reliability, scalability, operational efficiency, and ease of use.
Other
- A Bachelor’s degree in Computer Science or Software Engineering, and 10+ years of relevant experience.
- A collaborative approach (platform mindset) to working with development and operations teams to build and maintain a robust platform and effectively drive adoption.
- Excellent communication skills.
- Embody the Company values.
- A passion for building energy-first, scalable AI infrastructure.
- A passion for sustainability and innovation.