SmartEquip is modernizing its core infrastructure and rebuilding it with a forward-looking, AI-first approach. This role balances building a modern cloud-native stack with operational ownership of a legacy environment slated for modernization, while ensuring the stability, scalability, and security of the cloud platform.
Requirements
- Deep, hands-on knowledge of a major cloud provider (AWS or GCP).
- Extensive experience building and running production systems on Kubernetes using Terraform.
- Expert-level proficiency with Linux.
- Strong programming skills in Python.
- A deep understanding of the Java ecosystem, including experience with both modern Spring frameworks and legacy enterprise systems.
- A strong command of SRE principles, including experience with SLOs, error budgets, and building robust monitoring and alerting systems.
- Skilled at troubleshooting complex distributed systems under pressure.
Responsibilities
- Manage and grow a team of software engineers focused on the core runtime environment.
- Lead the design, implementation, and governance of our cloud infrastructure using Infrastructure as Code principles.
- Champion the practical application of Generative AI to solve real-world platform challenges.
- Apply Site Reliability Engineering (SRE) principles to improve the performance, availability, and security of our production systems.
- Own the monitoring and observability stack, lead incident response, and drive a culture of blameless post-mortems.
- Champion and enforce best practices for documentation and coding standards, ensuring the team's work is accessible, understandable, and trusted by the rest of the engineering organization.
- Assume direct ownership of the operational health, security, and maintenance of our legacy stack.
Other
- Support your engineers' technical and professional growth through mentoring and career guidance, fostering a culture of operational excellence and proactive problem-solving.
- Proven success in managing a team of software or infrastructure engineers. You have a background as a hands-on backend engineer and have successfully transitioned into leadership.
- While deep AI/ML experience isn't required, you have genuine curiosity about, and hands-on familiarity with, modern GenAI tools and concepts. You are the kind of person who is already exploring how AI can augment and accelerate software engineering.
- You build strong, collaborative partnerships with the Platform Delivery team and other engineering leaders at SmartEquip.
- No formal education requirement.