Coursedog's Platform team needs to architect and deliver AI infrastructure for Intelligent Academic Operations, enabling product pods to rapidly build and ship AI-powered capabilities.
Requirements
- Proven track record in AI/ML systems development and deployment at scale.
- Hands-on experience with orchestration and lifecycle tools (LangSmith, Airflow, MLflow, Kubeflow, or similar).
- Strong backend expertise in Node.js, JavaScript, TypeScript, and REST API design.
- Experience building modern SPAs with Vue 3.
- Solid understanding of relational and document databases (PostgreSQL, MongoDB).
- Ability to design frameworks and tooling that abstract complexity and empower other teams.
- Experience with observability platforms (Prometheus, Grafana, OpenTelemetry) for AI systems.
Responsibilities
- Build and maintain reusable AI services, APIs, and orchestration pipelines (e.g., LangSmith, Airflow, MLflow) to support cross-team AI initiatives.
- Provide tooling, SDKs, and documentation so product teams can easily integrate AI features into their modules without deep ML expertise.
- Implement robust workflows for data ingestion, model training, evaluation, deployment, and monitoring at scale.
- Build backend APIs in Node.js and deliver reference frontends in Vue to demonstrate and validate platform capabilities.
- Partner with product pod engineers to guide AI adoption best practices, debug integration issues, and promote platform consistency.
- Influence system design for scalability, performance, and maintainability across multiple product lines.
Other
- Unlimited Paid Time Off policy
- Remote-first since inception
- All employees are granted equity in the company
- Primary caregivers are eligible for 12 weeks of paid leave
- Secondary caregivers receive 6 weeks of paid leave