The company aims to eliminate inefficiencies in the medical equipment ordering process, which lead to a higher cost of care and poorer patient outcomes.
Requirements
- Minimum 1 year of professional experience building web applications with Python.
- Minimum 1 year of experience building or maintaining RAG or other AI-enhanced applications.
- Experience with automated evaluation and testing of LLM-powered workflows.
- Experience with LangSmith or similar tools for observability and evaluation.
- Familiarity with prompt engineering and prompt lifecycle management.
- Familiarity with LangGraph, AutoGen, or other agent orchestration frameworks.
- Prior professional experience working with Ruby on Rails.
Responsibilities
- Build and maintain production-grade Python applications, particularly around retrieval-augmented generation (RAG) and agentic systems.
- Collaborate with cross-functional teams to design and deploy AI features that improve user experience and operational efficiency.
- Write well-tested, maintainable code using tools like FastAPI, LangChain, and LangSmith (or similar frameworks).
- Develop and maintain golden datasets for regression testing and evaluation of LLM behaviors.
- Design and implement automated testing strategies for LLM applications, including evaluation harnesses, edge-case detection, and prompt regression tools.
- Contribute to architectural decisions, especially around model integration, orchestration, and performance.
- Participate in full-stack development efforts, including Rails or React when needed.
Other
- Minimum 5 years of professional software engineering experience.
- Strong communication skills and ability to work in an agile, collaborative team.
- Must reside in the U.S.; East Coast hours preferred.