Fractal is looking for an LLMOps Engineer with Databricks expertise to operationalize large language models, implement scalable ML infrastructure, and drive innovation in AI/ML deployment practices.
Requirements
- 6+ years of software development experience with strong programming skills in Python and SQL
- 2+ years of hands-on experience with the Databricks platform, including MLflow, Delta Lake, and Spark
- 1+ years of experience with machine learning operations, model deployment, and lifecycle management
- Proficiency with at least one major cloud provider (AWS, Azure, or GCP) and their ML services
- Experience with Docker, Kubernetes, and container orchestration for ML workloads
- Strong experience in designing, building, and maintaining production-grade APIs for ML services
- Proficiency with Git, CI/CD pipelines, and DevOps practices
Responsibilities
- Design, implement, and maintain end-to-end pipelines for LLM training, fine-tuning, validation, and deployment
- Build and optimize scalable infrastructure for large language model operations using the Databricks platform
- Deploy LLMs to production environments, including serverless deployments, with prompt management, monitoring and observability, scaling, and performance optimization
- Design, develop, and maintain RESTful API endpoints for LLM inference and model interactions
- Ensure API reliability and performance, and implement rate limiting, authentication, and comprehensive documentation
- Develop and maintain CI/CD pipelines for model versioning, testing, and automated deployment
- Implement comprehensive monitoring solutions for model performance, drift detection, and system health metrics
Other
- Passion for learning new technologies, investigating cutting-edge techniques, and making informed technical decisions
- Ability to design scalable, maintainable, and efficient systems
- Demonstrated ability to quickly learn and adapt to new technologies and methodologies
- Commitment to code quality, testing practices, and operational excellence
- Excellent written and verbal communication skills for technical and non-technical audiences