LLM Ops Databricks Specialist

Fractal

$120,000 - $145,000
Aug 19, 2025
CA, US

Fractal is looking for an LLMOps Engineer with Databricks expertise to operationalize large language models, implement scalable ML infrastructure, and drive innovation in AI/ML deployment practices.

Requirements

  • 6+ years of software development experience with strong programming skills in Python and SQL
  • 2+ years of hands-on experience with the Databricks platform, including MLflow, Delta Lake, and Spark
  • 1+ years of experience with machine learning operations, model deployment, and lifecycle management
  • Proficiency with at least one major cloud provider (AWS, Azure, or GCP) and its ML services
  • Experience with Docker, Kubernetes, and container orchestration for ML workloads
  • Strong experience in designing, building, and maintaining production-grade APIs for ML services
  • Proficiency with Git, CI/CD pipelines, and DevOps practices

Responsibilities

  • Design, implement, and maintain end-to-end pipelines for LLM training, fine-tuning, validation, and deployment
  • Build and optimize scalable infrastructure for large language model operations using the Databricks platform
  • Deploy LLMs to production environments, including serverless deployments, with prompt management, observability, monitoring, scaling, and performance optimization
  • Design, develop, and maintain RESTful API endpoints for LLM inference and model interactions
  • Ensure API reliability, performance optimization, rate limiting, authentication, and comprehensive documentation
  • Develop and maintain CI/CD pipelines for model versioning, testing, and automated deployment
  • Implement comprehensive monitoring solutions for model performance, drift detection, and system health metrics

Other

  • Passion for learning new technologies, investigating cutting-edge techniques, and making informed technical decisions
  • Ability to design scalable, maintainable, and efficient systems
  • Demonstrated ability to quickly learn and adapt to new technologies and methodologies
  • Commitment to code quality, testing practices, and operational excellence
  • Excellent written and verbal communication skills for technical and non-technical audiences