Job Board

Senior AI/ML Ops Engineer

CloudTech Innovations

Salary not specified
Sep 9, 2025
Dallas, TX, US

The company is looking to design, deploy, and scale enterprise-grade ML/AI systems that draw on Databricks expertise and AI engineering skills (LLMs, Hugging Face, LangChain, vector databases). The goal is to productionize models, implement automation, and enable scalable AI/ML pipelines.

Requirements

  • Strong hands-on expertise with Databricks (Delta Lake, MLflow, Unity Catalog, Spark).
  • Proficiency in Python and major ML/AI frameworks: TensorFlow, PyTorch, scikit-learn, XGBoost.
  • Experience with Hugging Face Transformers and LangChain for LLM pipelines.
  • Familiarity with vector databases: Pinecone, Weaviate, FAISS, Milvus, Chroma.
  • Strong knowledge of CI/CD, IaC (Terraform/CloudFormation), GitOps, Docker, Kubernetes.
  • Hands-on experience with cloud AI platforms: AWS SageMaker, Vertex AI, Azure ML.
  • Experience building RAG pipelines and vector DBs, and deploying GenAI applications at scale.

Responsibilities

  • Design, build, and manage end-to-end ML/AI pipelines on Databricks (Delta Lake, MLflow, Unity Catalog, Spark).
  • Deploy and optimize LLM-based applications using Hugging Face, LangChain, and vector databases (Pinecone, Weaviate, FAISS).
  • Implement CI/CD pipelines for ML/AI workflows using GitHub Actions, GitLab CI, or Jenkins.
  • Automate processes with Airflow, Prefect, or Kubeflow.
  • Monitor model performance, drift, and compliance using observability tools (Weights & Biases, Arize AI, Evidently AI).
  • Collaborate with Data Scientists to operationalize models built with TensorFlow, PyTorch, scikit-learn, XGBoost.
  • Scale workloads using Docker, Kubernetes (AKS/EKS/GKE) and integrate with cloud AI platforms (AWS SageMaker, GCP Vertex AI, Azure ML).

Other

  • Remote
  • 5–6 years of experience in designing, deploying, and scaling ML/AI systems.
  • 7–8 years of experience in AI/ML Ops, Data Engineering, or related roles.
  • Familiarity with ONNX Runtime, TensorRT, BentoML, or Seldon Core for model serving.
  • Exposure to Generative AI APIs (OpenAI, Anthropic, Cohere, Hugging Face Hub).
  • Prior work in regulated industries (finance, healthcare, insurance).