Principal Engineer MLOps - DLP Detection

Palo Alto Networks

Salary not specified
Sep 30, 2025
Santa Clara, CA, USA

Palo Alto Networks is looking for a Principal Engineer to lead the design, development, and operation of production-grade machine learning infrastructure at scale. The role involves architecting robust pipelines, deploying and monitoring ML models, and ensuring reliability, reproducibility, and governance across the company's AI/ML ecosystem.

Requirements

  • Strong programming skills (Python, Go, or Java) with deep expertise in building production systems
  • Experience with cloud platforms (AWS, GCP, Azure) and container orchestration (Kubernetes, Docker)
  • Proven experience in ML infrastructure: model serving (TensorFlow Serving, TorchServe, Triton) and workflow orchestration (Airflow, Kubeflow, MLflow, Ray, Vertex AI, SageMaker)
  • Hands-on experience with CI/CD pipelines, infrastructure-as-code (Terraform, Helm), and monitoring/observability tools (Prometheus, Grafana, ELK/EFK stack)
  • Strong knowledge of data pipelines, feature stores, and streaming systems (Kafka, Spark, Flink)
  • Understanding of model monitoring, drift detection, retraining pipelines, and governance frameworks

Responsibilities

  • Lead MLOps architecture: Design and implement scalable ML platforms, CI/CD pipelines, and deployment workflows across cloud and hybrid environments
  • Operationalize ML models: Build automated systems for training, testing, deployment, monitoring, and rollback of ML models in production
  • Ensure reliability and governance: Implement model versioning, reproducibility, auditing, and compliance best practices
  • Drive observability & monitoring: Develop real-time monitoring, alerting, and logging solutions for ML services that cover performance, drift detection, and system health
  • Champion automation & efficiency: Reduce friction between data science, engineering, and operations by introducing best practices for infrastructure-as-code, container orchestration, and continuous delivery
  • Collaborate cross-functionally: Partner with ML engineers, data scientists, security teams, and product engineering to deliver robust, production-ready AI systems
  • Lead innovation in MLOps: Evaluate and introduce new tools, frameworks, and practices that elevate the scalability, reliability, and security of ML operations

Other

  • In office 3 days a week. Not a remote role.
  • Ability to influence cross-functional stakeholders, define best practices, and mentor engineers at all levels
  • Passion for operational excellence, scalability, and securing ML systems in mission-critical environments