This role builds privacy-preserving large language model (LLM) capabilities that support hardware design workflows involving Verilog/SystemVerilog and RTL artifacts. It enables advanced use cases such as code generation and refactoring, lint explanation, constraint translation, and spec-to-RTL assistance, all while operating within strict enterprise security and data-privacy boundaries.
Requirements
- 3+ years of hands-on experience with transformers or LLMs.
- Deep expertise with PyTorch, Hugging Face (Transformers, PEFT, TRL), and distributed training frameworks (DeepSpeed, FSDP).
- Experience with quantization-aware fine-tuning, constrained decoding, and evaluation of code-generation models.
- Hands-on AWS experience, including:
  - Amazon Bedrock (model usage, customization, Guardrails, runtime APIs, VPC endpoints)
  - SageMaker (Training, Inference, Pipelines)
  - Core services: S3, EC2/EKS, IAM, KMS, VPC, CloudWatch, CloudTrail, Secrets Manager
- Solid software engineering fundamentals: testing, CI/CD, observability, and performance optimization.
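To make the constrained-decoding requirement concrete, here is a toy sketch of grammar-guided decoding: at each step the candidate tokens are restricted to those a tiny, invented Verilog-ish grammar allows, and a greedy choice is made among them. The grammar, token names, and scorer are all illustrative, not part of any production system.

```python
# Each grammar state maps an allowed next token to the resulting state.
GRAMMAR = {
    "start": {"module": "name"},
    "name":  {"counter": "body", "adder": "body"},
    "body":  {"endmodule": "done"},
    "done":  {},  # accepting state: nothing further may be emitted
}

def constrained_decode(score_fn, max_steps=10):
    """Greedy decode that masks out any token the grammar disallows.

    `score_fn` stands in for the model's per-token logits.
    """
    state, out = "start", []
    for _ in range(max_steps):
        allowed = GRAMMAR[state]
        if not allowed:
            break
        # Pick the highest-scoring token among the allowed set only.
        token = max(allowed, key=score_fn)
        out.append(token)
        state = allowed[token]
    return out

# Stand-in scorer preferring shorter tokens (pure illustration).
result = constrained_decode(lambda t: -len(t))
```

In a real HDL pipeline the hand-written state machine would be replaced by a parser-derived automaton over the tokenizer's vocabulary, but the masking step is the same idea.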
Responsibilities
- Own the technical roadmap for RTL-focused LLM capabilities, from model selection and fine-tuning through deployment and continuous improvement.
- Fine-tune and customize transformer models using modern techniques such as LoRA/QLoRA, PEFT, instruction tuning, and preference optimization (RLAIF).
- Design and operate HDL-aware evaluation frameworks, including:
  - Compile, lint, and simulation pass rates
  - Pass@k metrics for code generation
  - Constrained/grammar-guided decoding
  - Synthesis-readiness checks
- Build and maintain secure, privacy-first ML pipelines on AWS, including:
  - Amazon Bedrock for managed foundation models
  - SageMaker and/or EKS for bespoke training and inference
  - Encrypted storage (S3 + KMS), private VPCs, IAM least privilege, CloudTrail auditing
- Deploy and operate low-latency, production inference using Bedrock and/or self-hosted stacks (vLLM, TensorRT-LLM), with autoscaling and safe rollout strategies.
- Establish a strong evaluation and MLOps culture with automated regression testing, experiment tracking, and model documentation.
- Drive product integration with internal developer tools, CI workflows, IDE plug-ins, retrieval-augmented generation (RAG), and safe tool-use.
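The pass@k metric named in the evaluation responsibilities is conventionally computed with an unbiased estimator: given n generated samples of which c pass the checks (e.g., compile and simulate correctly), it estimates the probability that at least one of k randomly drawn samples passes. A minimal sketch, with example numbers chosen purely for illustration:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: P(at least one of k samples drawn from
    n generations is correct), where c of the n pass all checks."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g., 10 RTL generations per prompt, 3 pass compile + simulation:
p1 = pass_at_k(10, 3, 1)  # expected fraction passing on a single try
p5 = pass_at_k(10, 3, 5)
```

Averaging this quantity over a benchmark of prompts gives the headline pass@k score for a code-generation model.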
Other
- This is a staff-level role that provides technical leadership.
- Lead and mentor a small team of applied ML engineers and scientists; review designs and code, remove technical blockers, and drive execution.
- Partner with hardware engineering, EDA, security, and legal stakeholders to ensure compliant data sourcing, anonymization, and governance.
- Mentor engineers on LLM best practices, reproducible experimentation, and secure system design.
- Proven experience shipping LLM-powered features to production and leading cross-functional technical initiatives.
- Excellent communication skills and the ability to influence both technical and executive stakeholders.