Tonic.ai is looking for a Machine Learning Engineer to build production-grade NLP systems that power its data privacy and information extraction products. The role focuses on developing and fine-tuning models that detect and redact sensitive information across diverse datasets.
Requirements
- Proficiency in Python and deep learning frameworks such as PyTorch and Hugging Face Transformers
- Hands-on experience with experiment tracking (e.g., Weights & Biases), distributed training (e.g., Accelerate), and model serving (e.g., vLLM)
- Experience with supervised and reinforcement learning fine-tuning (e.g., TRL)
- Familiarity with data privacy, PII redaction, or healthcare data
Responsibilities
- Build and ship models. Fine-tune and evaluate transformer-based models (e.g., RoBERTa, Gemma, LLaMA) to support PII redaction, entity extraction, and synthetic data generation.
- Own the ML lifecycle. You’ll own the full path from prototype to production, from dataset curation and experiment tracking to model deployment and monitoring.
- Collaborate cross-functionally. Partner with Product and Design to shape how ML models drive user-facing features, and work with the broader engineering team to integrate them into scalable systems.
- Experiment responsibly. Document your experiments, evaluate results rigorously, and help push the frontier of safe and explainable AI for data privacy.
Other
- 3+ years of professional experience in applied ML or data science with a focus on NLP
- Comfort working independently and iterating quickly; you enjoy the mix of research, engineering, and product thinking
- Strong communication and collaboration skills
- A public portfolio, blog, or open-source contributions that demonstrate your technical depth and curiosity
- Remote-friendly work environment