10a Labs is looking to deploy, monitor, and scale a real-time ML-powered content moderation system to detect and triage abuse, threats, and edge-case language.
Requirements
- Has 3–8 years of experience deploying machine learning systems or high-availability backend systems.
- Has shipped and maintained production infrastructure at scale, supporting ML workflows.
- Has experience with GCP, AWS, or similar platforms (including managed ML services).
- Is proficient in Terraform, Docker, Kubernetes, or similar infra tools.
- Understands performance tradeoffs in serving models and embedding search pipelines.
- Has experience with vector databases or ANN (approximate nearest neighbor) systems, preferably on GCP or AWS.
- Has experience serving LLMs or embedding-based models via API.
Responsibilities
- Design and maintain cloud infrastructure (GCP or AWS) to support real-time model serving, data ingestion, and evaluation workflows.
- Deploy and optimize APIs for low-latency access to ML models and embedding search systems.
- Manage and optimize the end-to-end training data flow, from sourcing and cleaning datasets to preparing them for model consumption, ensuring accuracy, scalability, and efficiency.
- Build observability tooling for production ML pipelines, monitoring latency, error rates, request volumes, and drift.
- Automate model deployment, retraining, and evaluation pipelines (CI/CD for ML).
- Work with ML engineers to package models for serving.
- Help manage vector databases and semantic search infrastructure (e.g., Pinecone, FAISS, Vertex Matching Engine).
Other
- Can work cross-functionally with ML, security, and product teams to deploy safely and iterate fast.
- Brings a builder's mindset and bias for ownership in ambiguous environments.
- Fully remote, U.S.-based.
- Salary Range: $130K–$230K, depending on experience and location.
- 10a Labs is committed to building an inclusive, equitable workplace where diverse backgrounds, experiences, and perspectives are valued.