Hamming AI is tackling the reliability and scalability of its LLM-enabled platform for voice AI agents: automating QA for voice AI, securing these agents, and delivering crisp bug reports and production-grade analytics.
Requirements
- Have senior/staff experience running distributed backends with real-time/streaming constraints.
- Are fluent in TypeScript/Node.js and comfortable jumping into Python for ML/audio jobs.
- Know Temporal (or similar workflow engines), queues, Redis, and PostgreSQL (see the workflow sketch after this list).
- Have shipped production LLM apps and understand prompt/tool design, evals, and guardrail instrumentation.
- Operate cloud-native on AWS with Terraform; k8s doesn’t scare you.
- Are a power user of Cursor/Zed/Devin and were using code-gen before it was cool.
- Have intuition for what current-gen LLMs can/can’t do—and what tomorrow’s models will unlock.
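For flavor, here is a minimal sketch of the kind of Temporal workflow this role would own, written against the Temporal TypeScript SDK. The activity names (`transcribeCall`, `scoreTranscript`) and the workflow itself are illustrative assumptions, not Hamming's actual code; the pattern of durable orchestration with automatic retries is the point.

```typescript
// Minimal Temporal workflow sketch (TypeScript SDK). Activity names are
// hypothetical; durable orchestration with retries is what the role involves.
import { proxyActivities } from '@temporalio/workflow';

// Assumed activity interface: transcribe a call recording, then score it.
interface CallActivities {
  transcribeCall(recordingUrl: string): Promise<string>;
  scoreTranscript(transcript: string): Promise<number>;
}

const { transcribeCall, scoreTranscript } = proxyActivities<CallActivities>({
  startToCloseTimeout: '5 minutes',
  retry: { maximumAttempts: 3 }, // Temporal retries failed activities for us
});

// Workflow function: its state is durable, so it survives worker restarts.
export async function evaluateCallWorkflow(recordingUrl: string): Promise<number> {
  const transcript = await transcribeCall(recordingUrl);
  return scoreTranscript(transcript);
}
```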
Responsibilities
- Own core services in TypeScript/Node.js and Python that orchestrate LiveKit, Temporal, STT/TTS, and LLM tooling for real-time voice agents.
- Scale 1 → N → 100×: take what works today and harden it for 10K parallel calls with 99.99% uptime. Turn human playbooks into productized systems.
- Harden pipelines for ingestion, evaluation, and analytics so telephony events, recordings, and outcomes propagate reliably across services.
- Level-up observability: deepen OpenTelemetry/SigNoz and trace-first practices to shrink mean-time-to-truth in prod (see the tracing sketch after this list).
- Prototype → test → prod: partner with product to ship new LLM-driven behaviors with clear success metrics, guardrails, and regressions blocked in CI.
- Infrastructure readiness: CI/CD, environment automation, incident response playbooks, so customer conversations stay online.
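As a hedged illustration of trace-first practice, this is roughly what span-per-event instrumentation looks like with the OpenTelemetry Node.js API; the span and attribute names here are assumptions for the sketch, not Hamming's conventions.

```typescript
// Sketch: wrap each telephony event in a span so failures are traceable
// end-to-end. Span and attribute names are illustrative assumptions.
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('voice-pipeline');

async function handleTelephonyEvent(callId: string): Promise<void> {
  await tracer.startActiveSpan('ingest.telephony-event', async (span) => {
    try {
      span.setAttribute('call.id', callId);
      // ...fan the event out to evaluation and analytics services here
    } catch (err) {
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}
```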
Other
- Full-time (no contractors)
- Remote (North America) or Austin, TX
- Think independently, grind with customers, and do whatever it takes without dropping the quality bar.
- Outcomes over output: we adjust roadmaps when new data lands.
- Demo early and document decisions so context moves fast.