Assembled is looking for a software engineer to join its Infrastructure team to build and operate the core systems powering its rapidly growing AI agent platform for customer support. The platform automates support workflows and has seen significant ARR growth, which requires scaling its infrastructure to meet rising demand and enterprise security expectations.
Requirements
- Experience with distributed systems and container orchestration (especially Kubernetes)
- Experience with AI/ML platforms, or excitement to build foundational infrastructure for LLM-based applications
- AWS
- Kubernetes + Karpenter
- Experience with OpenAI, Anthropic, or open-source model serving (e.g., vLLM, HuggingFace TGI, Ray Serve)
- Vector databases (e.g., Pinecone, Weaviate, PGVector), semantic search, prompt templating systems
- Postgres + PgBouncer, Snowflake, Redis
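The requirements above mention vector databases and semantic search. As a rough illustration of the core idea behind those systems (nearest-neighbor lookup over embeddings by cosine similarity), here is a minimal sketch using a toy in-memory index in plain Python; the document IDs and embeddings are made up, and a real deployment would use Pinecone, Weaviate, or PGVector instead:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product normalized by vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryVectorIndex:
    """Toy stand-in for a vector database: stores (id, embedding) pairs
    and returns the top-k entries most similar to a query embedding."""

    def __init__(self):
        self.entries = []  # list of (doc_id, embedding)

    def upsert(self, doc_id, embedding):
        self.entries.append((doc_id, embedding))

    def query(self, embedding, top_k=3):
        scored = [(doc_id, cosine_similarity(embedding, vec))
                  for doc_id, vec in self.entries]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

# Hypothetical support-article embeddings (real ones come from an
# embedding model and have hundreds of dimensions).
index = InMemoryVectorIndex()
index.upsert("refund-policy", [0.9, 0.1, 0.0])
index.upsert("shipping-faq", [0.1, 0.9, 0.0])

results = index.query([0.8, 0.2, 0.0], top_k=1)
print(results[0][0])  # prints "refund-policy"
```

Production vector databases implement the same query contract but use approximate-nearest-neighbor indexes (e.g., HNSW) so lookups stay fast at millions of vectors.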
Responsibilities
- Manage and scale the infrastructure that serves LLM-powered agents across chat, email, and voice.
- Select inference strategies, integrate with model providers (e.g., OpenAI, Anthropic), and dynamically route traffic for performance and cost efficiency.
- Own highly-available, fast-access storage and indexing layers optimized for real-time AI interactions.
- Build systems like network-level intrusion detection (IDS/IPS), audit logging, and LLM usage policy enforcement.
- Operate systems that surface key metrics—token usage, latency, cost per response, and quality signals.
- Explore and evangelize the use of AI to accelerate internal engineering workflows.
- Work on foundational platform components that power real-time LLM usage at scale.
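One responsibility above is dynamically routing traffic across model providers for performance and cost efficiency. A minimal sketch of one such routing decision, choosing the cheapest provider that meets a latency SLO; the provider names, prices, and latencies below are illustrative assumptions, not real figures:

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    cost_per_1k_tokens: float  # USD, illustrative numbers only
    p95_latency_ms: float      # observed p95 latency, illustrative

def route(providers, max_latency_ms):
    """Pick the cheapest provider whose p95 latency meets the SLO.
    A real router would also weigh quality signals, rate limits,
    and live health checks."""
    eligible = [p for p in providers if p.p95_latency_ms <= max_latency_ms]
    if not eligible:
        raise RuntimeError("no provider meets the latency SLO")
    return min(eligible, key=lambda p: p.cost_per_1k_tokens)

# Hypothetical provider pool: two hosted APIs and a self-hosted vLLM tier.
providers = [
    Provider("openai-gpt-4o", 5.00, 900.0),
    Provider("anthropic-claude", 3.00, 1200.0),
    Provider("self-hosted-vllm", 0.50, 2500.0),
]

choice = route(providers, max_latency_ms=1500)
print(choice.name)  # prints "anthropic-claude": cheapest under the SLO
```

The same decision function extends naturally to per-request routing: chat traffic might carry a tight latency SLO while asynchronous email workflows tolerate the slower, cheaper self-hosted tier.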
Other
- 6+ years of engineering experience, with past ownership of high-scale, production-critical infrastructure
- Thrive in fast-paced environments with shifting requirements and ambiguous problem spaces
- Motivated by impact, energized by deep technical challenges, and eager to work cross-functionally across security, AI, and product
- Go and Python
- Datadog, Mezmo, CloudWatch, Buildkite, CircleCI