Together AI is building the AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle, combining the fastest LLM inference engine with state-of-the-art AI cloud infrastructure. The AI Infrastructure team is looking for an engineer to design, implement, and maintain robust distributed storage solutions and comprehensive observability platforms that power this generative AI platform.
Requirements
- 5+ years of demonstrated experience building large-scale, fault-tolerant distributed systems and API microservices
- Experience designing, analyzing, and improving the efficiency, scalability, and stability of system resources
- Demonstrated experience building and operating high-performance and/or globally distributed microservice architectures across one or more cloud providers (AWS, Azure, GCP)
Responsibilities
- Identify, design, and develop foundational backend services that power Together’s cloud platform
- Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure
- Partner with product teams to understand functional requirements and deliver solutions that meet business needs
- Write clear, well-tested, and maintainable software and infrastructure-as-code (IaC) for both new and existing systems
- Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance
- Participate in an on-call rotation to address critical incidents when necessary
Other
- Excellent communication skills: able to write clear design docs and work effectively with both technical and non-technical team members