Job Board
LogoLogo

Get Jobs Tailored to Your Resume

Filtr uses AI to scan 1000+ jobs and finds postings that perfectly matches your resume

Together AI Logo

Senior Software Engineer, Observability

Together AI

$160,000 - $260,000
Dec 20, 2025
San Francisco, CA, US
Apply Now

Together AI is building the AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle, combining the fastest LLM inference engine with state-of-the-art AI cloud infrastructure. The AI Infrastructure team is looking to design, implement, and maintain robust distributed storage solutions and comprehensive observability platforms to power their generative AI platform.

Requirements

  • 5+ years of demonstrated experience in building large scale, fault tolerant, distributed systems and API microservices
  • Experience designing, analyzing and improving efficiency, scalability, and stability of various system resources
  • Demonstrated experience with building and operating high-performance and/or globally distributed microservice architectures across one or more cloud providers (AWS, Azure, GCP)

Responsibilities

  • Identify, design, and develop foundational backend services that power Together’s cloud platform
  • Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure
  • Partner with product teams to understand functional requirements and deliver solutions that meet business needs
  • Write clear, well-tested, and maintainable software and IaC for both new and existing systems
  • Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance
  • Participate in an on-call rotation to address critical incidents when necessary

Other

  • Excellent communication skills – able to write clear design docs and work effectively with both technical and non-technical team members