Airtable aims to ensure that its engineers have the tools necessary to measure performance, monitor reliability, and debug issues in real time, especially as it scales its observability systems and extends them to AI and LLM features.
Requirements
- 6+ years of software engineering experience, with 3+ years focused on observability or infrastructure at scale
- Demonstrated success implementing and running production-grade logging, metrics, or tracing systems
- Proficiency in distributed systems concepts, data streaming pipelines, and container orchestration (Kubernetes)
- Deep hands-on knowledge of tools such as Prometheus, Grafana, Datadog, OpenTelemetry, ELK Stack, Loki, or ClickHouse
- Comfort with at least one programming language (e.g., Go, Python, Java) to build and maintain observability tooling
- Experience mentoring engineers and collaborating across multiple teams
- Strong communication skills to effectively present technical trade-offs and architectural plans
Responsibilities
- Lead the design and evolution of logging, metrics, and tracing pipelines to handle massive data volumes
- Evaluate and integrate new technologies (e.g., OpenTelemetry, ClickHouse, ELK Stack) that enhance Airtable’s observability posture
- Guide and mentor a growing team of infrastructure engineers; share best practices in distributed tracing, monitoring, and logging
- Define and uphold coding standards and operational excellence across the engineering organization
- Partner with Deploy Infrastructure, Service Orchestration, and Product teams to embed observability throughout the development lifecycle
- Own end-to-end reliability for observability tools and establish SLAs, SLOs, and error budgets
- Optimize performance and cost of large-scale data pipelines and storage
Other
- Eagerness to own high-impact initiatives from design through production and maintenance
- Proven ability to balance short-term fixes with long-term strategic vision
- A passion for enabling Airtable’s entire engineering organization through reliable, intuitive observability tools
- Commitment to measuring success by the velocity and confidence with which product teams can ship
- Willingness to travel when the work location requires it