GitLab is looking to evolve its strategic data platform to support enterprise-scale growth and innovation by architecting scalable, distributed solutions.
Requirements
- Experience architecting large-scale distributed data systems in complex, regulated domains with unified platforms integrating cloud-native compute, orchestration, and semantic modeling
- Hands-on expertise with modern data stack technologies including Python, Docker, Airflow, Trino, Postgres, distributed query engines, and graph-based metadata systems, and with integrating them into the GitLab ecosystem of Ruby on Rails and Go services
- Advanced knowledge bridging cloud and on-premises deployments, with a focus on automation, developer self-service, and data integration through connector marketplaces
- Deep understanding of data processing paradigms, including synchronous vs. asynchronous processing, schema management, and logical data modeling, and of open standards such as OpenTelemetry, OpenMetadata, and OpenLineage
- Experience with AI-driven architectures and emerging technologies including model orchestration, agentic patterns, and standards like MCP (Model Context Protocol)
- Strong architectural opinions on cost-aware, resilient solutions that optimize decisions across the entire data lifecycle, with a focus on scalability and performance trade-offs
- Passion for open source platforms, team mentorship, and collaborative values, with the ability to build scalable solutions that align with organizational culture and technical excellence
Responsibilities
- Drive the architectural vision for scalable, distributed data systems across SaaS and self-managed deployments, designing database stack solutions that meet OLTP and OLAP performance and scalability requirements
- Define enterprise data product standards and governance frameworks including data lineage, SDLC, versioning, and compliance practices for regulated environments
- Build governed, monetizable data services and APIs that support both internal operations/analytics and external SaaS product offerings with semantic structure
- Partner with product and engineering teams to embed modern agentic and AI-driven patterns into data infrastructure and customer-facing solutions
- Architect event-driven systems and cross-stack orchestration supporting hybrid transformations through tools like Argo, Airflow, and Kubernetes with unified metadata-rich telemetry
- Design end-to-end data lifecycle architecture covering integration, pipelines, transformation workflows, and consolidated metadata systems across multiple platforms
- Establish CI/CD best practices for data systems, ensuring reliable deployment, monitoring, and maintenance across diverse deployment models
Other
- Transform ambiguity into strategic roadmaps and lead complex technical engagements where data architecture creates competitive differentiation
- Demonstrated leadership building multi-modal data services with strong developer experience principles, focusing on monetization, governance, and data product lifecycle management
- The base salary range for this role’s listed level is currently for residents of listed locations only.
- GitLab is proud to be an equal opportunity workplace and is an affirmative action employer.