Stand Together is seeking a Data Engineering Manager to operationalize support workflows, incident response protocols, and data observability practices across its data platforms, ensuring reliable and efficient data operations.
Requirements
- 5+ years of experience in data engineering, with exposure to DevOps and/or DataOps, operational support, and incident management.
- Strong understanding of data pipeline architecture, ETL/ELT processes, and cloud data tools (e.g., Snowflake, Databricks, Fivetran, dbt, AWS/GCP).
- Familiarity with observability tools and practices (e.g., logging, tracing, metrics).
- Experience with incident management frameworks (e.g., ITIL, SRE).
- Hands-on experience with data quality and data observability tools.
- Exposure to agile methodologies and DevOps practices.
- Passion for building reliable, scalable, and user-friendly data systems.
Responsibilities
- Manage a small team of data engineers focused on operational support, reliability, and platform engineering.
- Write production-grade code in support of data product development, collaborating directly with data engineers and architects.
- Design and implement scalable support processes for data pipelines and platforms.
- Own and evolve incident response playbooks, ensuring rapid and effective resolution.
- Lead efforts to enhance data observability using tools like Monte Carlo, Bigeye, or similar.
- Define and track key metrics for data health, latency, and reliability.
- Identify opportunities to automate repetitive support tasks and incident triage.
Other
- 1+ years of experience in a leadership role.
- Excellent problem-solving skills and a bias toward action.
- Strong communication and organizational skills.
- Enthusiasm to contribute to Stand Together's vision and principled approach to solving problems, and a commitment to stewarding our culture, which champions values including transformation and innovation, entrepreneurialism, humility, and respect.
- Ability to communicate effectively with business and technical stakeholders about incidents and their resolutions.