Discord needs to improve its data infrastructure to support its growing user base and gaming community.
Requirements
- 2+ years of experience building data pipelines in production, with deep knowledge of performant, scalable patterns
- 2+ years of experience in designing, developing, and maintaining robust data models from structured and unstructured sources
- 2+ years of experience writing accurate and effective code in SQL and Python
- Experience implementing and monitoring audits for data quality with massive data sets (e.g. billions of rows)
- Experience with modern data storage and processing technologies (e.g. BigQuery SQL, Looker, Airflow, dbt, or similar)
- Experience with designing data architecture to power a variety of use cases, including experimentation
- Experience with advertising products and third-party data ingestion is a strong plus
Responsibilities
- Create and maintain data pipelines and foundational datasets to support analytics, modeling, experimentation, and product/business needs
- Design and build database architectures for massive and complex data, balancing ergonomic benefits with computational load and cost
- Develop audits for data quality at scale, implementing alerting and anomaly detection as necessary
- Create scalable dashboards and reports to support business objectives and enable data-driven decision making
- Partner with data scientists, engineers, and product teams to accomplish data-related tasks
- Improve the coverage, accuracy, and reliability of instrumentation
- Proactively identify opportunities to improve ETL and dashboard performance and cost
Other
- 2+ years of relevant experience
- A desire to work with amazing, passionate people who care deeply about solving challenging problems to improve Discord
- A collaborative attitude and a healthy dose of natural curiosity
- Excellent communication skills and the ability to thrive in ambiguous environments where problems are not well defined and evolve quickly
- US-based candidates only
- Passion for Discord or online communities is a plus