The Baseball Data Platform team at MLB aims to capture the story of baseball through its data by supporting customers, improving data quality, and delivering insights from play-by-play and Statcast tracking data.
Requirements
- 5+ years of experience with data related to baseball
- Substantive knowledge of SQL or experience with a similar database query language
- Experience using R and/or Python for data analysis in a professional setting
- Knowledge of data visualization software such as Looker or Tableau is preferred
- Familiarity with baseball tracking systems, especially Statcast, is preferred
- Familiarity with data pipeline tools and best practices for scheduled processing and ETL/ELT is a plus
Responsibilities
- Investigate potential data quality issues both proactively and reactively
- Lead large data initiatives focused on strategic R&D goals, including scaling the alerting on, and resolution of, outlier data
- Implement high-impact process changes with our support team based on internal operations data and logging
- Develop comprehensive expertise in the Baseball Data Platform, including the schema, data lineage, and optimal usage patterns of all critical tables (e.g., pitch-by-pitch data, tracking data, and weather data)
- Build optimized queries and statistical reports/models, as appropriate, to identify trends and present findings to MLB and Club personnel
- Design and produce reports delivering insights
- Test/validate new data and metrics, and contribute input to their development
Other
- Experience communicating professionally in a customer support role with technical and non-technical users
- Critical thinking skills and the ability to apply analytical insights to create positive change
- Self-motivated to seek out previously unidentified problems, resolve them, and then scale the solutions
- Highly collaborative in nature and team-oriented
- Proactive communicator with internal and external stakeholders