Design, build, and implement generalized, large-scale, sophisticated data pipelines using Apache NiFi that feed downstream analytics and data science for Sportradar's Sport Performance products.
Requirements
- Python, Java, Kafka, AWS, and Docker
- ETL development and data warehousing
- Analytics and debugging using SQL
- Designing data architecture
- Building data processors in Java within the NiFi framework (see the processor sketch after this list)
- Using Docker to ensure a consistent, repeatable, and isolated environment for software development and testing
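For illustration, here is a minimal sketch of the kind of custom NiFi data processor the role involves, written against the public org.apache.nifi processor API. The class name, the trivial uppercase transform, and the single "success" relationship are illustrative assumptions, not Sportradar's actual code.

```java
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

import java.nio.charset.StandardCharsets;
import java.util.Set;

public class UppercaseProcessor extends AbstractProcessor {

    // Downstream connection for FlowFiles this processor handled successfully.
    public static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("FlowFiles that were transformed")
            .build();

    @Override
    public Set<Relationship> getRelationships() {
        return Set.of(REL_SUCCESS);
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session)
            throws ProcessException {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return; // nothing queued for this processor yet
        }
        // Rewrite the FlowFile content in place; a trivial uppercase
        // transform stands in for a real conversion step.
        flowFile = session.write(flowFile, (in, out) -> {
            String content = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            out.write(content.toUpperCase().getBytes(StandardCharsets.UTF_8));
        });
        session.transfer(flowFile, REL_SUCCESS);
    }
}
```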
Responsibilities
- Design, build, and implement generalized, large-scale, sophisticated data pipelines using NiFi for downstream analytics and data science for our Sport Performance products.
- Design and develop scalable NiFi ingestion pipelines on AWS cloud services to consume real-time and batch data from external sources.
- Design and create data models for use throughout the ETL system (the transform sketch below includes a simple example).
- Utilize Kafka to efficiently and effectively store and move data throughout the data pipeline and to serve downstream data science and analytics usage (see the Kafka sketch after this list).
- Build the data transforms within the data pipeline that convert data from external to internal representations (see the transform sketch below).
- Conduct data analytics and debug bad data by writing SQL queries (see the SQL debugging sketch below).
- Build automated data cleaning that removes bad or unusable records before they reach downstream consumers, with logging to track the frequency and depth of the underlying issues (see the cleaning sketch below).
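A minimal sketch of moving a record through Kafka with the standard Java client (org.apache.kafka:kafka-clients). The broker address, the "sport-events" topic, and the payload are illustrative assumptions.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class EventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // try-with-resources flushes and closes the producer on exit.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by match id sends all events for a match to one
            // partition, preserving per-match ordering for consumers.
            producer.send(new ProducerRecord<>(
                    "sport-events", "match-1234", "{\"event\":\"goal\",\"minute\":57}"));
        }
    }
}
```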
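A minimal sketch of an external-to-internal transform, where a raw CSV line from a provider becomes a typed internal record; the SportEvent record doubles as a simple example of an internal data model. The field order and record shape are illustrative assumptions.

```java
public class EventTransform {
    // Hypothetical internal representation of an ingested event.
    record SportEvent(String matchId, int minute, String type) {}

    // Convert one external CSV row into the internal representation.
    static SportEvent fromCsv(String line) {
        String[] fields = line.split(",");
        return new SportEvent(fields[0].trim(),
                              Integer.parseInt(fields[1].trim()),
                              fields[2].trim());
    }

    public static void main(String[] args) {
        // e.g. a provider feed row: "match-1234, 57, goal"
        System.out.println(fromCsv("match-1234, 57, goal"));
    }
}
```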
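A minimal sketch of SQL-based data debugging, run here through plain JDBC; the connection URL, the events table, its columns, and the out-of-range check are all illustrative assumptions (a PostgreSQL driver on the classpath is also assumed).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DataDebugger {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/sports", "user", "pass");
             Statement stmt = conn.createStatement();
             // Count events whose minute falls outside a plausible range,
             // grouped by match, to locate where bad data is entering.
             ResultSet rs = stmt.executeQuery(
                     "SELECT match_id, COUNT(*) AS bad_rows " +
                     "FROM events WHERE event_minute < 0 OR event_minute > 130 " +
                     "GROUP BY match_id ORDER BY bad_rows DESC")) {
            while (rs.next()) {
                System.out.printf("%s: %d bad rows%n",
                        rs.getString("match_id"), rs.getLong("bad_rows"));
            }
        }
    }
}
```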
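A minimal sketch of automated record cleaning with issue logging, using only the JDK; the SportEvent record and the validation rules are hypothetical stand-ins for the real internal model.

```java
import java.util.List;
import java.util.logging.Logger;
import java.util.stream.Collectors;

public class EventCleaner {
    private static final Logger LOG = Logger.getLogger(EventCleaner.class.getName());

    // Hypothetical internal representation of an ingested event.
    record SportEvent(String matchId, int minute, String type) {}

    // Keep only records downstream consumers can use, logging each
    // rejection so the frequency of underlying issues can be tracked.
    static List<SportEvent> clean(List<SportEvent> events) {
        return events.stream()
                .filter(e -> {
                    boolean ok = e.matchId() != null && e.minute() >= 0 && e.type() != null;
                    if (!ok) {
                        LOG.warning("Dropping unusable event: " + e);
                    }
                    return ok;
                })
                .collect(Collectors.toList());
    }
}
```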
Other
- Position permits telecommuting from anywhere in the U.S.
- Master’s degree in computer science, computer engineering, or a closely related field
- One year of experience as a data engineer or in a related occupation
- Agile development environment
- Work in a self-driven, independent fashion to meet sport-driven deadlines.