tvScientific is looking to implement robust data infrastructure to power its data-heavy business: building out core data pipelines, storing data in optimal engines and formats, and feeding machine learning models.
Requirements
- At least 3 years of proven experience building data infrastructure using Spark with Scala.
- Familiarity with data lakes, cloud warehouses, and storage formats.
- Strong proficiency in AWS services.
- Expertise in SQL for data manipulation and extraction.
- Familiarity with Amazon EMR (Elastic MapReduce).
- Familiarity with table formats such as Apache Iceberg and Delta Lake.
Responsibilities
- Design and implement robust data infrastructure using Spark with Scala.
- Build out our core data pipelines, store data in optimal engines and formats, and feed our machine learning models.
- Leverage and optimize AWS resources.
- Collaborate closely with the Data Science team.
- Implement automated data quality checks.
- Design data solutions that meet business needs.
Other
- Minimum of 5 years of full-time experience in data engineering.
- Excellent written and verbal communication skills.
- Experience in adtech.
- Previous experience building out a Data Engineering function.
- Proven experience working closely with Data Science teams on machine learning pipelines.