Hearst Newspapers (HNP) is investing heavily in digital experiences, data engineering, and machine learning to power next-generation news products. The company is looking for a Principal Data Engineer to architect and build data pipelines for production ML and real-time applications to support these initiatives.
Requirements
- Proficiency in Python and SQL, with proven experience with DBT, Airflow, and cloud data platforms (GCP preferred, AWS or Azure a plus).
- Deep understanding of data modeling, ELT/ETL frameworks, and streaming solutions (e.g., Spark, Daft, Flink, Pub/Sub, Kafka).
- Experience designing and optimizing complex data architectures, including ML pipelines, near-real-time analytics solutions, and very high-volume reporting applications.
Responsibilities
- Architect and build data pipelines for production ML and real-time applications (think graph-based recommendation engines, real-time customer scoring, and classification models for customer segmentation).
- Design and implement high-volume data ingestion, transformation, and orchestration solutions using Python, SQL, DBT, and Airflow on GCP.
- Drive the adoption of data processing tools such as Spark, Daft, Bedrock, Pub/Sub, Flink, or similar.
- Partner with data scientists to productionize ML pipelines, including packaging, deploying, and monitoring models.
- Build and maintain data pipelines to power data products and reporting tools.
- Own the architectural design, development, and maintenance of data pipelines.
- Design data solutions with an emphasis on high availability, low latency, and scalability.
Other
- Hybrid in New York City
- 6–10 years of professional data engineering experience, with a track record of building production-grade pipelines and real-time data applications.
- Comfortable working in a fast-paced, outcome-oriented environment where you’ll tackle complex problems and get your hands dirty.
- Excellent communication skills and ability to mentor other engineers while also collaborating effectively with non-technical stakeholders.
- Familiarity with consumer products and/or advertising data models is a plus, but we prioritize skills over background.