The company is looking for a Senior Data Engineer to lead the design and implementation of its data analytics infrastructure, ensuring data systems are robust, scalable, and aligned with organizational goals.
Requirements
- Hands-on experience with AWS services such as EMR, S3, Athena, and MSK.
- Proficiency in Python and PySpark for data engineering tasks.
- Deep knowledge of data storage formats like Parquet and Iceberg.
- Solid understanding of data streaming and real-time processing with Kafka.
- Proven ability to design and implement scalable data architectures.
- Familiarity with Kubernetes and relevant AWS practices.
- Experience with orchestration tools like Dagster.
Responsibilities
- Design end-to-end data solutions to support analytics, machine learning, and real-time data processing.
- Build and maintain ETL/ELT pipelines using PySpark, Apache Iceberg, and Dagster (Python).
- Optimize pipelines for performance, scalability, and cost-effectiveness.
- Manage and enhance AWS-based data infrastructure, including EMR, S3, Athena, and MSK (Kafka).
- Develop solutions that improve data storage and query performance, leveraging technologies like Parquet and Iceberg.
- Evaluate existing and new technologies to incorporate into our data platform.
- Troubleshoot complex data-related issues and ensure system reliability and data quality.
Other
- Collaborate with stakeholders to define technical strategies that optimize performance and processes and align data solutions with business needs.
- Mentor team members and promote best practices in data engineering.
- Work closely with data analysts and business teams to deliver data-driven solutions.
- Strong problem-solving skills and experience with big-picture decision-making.
- 5+ years of experience in data engineering or a related field.