SHEIN Technology LLC is seeking a Big Data Engineer I to support US-based Big Data operations, refactoring, and redesign while partnering with an established global organization. The role builds and deploys highly scalable data pipelines, adheres to software and data engineering best practices, and ensures the security and quality of data.
Requirements
- Building and optimizing large-scale, distributed data pipelines with Hive, Presto, Spark, or Flink.
- Data warehousing, including dimensional modeling, star/snowflake schema design, and normalization/denormalization strategies in large-scale data warehouses such as Amazon Redshift.
- Writing and optimizing complex SQL queries over large datasets in data warehouses, including joins, aggregations, and subqueries.
- Data storage solutions, including S3 on AWS.
- Cloud-native services, including AWS EMR and AWS S3.
- Using workflow orchestration tools, including Airflow, in a production environment to automate, schedule, monitor, and tune data pipelines.
Responsibilities
- Build and deploy highly scalable data pipelines that move and transform data while ensuring its security and quality.
- Optimize and maintain all domain-related data pipelines.
- Partner with data, security, infrastructure, and business teams across a cross-functional global organization to understand data needs.
- Implement comprehensive data quality checks to ensure high data quality.
- Implement and enforce data security policies and ensure compliance with relevant regulations and standards.
- Provide 24x7 on-call support on a rotational basis.
Other
- Master’s degree or a foreign equivalent in Applied Data Science, Computer Science, or a related field, plus 1 year of post-baccalaureate experience in the job offered or any related job title.