CUNY SPS seeks a Senior Data Engineer to support NYC Opportunity's government-led research on poverty, income inequality, and social mobility. The engineer will design, build, and maintain robust data pipelines and infrastructure that enable high-quality empirical research, with a primary focus on managing data collection, integration, and transformation for the NYCgov poverty measure.
Requirements
- Extensive experience with AWS cloud services, in particular S3, Redshift, Lambda, ETL services such as Data Pipeline or Glue, a CI/CD service (CodePipeline), and CodeCommit.
- Proficiency in scripting languages, including R, Python, and Bash, is required.
- Certifications in cloud platforms, such as AWS Certified Solutions Architect or Data Engineer, are preferred.
- Knowledge of and experience with ML engineering are a plus.
- Experience with database technologies such as MySQL, PostgreSQL, MongoDB, and Amazon Redshift.
- Experience with a version control system (preferably Git, hosted on GitHub).
- Familiarity with administrative data or large datasets such as the American Community Survey, Current Population Survey, or the NYC Housing and Vacancy Survey.
Responsibilities
- Develop data architecture solutions to enable reproducible research and support long-term storage of historical and current data sets.
- Design and implement scalable ETL processes for large datasets from surveys, administrative sources, and third-party providers.
- Implement data validation routines to ensure data quality, accuracy, and reliability standards are met.
- Coordinate and manage data across income components, and maintain version control of poverty metrics.
- Streamline the production of poverty research data to ensure fast access to metrics.
- Build and maintain various indicator data pipelines with city agencies and external partners, ensuring efficient data flow and integrity.
- Own the development and maintenance of data pipelines, including those for new data on economic security and risks.
Other
- Must be able to work both independently and in a collaborative setting.
- Excellent verbal and written communication skills.
- Ability to work under occasional deadline pressure.
- Proven experience in architecting and managing data systems is required.
- Must have a Bachelor’s degree in computer science, engineering, information technology, data science, or a related field, plus a minimum of four years of full-time experience either in a data engineering role designing, building, and managing scalable and reliable data systems, or in a cloud engineering role specializing in cloud architecture, automation, and cloud software development, building and maintaining software features and functions, databases, and applications for cloud technologies.