ResMed SaaS Holdings, Inc. is looking to create and maintain optimal data pipeline architecture that meets functional and non-functional business requirements.
Requirements
- creating and running Python or PySpark jobs within AWS Glue
- building and working with AWS Data Lakes
- working with GitHub CI/CD processes
- big data tools, including Snowflake, Hadoop, Spark, and Kafka
- data pipeline and workflow management tools
- AWS cloud services, including EMR, RDS, and Redshift
- dbt, Coalesce, or similar ELT tools
Responsibilities
- Create and maintain optimal data pipeline architecture
- Assemble large, complex data sets that meet functional and non-functional business requirements
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability
- Build infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Python and AWS ‘big data’ technologies
- Work with stakeholders, including Executive, Product, Data, and Design teams, to assist with data-related technical issues and support data infrastructure needs
- Keep data separated and secure across national boundaries through multiple data centers and AWS regions
- Work on data tools for analytics and data science team members that assist in building and optimizing the product into an innovative industry leader
Other
- Master’s degree or equivalent in Computer Science, Statistics, Informatics, Information Systems, or a related quantitative field of study
- 4 years of experience in a full-cycle Data Engineering role
- 100% remote position, reporting to San Diego, CA
- comprehensive medical, vision, dental, life, AD&D, and short-term and long-term disability insurance
- Employees accrue three weeks of Paid Time Off (PTO) in their first year of employment