Amgen is looking to build and optimize its data infrastructure to empower data-driven decision making through advanced analytics and predictive modeling.
Requirements
- Proficient in SQL for extracting, transforming, and analyzing complex datasets from both relational and columnar data stores. Proven ability to optimize query performance on big data platforms.
- Proficient in leveraging Python, PySpark, and Airflow to build scalable and efficient data ingestion, transformation, and loading processes.
- Ability to learn new technologies quickly. Strong problem-solving and analytical skills.
- Experienced with SQL/NoSQL databases and with vector databases for large language model applications
- Experienced with data modeling and performance tuning for both OLAP and OLTP databases
- Experienced with Apache Spark, Apache Airflow
- Experienced with software engineering best practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps
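To illustrate the SQL extract-and-transform proficiency the requirements describe, here is a minimal, hypothetical sketch using Python's standard-library sqlite3 module; the table and column names (assay_results, study_id, etc.) are invented for illustration and stand in for the relational stores mentioned above.

```python
import sqlite3

# Hypothetical example: aggregate per-study statistics from a small
# relational table, the kind of extract/transform step the role describes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assay_results (study_id TEXT, sample_id TEXT, value REAL);
    INSERT INTO assay_results VALUES
        ('S1', 'a', 1.0), ('S1', 'b', 3.0), ('S2', 'c', 2.0);
""")

# Group and aggregate in SQL rather than in application code, pushing
# work down to the database engine.
rows = conn.execute("""
    SELECT study_id, COUNT(*) AS n_samples, AVG(value) AS mean_value
    FROM assay_results
    GROUP BY study_id
    ORDER BY study_id
""").fetchall()
print(rows)  # [('S1', 2, 2.0), ('S2', 1, 2.0)]
```

On a big data platform the same GROUP BY pattern would typically run through Spark SQL or a PySpark DataFrame instead of sqlite3.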
Responsibilities
- Building and optimizing data pipelines, data warehouses, and data lakes on the AWS and Databricks platforms.
- Managing and maintaining the AWS and Databricks environments.
- Ensuring data integrity, accuracy, and consistency through rigorous quality checks and monitoring.
- Maintaining system uptime and optimal performance.
- Working closely with cross-functional teams to understand business requirements and translate them into technical solutions.
- Exploring and implementing new tools and technologies to enhance ETL platform performance.
Other
- Master’s degree OR Bachelor’s degree and 2 years of computer science or engineering experience OR Associate’s degree and 6 years of computer science or engineering experience OR High school diploma / GED and 8 years of computer science or engineering experience
- Excellent communication and teamwork skills.
- Ability to work effectively with global, virtual teams
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.