Capgemini is looking to hire individuals who can help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world. This role specifically focuses on leveraging data and cloud technologies to achieve these goals.
Requirements
- In-depth knowledge of PySpark, including DataFrames, Datasets, RDDs, Spark Streaming, and Parquet partitioning concepts (see the illustrative sketch after this list).
- Experience with Databricks and AWS S3, AWS Athena, and AWS Glue.
- Strong skills in Python, PySpark, and SQL, with experience in data warehousing and ETL/ELT processes.
- Proficiency in cloud technologies, with experience in data governance and security.
- Data-engineering Python rather than application-development Python.
- Knowledge of version control (Git), CI/CD tools, and Agile methodologies.
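For illustration only (not an additional requirement), a minimal sketch of the kind of PySpark work the role involves: reading raw data from S3, cleansing it, and writing date-partitioned Parquet that Athena or Glue can then query. All bucket and column names here are hypothetical placeholders.

```python
# Minimal PySpark ETL sketch: S3 -> transform -> date-partitioned Parquet.
# Bucket and column names are hypothetical; s3:// paths as on EMR/Databricks.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Read raw JSON events from S3 (schema inference kept for brevity).
raw = spark.read.json("s3://example-raw-bucket/orders/")

# Basic cleansing plus a derived partition column.
orders = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount") > 0)
       .withColumn("order_date", F.to_date("created_at"))
)

# Write date-partitioned Parquet; Athena/Glue can catalog and prune by partition.
(orders.write
       .mode("overwrite")
       .partitionBy("order_date")
       .parquet("s3://example-curated-bucket/orders/"))
```

Partitioning on a date column like this is what lets downstream Athena queries scan only the relevant partitions instead of the full dataset.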
Responsibilities
- Design and build data pipelines using PySpark, applying DataFrames, Datasets, RDDs, Spark Streaming, and Parquet partitioning as appropriate (see the streaming sketch after this list).
- Develop and maintain workloads on Databricks and AWS, including S3, Athena, and Glue.
- Implement data warehousing and ETL/ELT processes in Python, PySpark, and SQL.
- Apply data governance and security practices across cloud environments.
- Work with version control (Git), CI/CD tools, and Agile methodologies in day-to-day delivery.
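Again purely for illustration, a sketch of the Spark Structured Streaming side of these responsibilities; the source path, schema, and checkpoint location are all assumed.

```python
# Minimal Structured Streaming sketch: JSON files landing in S3,
# aggregated hourly and written out as Parquet. All paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Streaming reads require an explicit schema.
schema = (StructType()
          .add("order_id", StringType())
          .add("amount", DoubleType())
          .add("created_at", TimestampType()))

# Pick up new files as they arrive in the raw zone.
stream = spark.readStream.schema(schema).json("s3://example-raw-bucket/orders-stream/")

# Hourly revenue per event-time window; the watermark bounds late data.
hourly = (stream
          .withWatermark("created_at", "1 hour")
          .groupBy(F.window("created_at", "1 hour"))
          .agg(F.sum("amount").alias("revenue")))

# Write incrementally; the checkpoint makes the query restartable.
query = (hourly.writeStream
               .outputMode("append")
               .format("parquet")
               .option("path", "s3://example-curated-bucket/hourly-revenue/")
               .option("checkpointLocation",
                       "s3://example-curated-bucket/_checkpoints/hourly-revenue/")
               .start())
```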
Other
- Excellent communication skills.
- Ability to train and lead developers on the team to close gaps in any of the skills above.
- Willingness to work from the office five days a week.