The company is seeking a Python Developer to support data ingestion, validation, and performance optimization within the US healthcare domain.
Requirements
- Strong hands-on experience with Python, PySpark, and SQL
- Experience working on Cloudera Hadoop or similar platforms
- Knowledge of HDFS, HBase, and Hive
- Experience with unit testing and version control systems (Git, Subversion)
- Solid understanding of data engineering principles
- Java experience with Spring Boot, Drools, or similar rule engines
- Experience handling multiple data formats: JSON, YAML, CSV, tab-delimited files
Responsibilities
- Develop tools and techniques to improve data performance and process efficiency
- Review, validate, and test data for accuracy and integrity prior to loading into the data lake
- Perform troubleshooting and root-cause analysis of data and application issues
- Support and deliver business priorities in a SAFe Agile environment
Other
- 3–8 years of hands-on experience building data pipelines, analytics solutions, and web applications
- Preferably within the US healthcare domain
- Ability to work in a Scaled Agile Framework (SAFe) environment
- Familiarity with Jira and Confluence
- Experience with CI/CD tools such as Bamboo and deployment preparation