Build data pipelines using Python and PySpark to import and sync data from various sources, clean and standardize data, and fix data quality issues. Collect, manage, and convert raw data into usable information for clients, data scientists, and business analysts to interpret. Build processes for users and data owners to interact with their data in platform, using a variety of technologies, including Python, TypeScript, and JavaScript. Develop innovative solutions to complex data-driven problems.
Requirements
- Experience with Python
- Experience with data integration and management
- Knowledge of fundamental principles of programming or web development
- Experience with Palantir Foundry
- Experience with R
- Experience with JavaScript, TypeScript, HTML, and CSS
- Experience with version control systems, including GIT
- Knowledge of PySpark
Responsibilities
- Build data pipelines using Python and PySpark.
- Import and sync to various data sources, clean and standardize data elements, and fix data quality issues.
- Collect, manage, and convert raw data into usable information for clients, data scientists, and business analysts to interpret.
- Use code authoring tools in designated platforms to make data accessible so that organizations can use it to evaluate and optimize policy.
- Build processes for users and data owners to interact with their data in platform, using a variety of technologies, including Python, TypeScript, and JavaScript.
- Develop innovative solutions to complex data-driven problems and operate with substantial latitude for unreviewed action or decision.
Other
- 2+ years of experience working in a professional environment
- Ability to obtain and maintain a Public Trust or Suitability/Fitness determination based on client requirements
- Experience working in a public health environment
- Knowledge of Agile and Scrum processes
- Applicants selected will be subject to a government investigation and may need to meet eligibility requirements of the U.S. government client.