Veeva Systems is looking for a Senior Data Scientist to join their US Veeva OpenData Product team to build complex methods, analytics, and data that support Veeva's Reference Data product, OpenData. The role will execute on the roadmap for data improvements, evaluate discrepancies, and share critical insights to inform customer and product improvement efforts, ultimately contributing to the success of Veeva's customers and the product itself.
Requirements
- Ability to use PySpark to code against large datasets
- JupyterLab experience
- AWS knowledge is essential
- 10+ years of data science experience within life sciences, specifically in data integration, data management, and advanced analytics applied to RWE data
Responsibilities
- Lead critical projects that contribute to the underlying OpenData reference and affiliation data assets by leveraging your advanced data science programming and analytics skills
- Leverage your deep domain knowledge in how providers and facilities are identified, tracked, and represented in data to infuse OpenData with the most accurate possible representation of location and relationship
- Partner with the OpenData Engineering team to create a scaled web scraping framework that enables Veeva OpenData to track providers and facilities
- Collaborate with internal stakeholders across Veeva’s product portfolio to assist with interoperability between OpenData and other products
- Maintain detailed documentation of all developments that are incorporated into the Veeva OpenData pipeline
Other
- Excellent verbal and written communication skills, with the ability to clearly convey technical information to non-technical stakeholders
- Strong analytical skills with the ability to quickly identify issues and work collaboratively to develop solutions
- Experience in the Life Sciences industry is a plus, especially with respect to commercial data systems, RWE, and provider and facility reference and affiliation data