The University of Wisconsin School of Medicine and Public Health is looking to leverage real-world data, including Electronic Health Record (EHR) data, to support groundbreaking data-driven research and generate real-world evidence to improve patient health outcomes.
Requirements
- Extensive experience in NLP and machine learning with a focus on healthcare data, including expertise across the NLP spectrum from classic libraries (e.g., spaCy, NLTK) to modern LLM frameworks (e.g., Hugging Face Transformers, LangChain).
- Proficiency in Python and deep learning frameworks (PyTorch/PyTorch Lightning or TensorFlow/Keras).
- Hands-on experience navigating and analyzing complex Electronic Health Record (EHR) data, including both structured fields (e.g., diagnoses, labs) and unstructured clinical notes.
- Knowledge of medical terminologies and coding systems (e.g., ICD-10, CPT), healthcare data standards (e.g., HL7, FHIR), and regulations (e.g., HIPAA).
- A strong foundation in biostatistics, particularly as it applies to observational research, study design, and causal inference.
- Experience packaging data science solutions into reproducible tools, APIs, or applications for use by a broader research community.
- A commitment to staying current with the latest advancements in AI, NLP, and healthcare analytics, and the ability to propose and implement innovative solutions to complex research challenges.
Responsibilities
- Ensure data quality and integrity through rigorous preprocessing, cleaning, and validation techniques.
- Conduct exploratory data analysis to uncover insights and inform model development.
- Design, develop, validate, and deploy advanced machine learning and NLP models (including LLMs) to extract critical insights from unstructured EHR data, ensuring the reliability and accuracy of these models for research environments.
- Build predictive models that use data extracted from clinical notes to forecast disease risk and progression, health outcomes, and treatment effectiveness, and translate model outputs into actionable scientific and clinical insights.
- Collaborate with cross-functional teams, including data engineers, clinicians, and product managers, to integrate NLP solutions into healthcare research.
- Communicate results and insights effectively to both technical and non-technical stakeholders.
- Document methodologies, processes, and findings comprehensively.
Other
- The ideal candidate is curious, self-motivated, and able to work with minimal supervision.
- The selected candidate must be committed to supporting our research community and upholding SMPH core values of respect, integrity, teamwork, excellence, leadership, innovation, collaboration, accountability, and communication.
- Strong analytical, critical-thinking, and problem-solving skills, with a demonstrated ability to work both independently and in a collaborative team environment.
- Excellent written, verbal, and presentation communication skills, evidenced by a strong track record of scientific publications.
- Detail-oriented with a commitment to data accuracy and integrity.