Sr Data Scientist I

Advarra

$130,000 - $168,000

Oct 24, 2025

Remote, US

Advarra is looking to optimize, evaluate, and operationalize advanced machine learning models within its Braid platform to enhance precision, recall, and contextual relevance across clinical and operational data, thereby accelerating trials and improving the clinical research ecosystem.

Requirements

5+ years of hands-on experience developing and fine-tuning ML or LLM models
Demonstrated expertise in Python, with experience in and knowledge of a commercial framework like PyTorch
Hands-on experience developing, managing, and troubleshooting workflows within Databricks for data engineering, analytics, and machine learning projects
Documented strong understanding of the ML lifecycle
Experience with embeddings and retrieval-augmented generation (RAG)
Hands-on fluency in Databricks notebooks for exploratory analysis, model development, and workflow orchestration
Experience with causal inference, simulation modeling, or graph-based reasoning applied to clinical development or biomedical research

Responsibilities

Focus on understanding existing models, assessing their performance, selecting optimal architectures, and fine-tuning them to meet specific domain and business needs—including retrieval-augmented generation (RAG) based applications.
Optimize and fine-tune large language models (LLMs) and domain-specific variants using proprietary datasets to achieve precision and recall targets that drive differentiated customer value.
Evaluate model performance across key metrics and benchmarks, identifying strengths, weaknesses, and opportunities for improvement across predictive, generative, and retrieval-augmented tasks.
Implement and operationalize LLM-based and retrieval-augmented generation (RAG) systems that enhance Braid-powered products such as Study Design and Site Feasibility.
Collaborate with data engineering to ensure scalable, efficient model training, evaluation, and deployment pipelines using Databricks, MLflow, and Delta Lake.
Assess and select models—open-source or proprietary—that best align with domain-specific requirements and Advarra’s regulated research environment.
Conduct model interpretability and bias analyses to ensure fairness, transparency, and compliance with governance standards.

Other

Collaborate closely with data engineering, product, and domain teams to translate real-world research challenges into scalable, model-driven solutions that accelerate Advarra’s vision of a digitally connected research data and technology fabric.
Partner with clinical and operational experts to translate research and trial challenges into measurable model evaluation frameworks and optimization strategies.
Document methodologies and validation results to support internal governance, reproducibility, and audit readiness.
Contribute to reusable fine-tuning workflows, evaluation frameworks, and model monitoring pipelines within the Braid AI stack.
Stay at the forefront of advancements in LLM optimization, retrieval augmentation, and multi-modal learning, applying new methods that improve scalability, explainability, and cost efficiency.