Lilly is looking to build a next-generation platform that leverages agentic AI and LLMs to accelerate drug discovery, from molecular hypothesis generation to experimental design optimization.
Requirements
- Proven experience as a senior software engineer or tech lead, with a strong foundation in backend architecture, microservices, and distributed systems—ideally in scientific computing platforms supporting multi-omics data or high-throughput biological assays.
- Expertise in Python and proficiency in one or more of the following: Node.js, Go, Java, or Rust. Experience with scientific computing libraries (e.g., NumPy, SciPy, pandas, Biopython) and omics analysis frameworks is highly desirable.
- Hands-on experience building or scaling platforms in cloud-native environments (AWS, GCP, or Azure) using container orchestration (Kubernetes) and infrastructure-as-code tools like Terraform—preferably for computationally intensive biological workflows such as genome assembly or protein structure prediction.
- Familiarity with frontend development using React or similar frameworks, ideally applied to scientific data visualization or interactive analytics for complex biological datasets.
- Strong interest in applying software engineering to scientific discovery, including areas such as:
  - LLM-powered scientific reasoning for hypothesis generation, literature mining, and protocol optimization
  - AI-driven target identification and CRISPR screening analysis
  - Real-time experimental design optimization using active learning
Responsibilities
- Design, implement, and scale key components of an agentic AI platform supporting drug discovery workflows, including target-disease association discovery, compound-protein interaction prediction, and multi-omics biomarker identification.
- Build modular backend services and orchestration layers that can support LLM-powered literature mining, real-time experimental planning, and tool-use chains over scientific data including single-cell RNA-seq, spatial transcriptomics, CRISPR screens, and proteomics datasets.
- Collaborate with AI scientists and computational biologists to integrate LLM frameworks (e.g., LangTorch, Semantic Kernel) with structured biological data sources to enable reasoning across multi-omics, assay data, protein-protein interaction networks, metabolic pathways, and experimental results from high-throughput screens.
- Develop intelligent interfaces using React or similar frameworks to support interactive, AI-guided workflows for target identification, pathway enrichment analysis, drug-target network exploration, CRISPR hit validation, and automated experimental protocol generation.
- Ensure data integrity, security, and traceability across workflows that handle omics, compound, and assay data as well as sensitive biological datasets, including patient-derived samples, with proper provenance tracking for regulatory compliance.
- Lead development of scalable services deployed via Kubernetes and Terraform in cloud environments (AWS, GCP) optimized for high-throughput computational biology workloads including genome-wide association studies and molecular dynamics simulations.
- Apply CI/CD, test automation, and observability to enable robust, maintainable deployment pipelines for scientific computing environments supporting real-time experimental feedback and automated hypothesis testing.
Other
- Strong communication skills and a collaborative mindset for partnering with cross-functional teams including computational biologists, structural biologists, chemical biologists, and lab scientists.
- Intellectual curiosity and a growth mindset—especially an eagerness to deepen your understanding of systems biology, experimental design, and AI applications in drug discovery.
- Bachelor's or Master's degree in Computer Science, Software Engineering, Bioinformatics, Computational Biology, Systems Biology, or a related technical field with coursework in molecular biology, genetics, or biochemistry.
- 7+ years of experience in software engineering, with a track record of delivering robust, scalable platforms and demonstrated experience in scientific computing environments supporting biological data analysis or experimental workflows.
- Demonstrated ability to lead engineering projects from architecture to production, ideally in scientific or research environments involving complex biological datasets and multi-step experimental protocols.