Biomedical Question Answering using a 'Farm' of Open Large Language Models

Panou, Dimitra; Dimopoulos, Alexandros; Reczko, Martin

doi:10.5281/zenodo.14621334

JOURNAL ARTICLE

Year: 2025
Journal: Zenodo (CERN European Organization for Nuclear Research)
Publisher: European Organization for Nuclear Research

Abstract

Biomedical text mining and question answering are essential and complex tasks driven by the need to access and process the ever-expanding volume of biomedical data. With the exponential growth of published biomedical literature, effective retrieval and accurate question-answering systems are crucial for researchers, clinicians, and medical experts to make well-informed decisions. The emergence of open-source Large Language Models (LLMs) marks a significant trend in the tech landscape, with these models increasingly tailored to address diverse tasks. In this work, we present our participation in the twelfth edition of the BioASQ challenge, which involves biomedical semantic question answering for Task 12b and biomedical question answering for developing topics for the Synergy task. We deploy a selection of open-source LLMs for embedding and retrieval of documents and snippets, as well as retrieval-augmented generators to answer biomedical questions. Dense retrieval methods, which rank documents by the distance between dense representations of documents and questions obtained from LLM embeddings, and hybrid sparse/dense approaches both outperform traditional sparse retrieval methods in terms of mean average precision. We also implement a 'farm' of open-source LLMs to provide exact answers to biomedical Yes/No questions. A variety of models process the prompts, and a majority voting system combines their outputs to determine the final answer. Ideal answers, summarizing the most relevant information for each question type, are generated by the MIXTRAL LLM. In the four rounds of the 2024 BioASQ challenge, our system achieved notable results. In the Synergy task, it placed 1st and 2nd in two rounds for 'exact answers', 2nd in one round for 'documents', and 2nd in one round for 'ideal answers'. In Task B, it placed 1st in one round for 'snippets'.
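
The two mechanisms named in the abstract, ranking documents by similarity between dense embeddings and combining Yes/No answers by majority vote, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the use of cosine similarity as the distance measure, the toy vectors, and the rule that a tie resolves to "no" are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def rank_documents(question_vec, doc_vecs):
    """Return document ids ordered from most to least similar to the question.

    doc_vecs maps a document id to its embedding. In a real system the
    embeddings would come from an LLM embedding model (assumed here).
    """
    scored = [
        (cosine_similarity(question_vec, vec), doc_id)
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


def majority_vote(answers, tie_breaker="no"):
    """Combine Yes/No answers from several models by majority vote.

    Answers that are neither "yes" nor "no" after normalization are ignored.
    The tie_breaker default is an assumption, not taken from the paper.
    """
    normalized = [answer.strip().lower() for answer in answers]
    counts = Counter(a for a in normalized if a in ("yes", "no"))
    if counts["yes"] > counts["no"]:
        return "yes"
    if counts["no"] > counts["yes"]:
        return "no"
    return tie_breaker


# Toy two-dimensional embeddings (hypothetical values).
question = [1.0, 0.0]
documents = {"d1": [0.9, 0.1], "d2": [0.0, 1.0], "d3": [0.5, 0.5]}
ranking = rank_documents(question, documents)

# Hypothetical raw outputs from three models in the 'farm'.
final_answer = majority_vote(["Yes", "no", " yes "])
```

An odd number of voting models avoids ties on well-formed outputs, so the tie-breaking rule only matters when some models return an answer that cannot be parsed as Yes or No.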

Keywords:

Question answering; Variety (cybernetics); Process (computing); Selection (genetic algorithm); Language model; Task (project management); Ideal (ethics); Embedding

Topics

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Biomedical Text Mining and Ontologies

Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology

Artificial Intelligence in Healthcare and Education

Health Sciences → Medicine → Health Informatics

Related Documents

Enhancing Biomedical Question Answering with Large Language Models

Open-Domain Question Answering over Tables with Large Language Models

Knowledge Enhanced Industrial Question-Answering Using Large Language Models

Leveraging Large Language Models and Knowledge Graphs for Advanced Biomedical Question Answering Systems