JOURNAL ARTICLE

Biomedical Question Answering using a 'Farm' of Open Large Language Models

Panou, Dimitra; Dimopoulos, Alexandros; Reczko, Martin

Year: 2025
Journal: Zenodo (CERN European Organization for Nuclear Research)
Publisher: European Organization for Nuclear Research

Abstract

Biomedical text mining and question answering are essential and complex tasks driven by the need to access and process the ever-expanding volume of biomedical data. With the exponential growth of published biomedical literature, effective retrieval and accurate question-answering systems are crucial for researchers, clinicians, and medical experts to make well-informed decisions. Open-source Large Language Models (LLMs) are a significant trend, and these models are increasingly tailored to diverse tasks. In this work, we present our participation in the twelfth edition of the BioASQ challenge, which comprises biomedical semantic question answering (Task 12b) and biomedical question answering for developing topics (the Synergy task). We deploy a selection of open-source LLMs for embedding and retrieval of documents and snippets, as well as retrieval-augmented generators to answer biomedical questions. Dense retrieval methods, which rank documents by the distance between dense representations of documents and questions obtained from LLM embeddings, and hybrid sparse/dense approaches both outperform traditional sparse retrieval methods in terms of mean average precision. We also implement a 'farm' of open-source LLMs to provide exact answers to biomedical Yes/No questions: a variety of models process the prompts, and a majority voting system combines their outputs to determine the final answer. Ideal answers, which summarize the most relevant information for each question type, are generated by the MIXTRAL LLM. In the four rounds of the 2024 BioASQ challenge, our system achieved notable results. In the Synergy task, it placed 1st and 2nd in two rounds for 'exact answers', 2nd in one round for 'documents', and 2nd in one round for 'ideal answers'. In Task B, it placed 1st in one round for 'snippets'.
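
The majority-voting step for Yes/No questions can be illustrated with a minimal sketch. The abstract does not specify how model replies are parsed or how ties are resolved, so the function names, the first-word parsing rule, and the `tie_break` parameter below are assumptions made for illustration only, not the authors' implementation.

```python
import re
from collections import Counter


def normalize_answer(raw):
    """Map a raw model reply to 'yes' or 'no'; return None if unrecognized.

    Assumption: the first alphabetic word of the reply carries the verdict.
    """
    words = re.findall(r"[a-z]+", raw.lower())
    first = words[0] if words else ""
    return first if first in ("yes", "no") else None


def majority_vote(raw_answers, tie_break="yes"):
    """Combine the replies of several models into one Yes/No answer.

    Unparseable replies are ignored. The tie-breaking default is a
    hypothetical choice; the source does not describe one.
    """
    votes = Counter(
        answer
        for answer in map(normalize_answer, raw_answers)
        if answer is not None
    )
    if votes["yes"] > votes["no"]:
        return "yes"
    if votes["no"] > votes["yes"]:
        return "no"
    return tie_break


# Example: three hypothetical model replies to the same question.
final_answer = majority_vote(["Yes.", "no", "YES, it does"])
```

Using an odd number of models avoids most ties, although unparseable replies can still leave the vote even, which is why the sketch keeps an explicit fallback.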

Keywords:
Question answering; Variety (cybernetics); Process (computing); Selection (genetic algorithm); Language model; Task (project management); Ideal (ethics); Embedding

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.24

Topics

Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
Biomedical Text Mining and Ontologies
Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology
Artificial Intelligence in Healthcare and Education
Health Sciences → Medicine → Health Informatics