 medical decision-making


TeamMedAgents: Enhancing Medical Decision-Making of LLMs Through Structured Teamwork

arXiv.org Artificial Intelligence

Building upon Salas et al.'s "Big Five" teamwork model, we operationalize five core components as independently configurable mechanisms: shared mental models, team leadership, team orientation, trust networks, and mutual monitoring. Our architecture dynamically recruits 2-4 specialist agents and employs structured four-phase deliberation with adaptive component selection. Evaluation across eight medical benchmarks encompassing 11,545 questions demonstrates that TeamMedAgents achieves 77.63% overall accuracy (text-based: 81.30%, vision-language: 66.60%). Systematic ablation studies comparing three single-agent baselines (Zero-Shot, Few-Shot, CoT) against individual teamwork components reveal task-specific optimization patterns: shared mental models excel on knowledge tasks, trust mechanisms improve differential diagnosis, while comprehensive integration degrades performance. Adaptive component selection yields 2-10 percentage point improvements over the strongest baselines, with 96.2% agent convergence validating the effectiveness of structured coordination. TeamMedAgents establishes a principled methodology for translating human teamwork theory into multi-agent systems, demonstrating that evidence-based collaboration patterns enhance AI performance in safety-critical domains through modular component design and selective activation strategies.


Mediator-Guided Multi-Agent Collaboration among Open-Source Models for Medical Decision-Making

arXiv.org Artificial Intelligence

Complex medical decision-making involves cooperative workflows operated by different clinicians. Designing AI multi-agent systems can expedite and augment human-level clinical decision-making. Existing multi-agent research primarily focuses on language-only tasks, yet its extension to multimodal scenarios remains challenging. A blind combination of diverse vision-language models (VLMs) can amplify an erroneous outcome interpretation. VLMs in general are less capable of instruction following and, importantly, self-reflection than large language models (LLMs) of comparable sizes. This disparity largely constrains VLMs' ability in cooperative workflows. In this study, we propose MedOrch, a mediator-guided multi-agent collaboration framework for medical multimodal decision-making. MedOrch employs an LLM-based mediator agent that enables multiple VLM-based expert agents to exchange and reflect on their outputs towards collaboration. We utilize multiple open-source general-purpose and domain-specific VLMs instead of costly GPT-series models, revealing the strength of heterogeneous models. We show that the collaboration within distinct VLM-based agents can surpass the capabilities of any individual agent. We validate our approach on five medical vision question answering benchmarks, demonstrating superior collaboration performance without model training. Our findings underscore the value of mediator-guided multi-agent collaboration in advancing medical multimodal intelligence.


A Knowledge-driven Adaptive Collaboration of LLMs for Enhancing Medical Decision-making

arXiv.org Artificial Intelligence

Medical decision-making often involves integrating knowledge from multiple clinical specialties, typically achieved through multidisciplinary teams. Inspired by this collaborative process, recent work has leveraged large language models (LLMs) in multi-agent collaboration frameworks to emulate expert teamwork. While these approaches improve reasoning through agent interaction, they are limited by static, pre-assigned roles, which hinder adaptability and dynamic knowledge integration. To address these limitations, we propose KAMAC, a Knowledge-driven Adaptive Multi-Agent Collaboration framework that enables LLM agents to dynamically form and expand expert teams based on the evolving diagnostic context. KAMAC begins with one or more expert agents and then conducts a knowledge-driven discussion to identify and fill knowledge gaps by recruiting additional specialists as needed. This supports flexible, scalable collaboration in complex clinical scenarios, with decisions finalized through reviewing updated agent comments. Experiments on two real-world medical benchmarks demonstrate that KAMAC significantly outperforms both single-agent and advanced multi-agent methods, particularly in complex clinical scenarios (i.e., cancer prognosis) requiring dynamic, cross-specialty expertise. Our code is publicly available at: https://github.com/XiaoXiao-Woo/KAMAC.
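The recruit-as-needed loop described in this abstract can be illustrated with a small sketch. It is not the KAMAC implementation: in the paper, knowledge gaps are identified through LLM discussion, whereas here each expert simply carries a fixed set of topics. The names `Expert` and `recruit_until_covered` and the `max_team_size` limit are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Expert:
    specialty: str
    topics: frozenset  # assumption: stands in for what an LLM agent "knows"


def recruit_until_covered(required, team, pool, max_team_size=5):
    """Grow the team greedily until every required topic is covered.

    Returns the final team and any topics that remain uncovered.
    """
    team = list(team)
    covered = set().union(*(e.topics for e in team))
    gaps = set(required) - covered
    while gaps and len(team) < max_team_size:
        candidates = [e for e in pool if e not in team]
        # Pick the candidate that closes the most remaining gaps.
        best = max(candidates, key=lambda e: len(e.topics & gaps), default=None)
        if best is None or not (best.topics & gaps):
            break  # nobody left who can help
        team.append(best)
        gaps -= best.topics
    return team, gaps


initial = [Expert("radiologist", frozenset({"imaging"}))]
pool = [
    Expert("oncologist", frozenset({"oncology", "genetics"})),
    Expert("cardiologist", frozenset({"cardiology"})),
]
team, gaps = recruit_until_covered({"imaging", "oncology", "genetics"}, initial, pool)
print([e.specialty for e in team], gaps)
```

In this toy run the radiologist is joined by the oncologist, and the cardiologist is never recruited because cardiology is not among the required topics.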


MedGellan: LLM-Generated Medical Guidance to Support Physicians

arXiv.org Artificial Intelligence

Medical decision-making is a critical task, where errors can result in serious, potentially life-threatening consequences. While full automation remains challenging, hybrid frameworks that combine machine intelligence with human oversight offer a practical alternative. In this paper, we present MedGellan, a lightweight, annotation-free framework that uses a Large Language Model (LLM) to generate clinical guidance from raw medical records, which is then used by a physician to predict diagnoses. MedGellan uses a Bayesian-inspired prompting strategy that respects the temporal order of clinical data. Preliminary experiments show that the guidance generated by the LLM with MedGellan improves diagnostic performance, particularly in recall and $F_1$ score.


MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making

Neural Information Processing Systems

Foundation models are becoming valuable tools in medicine. Yet despite their promise, the best way to leverage Large Language Models (LLMs) in complex medical tasks remains an open question. We introduce a novel multi-agent framework, named **M**edical **D**ecision-making **Agents** (**MDAgents**), that helps to address this gap by automatically assigning a collaboration structure to a team of LLMs. The assigned solo or group collaboration structure is tailored to the medical task at hand, a simple emulation inspired by the way real-world medical decision-making processes are adapted to tasks of different complexities. We evaluate our framework and baseline methods using state-of-the-art LLMs across a suite of real-world medical knowledge and clinical diagnosis benchmarks, including a comparison of LLMs' medical complexity classification against human physicians.


A Demonstration of Adaptive Collaboration of Large Language Models for Medical Decision-Making

arXiv.org Artificial Intelligence

Medical Decision-Making (MDM) is a multi-faceted process that requires clinicians to assess complex multi-modal patient data, often collaboratively. Large Language Models (LLMs) promise to streamline this process by synthesizing vast medical knowledge and multi-modal health data. However, single-agent approaches are often ill-suited for nuanced medical contexts requiring adaptable, collaborative problem-solving. Our MDAgents addresses this need by dynamically assigning collaboration structures to LLMs based on task complexity, mimicking real-world clinical collaboration and decision-making. This framework improves diagnostic accuracy and supports adaptive responses in complex, real-world medical scenarios, making it a valuable tool for clinicians in various healthcare settings while also being more efficient in computing cost than static multi-agent decision-making methods.
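A minimal sketch of complexity-based routing follows, to make the idea of assigning a solo or group structure per task concrete. It is an illustration under stated assumptions, not the MDAgents code: the complexity labels, the team sizes, and the majority-vote aggregation are all placeholders chosen here, and the "agents" are plain callables standing in for LLM calls.

```python
from collections import Counter

# Assumption: three complexity labels mapped to hypothetical team sizes.
TEAM_SIZE = {"low": 1, "moderate": 3, "high": 5}


def decide(question, complexity, agents):
    """Query only as many agents as the complexity label calls for,
    then aggregate their answers by simple majority vote (a stand-in
    for whatever discussion protocol a real system would use)."""
    n = TEAM_SIZE[complexity]
    answers = [agent(question) for agent in agents[:n]]
    return Counter(answers).most_common(1)[0][0]


# Stub agents that always return a fixed answer.
agents = [lambda q: "A", lambda q: "B", lambda q: "B"]

print(decide("toy question", "low", agents))       # one agent is consulted
print(decide("toy question", "moderate", agents))  # three agents vote
```

Because a low-complexity question consults a single agent, it costs one call instead of several, which is the efficiency argument the abstract makes against static multi-agent setups.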


Investigation of Customized Medical Decision Algorithms Utilizing Graph Neural Networks

arXiv.org Artificial Intelligence

To address the limitations of traditional medical decision systems in processing large-scale heterogeneous medical data and delivering highly personalized recommendations, this paper introduces a personalized medical decision algorithm utilizing a graph neural network (GNN). This research integrates graph neural network technology into the medical and health field, aiming to build a high-precision representation model of patient health status by mining the complex associations among patients' clinical characteristics, genetic information, and living habits. In this study, medical data is preprocessed to transform it into a graph structure, where nodes represent different data entities (such as patients, diseases, genes, etc.) and edges represent interactions or relationships between entities. The core of the algorithm is a novel multi-scale fusion mechanism that combines the historical medical records, physiological indicators, and genetic characteristics of patients to dynamically adjust the attention allocation strategy of the graph neural network, so as to achieve highly customized analysis of individual cases. In the experimental part, this study selected several publicly available medical data sets for validation, and the results showed that, compared with traditional machine learning methods and a single graph neural network model, the proposed personalized medical decision algorithm showed significantly superior performance in terms of disease prediction accuracy, treatment effect evaluation, and patient risk stratification.
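The preprocessing step described here, turning records into a graph of patients, diseases, and genes, might look like the sketch below. The record layout, the field names, and the `build_graph` helper are assumptions made for illustration; the paper's actual schema and its attention-based GNN are not reproduced.

```python
from collections import defaultdict


def build_graph(records):
    """Build an undirected graph as an adjacency dict.

    Nodes are (kind, name) tuples; an edge links a patient to each
    disease and gene listed in that patient's record.
    """
    adjacency = defaultdict(set)
    for rec in records:
        patient = ("patient", rec["patient"])
        adjacency[patient]  # ensure patients with no links still appear
        for kind, key in (("disease", "diseases"), ("gene", "genes")):
            for name in rec.get(key, []):
                node = (kind, name)
                adjacency[patient].add(node)
                adjacency[node].add(patient)
    return dict(adjacency)


# Hypothetical toy records.
records = [
    {"patient": "p1", "diseases": ["diabetes"], "genes": ["TCF7L2"]},
    {"patient": "p2", "diseases": ["diabetes"]},
]
graph = build_graph(records)
print(sorted(graph[("disease", "diabetes")]))
```

Two patients who share a diagnosis become two hops apart through the disease node, which is the kind of association a message-passing GNN can then exploit.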


The top 5 technologies that will change health care over the next decade

#artificialintelligence

The past decade was about the rise of digital health technology and patient empowerment. The next decade will be about artificial intelligence, the use of health sensors and the so-called Internet of Healthy Things and how it could improve millions of lives. The cultural transformation of health care we call digital health has been changing the hierarchy in care into an equal-level partnership between patients and physicians as 21st century technologies have started breaking down the ivory tower of medicine. But these milestones are nothing compared with what is about to become reality. With advancements in exoskeleton technology, AI's ever-increasing importance in health care, and technologies like 5G and quantum computing soon going mainstream, there's much to be excited about.


On the ethics of algorithmic decision-making in healthcare

#artificialintelligence

Machine learning is increasingly being conceived as a technology with the potential to transform professional healthcare. Recently, there has been a surge of interest in machine learning for medical decision-making (reviewed by Esteva and Topol [1, 2]), fuelled by a series of studies demonstrating 'expert-level' accuracy of machine learning algorithms, for example, in diagnosing eye diseases from fundus images [3] and different types of skin cancer from images of skin lesions [4]. Moreover, a study by Walsh and colleagues found that machine learning algorithms managed to predict the risk of imminent suicide attempts with high accuracy based on a large repository of clinical electronic health data (Walsh, p. 460) [5]. In contrast, for clinicians, the ability to predict suicide attempts has been near chance for decades. Hence, machine learning algorithms promise to enhance the diagnostic as well as the predictive abilities of clinicians by assessing health risks of individual patients based on complex diagnostic data sets.


Google Research into Concept Vectors for Image Search

#artificialintelligence

Google recently released research about a tool called Similar Medical Images Like Yours (SMILY) that uses concept vectors to enhance searching for medical images. The research uses embeddings for image-based search and allows users to influence the search through the interactive refinement of concepts. Google released two papers in succession. The first paper, "Similar image search for histopathology: SMILY" focused on the deep neural network architecture that was used to create the embeddings necessary to find similar images. The second paper, "Human-Centered Tools for Coping with Imperfect Algorithms During Medical Decision-Making," focused on human interaction aspects necessary to improve the usage of the tool created in the first paper.