Clinical Application


Shape Deformation Networks for Automated Aortic Valve Finite Element Meshing from 3D CT Images

Qian, Linchen, Chen, Jiasong, Gong, Ruonan, Sun, Wei, Liu, Minliang, Liang, Liang

arXiv.org Artificial Intelligence

Accurate geometric modeling of the aortic valve from 3D CT images is essential for biomechanical analysis and patient-specific simulations to assess valve health or support preoperative planning. However, it remains challenging to generate aortic valve meshes with both high quality and consistency across different patients. Traditional approaches often produce triangular meshes with irregular topologies, which can result in poorly shaped elements and inconsistent correspondence due to inter-patient anatomical variation. In this work, we address these challenges by introducing a template-fitting pipeline with deep neural networks to generate structured quad (i.e., quadrilateral) meshes from 3D CT images to represent aortic valve geometries. By remeshing the aortic valves of all patients with a common quad mesh template, we ensure a uniform mesh topology with consistent node-to-node and element-to-element correspondence across patients. This consistency allows us to simplify the learning objective of the deep neural networks by employing a loss function with only two terms (i.e., a geometry reconstruction term and a smoothness regularization term), which is sufficient to preserve mesh smoothness and element quality. Our experiments demonstrate that the proposed approach produces high-quality aortic valve surface meshes with improved smoothness and shape quality, while requiring fewer explicit regularization terms than traditional methods. These results highlight that using structured quad meshes for the template and neural network training not only ensures mesh correspondence and quality but also simplifies the training process, thus enhancing the effectiveness and efficiency of aortic valve modeling.
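The two-term objective described above can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the function name, the use of per-node MSE for the geometry term (valid because the template guarantees node-to-node correspondence), and the uniform edge-based Laplacian for the smoothness term are all illustrative assumptions.

```python
import numpy as np

def quad_mesh_loss(pred, target, quads, lam=0.1):
    """Sketch of a two-term loss: geometry reconstruction + smoothness.
    pred, target: (N, 3) node coordinates; template fitting guarantees
    node-to-node correspondence, so a plain per-node MSE suffices.
    quads: (M, 4) integer node indices of each quadrilateral element.
    lam: weight of the smoothness regularization term (illustrative value).
    """
    # Geometry reconstruction term: mean squared node displacement.
    geo = np.mean(np.sum((pred - target) ** 2, axis=1))

    # Uniform Laplacian built from quad edges (each quad contributes 4 edges).
    n = pred.shape[0]
    nbr_sum = np.zeros_like(pred)
    deg = np.zeros(n)
    for q in quads:
        for a, b in zip(q, np.roll(q, -1)):
            nbr_sum[a] += pred[b]
            nbr_sum[b] += pred[a]
            deg[a] += 1
            deg[b] += 1
    lap = pred - nbr_sum / np.maximum(deg, 1)[:, None]

    # Smoothness regularization term: mean squared Laplacian magnitude.
    smooth = np.mean(np.sum(lap ** 2, axis=1))
    return geo + lam * smooth
```

Because the template fixes the topology, no additional terms (e.g., edge-length or normal-consistency penalties) are needed, which is the simplification the abstract emphasizes.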


Large Connectome Model: An fMRI Foundation Model of Brain Connectomes Empowered by Brain-Environment Interaction in Multitask Learning Landscape

Wei, Ziquan, Dan, Tingting, Wu, Guorong

arXiv.org Artificial Intelligence

A reliable foundation model of functional neuroimages is critical for promoting clinical applications, where the performance of current AI models is significantly impeded by limited sample sizes. To that end, tremendous efforts have been made to pretrain large models on extensive unlabeled fMRI data using scalable self-supervised learning. Since self-supervision is not necessarily aligned with the brain-to-outcome relationship, most foundation models are suboptimal for downstream tasks such as predicting disease outcomes. By capitalizing on rich environmental variables and demographic data along with an unprecedented amount of functional neuroimages, we formulate brain modeling as multitask learning and present a scalable model architecture for (i) multitask pretraining by tokenizing multiple brain-environment interactions (BEI) and (ii) semi-supervised finetuning by assigning pseudo-labels of pretrained BEI. We have evaluated our foundation model on a variety of applications, including sex prediction, human behavior recognition, and early diagnosis of Autism, Parkinson's disease, Alzheimer's disease, and Schizophrenia, where promising results indicate its great potential to facilitate current neuroimaging applications in clinical routines.


Taxonomy of Comprehensive Safety for Clinical Agents

Seo, Jean, Lee, Hyunkyung, Kim, Gibaeg, Han, Wooseok, Yoo, Jaehyo, Lim, Seungseop, Shin, Kihun, Yang, Eunho

arXiv.org Artificial Intelligence

Safety is a paramount concern in clinical chatbot applications, where inaccurate or harmful responses can lead to serious consequences. Existing methods--such as guardrails and tool calling--often fall short in addressing the nuanced demands of the clinical domain. In this paper, we introduce TACOS (TAxonomy of COmprehensive Safety for Clinical Agents), a fine-grained, 21-class taxonomy that integrates safety filtering and tool selection into a single user intent classification step. TACOS covers a wide spectrum of clinical and non-clinical queries, explicitly modeling varying safety thresholds and external tool dependencies. To validate our taxonomy, we curate a TACOS-annotated dataset and perform extensive experiments. Our results demonstrate the value of a new taxonomy specialized for clinical agent settings, and reveal useful insights about training data distribution and the pretrained knowledge of base models.


Enhancing Clinical Text Classification via Fine-Tuned DRAGON Longformer Models

Yang, Mingchuan, Huang, Ziyuan

arXiv.org Artificial Intelligence

This study explores the optimization of the DRAGON Longformer base model for clinical text classification, specifically targeting the binary classification of medical case descriptions. A dataset of 500 clinical cases containing structured medical observations was used, with 400 cases for training and 100 for validation. Enhancements to the pre-trained joeranbosma/dragon-longformer-base-mixed-domain model included hyperparameter tuning, domain-specific preprocessing, and architectural adjustments. Key modifications involved increasing sequence length from 512 to 1024 tokens, lowering the learning rate from 1e-05 to 5e-06, extending training from 5 to 8 epochs, and incorporating specialized medical terminology. The optimized model achieved notable performance gains: accuracy improved from 72.0% to 85.2%, precision from 68.0% to 84.1%, recall from 75.0% to 86.3%, and F1-score from 71.0% to 85.2%. Statistical analysis confirmed the significance of these improvements (p < .001). The model demonstrated enhanced capability in interpreting medical terminology, anatomical measurements, and clinical observations. These findings contribute to domain-specific language model research and offer practical implications for clinical natural language processing applications. The optimized model's strong performance across diverse medical conditions underscores its potential for broad use in healthcare settings.

Introduction

Natural language processing (NLP) in healthcare has continued to advance rapidly, revolutionizing the ability to analyze clinical texts and automate the extraction of valuable insights from massive amounts of medical documentation (Khurana, Koli, Khatter, & Singh, 2023). Over the past few years, large language models (LLMs) have emerged as powerful tools for processing and gaining insight from clinical narratives, enabling capabilities never before seen in medical text classification, entity recognition, and clinical decision support (Wang et al., 2018). Among these models, the DRAGON (Deep Representation Analysis for General-domain Ontology Networks) framework emerged as a specialized approach to medical text processing (Bosma et al., 2025). As Beltagy, Peters, and Cohan (2020) describe, the Longformer architecture underlying the DRAGON Longformer model addresses the quadratic computational complexity of traditional transformer models when processing long sequences.
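The hyperparameter changes and metrics reported in the abstract can be summarized in a small sketch. The config field names below are illustrative (Trainer-style), not taken from the paper; the F1 helper simply checks that the reported precision and recall are consistent with the reported F1-score.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (inputs and output in percent)."""
    return 2 * precision * recall / (precision + recall)

# Reported hyperparameter changes (field names are illustrative assumptions).
baseline = {"max_length": 512, "learning_rate": 1e-5, "epochs": 5}
optimized = {"max_length": 1024, "learning_rate": 5e-6, "epochs": 8}

# The optimized precision (84.1%) and recall (86.3%) imply F1 ~ 85.2%,
# matching the reported value.
optimized_f1 = f1(84.1, 86.3)
```

Note the direction of each change: longer sequences to cover full case descriptions, a lower learning rate for gentler fine-tuning, and more epochs to compensate for the slower updates.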


A Scoping Review of Synthetic Data Generation for Biomedical Research and Applications

Rao, Hanshu, Liu, Weisi, Wang, Haohan, Huang, I-Chan, He, Zhe, Huang, Xiaolei

arXiv.org Artificial Intelligence

Synthetic data generation--mitigating data scarcity, privacy concerns, and data quality challenges in biomedical fields--has been facilitated by rapid advances in large language models (LLMs). This scoping review follows PRISMA-ScR guidelines and synthesizes 59 studies, published between 2020 and 2025 and collected from PubMed, ACM, Web of Science, and Google Scholar. The review systematically examines biomedical research and application trends in synthetic data generation, emphasizing clinical applications, methodologies, and evaluations. Our analysis identifies data modalities of unstructured texts (78.0%), tabular data (13.6%), and multimodal sources (8.4%); generation methods of prompting (72.9%) and fine-tuning (22.0%) LLMs and specialized models (5.1%); and heterogeneous evaluations of intrinsic metrics (27.1%), human-in-the-loop assessments (55.9%), and LLM-based evaluations (13.6%). The analysis addresses current limitations in what, where, and how health professionals can leverage synthetic data generation for biomedical domains. Our review also highlights challenges in adaptation across clinical domains, resource and model accessibility, and evaluation standardization.


An Inclusive Foundation Model for Generalizable Cytogenetics in Precision Oncology

Yang, Changchun, Dai, Weiqian, Zhang, Yilan, Chen, Siyuan, Hu, Jingdong, Su, Junkai, Chen, Yuxuan, Xu, Ao, Li, Na, Gao, Xin, Yu, Yongguo

arXiv.org Artificial Intelligence

Chromosome analysis is vital for diagnosing genetic disorders and guiding cancer therapy decisions through the identification of somatic clonal aberrations. However, developing AI models is hindered by the overwhelming complexity and diversity of chromosomal abnormalities, which requires extensive annotation effort, while automated methods remain task-specific and lack generalizability due to the scarcity of comprehensive datasets spanning diverse resource conditions. Here, we introduce CHROMA, a foundation model for cytogenomics, designed to overcome these challenges by learning generalizable representations of chromosomal abnormalities. Pre-trained on over 84,000 specimens (~4 million chromosomal images) via self-supervised learning, CHROMA outperforms other methods across all types of abnormalities, even when trained on less labelled data and more imbalanced datasets. By facilitating comprehensive mapping of instability and clonal lesions across various aberration types, CHROMA offers a scalable and generalizable solution for reliable and automated clinical analysis, reducing the annotation workload for experts and advancing precision oncology through the early detection of rare genomic abnormalities, enabling broad clinical AI applications and making advanced genomic analysis more accessible.


MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications

Kanithi, Praveen K, Christophe, Clément, Pimentel, Marco AF, Raha, Tathagata, Saadi, Nada, Javed, Hamza, Maslenkova, Svetlana, Hayat, Nasir, Rajan, Ronnie, Khan, Shadab

arXiv.org Artificial Intelligence

The rapid development of Large Language Models (LLMs) for healthcare applications has spurred calls for holistic evaluation beyond frequently-cited benchmarks like USMLE, to better reflect real-world performance. While real-world assessments are valuable indicators of utility, they often lag behind the pace of LLM evolution, likely rendering findings obsolete upon deployment. This temporal disconnect necessitates a comprehensive upfront evaluation that can guide model selection for specific clinical applications. We introduce MEDIC, a framework assessing LLMs across five critical dimensions of clinical competence: medical reasoning, ethics and bias, data and language understanding, in-context learning, and clinical safety. MEDIC features a novel cross-examination framework quantifying LLM performance across areas like coverage and hallucination detection, without requiring reference outputs. We apply MEDIC to evaluate LLMs on medical question-answering, safety, summarization, note generation, and other tasks. Our results show performance disparities across model sizes and between baseline and medically finetuned models, with implications for model selection in applications requiring specific model strengths, such as low hallucination or lower cost of inference. MEDIC's multifaceted evaluation reveals these performance trade-offs, bridging the gap between theoretical capabilities and practical implementation in healthcare settings, ensuring that the most promising models are identified and adapted for diverse healthcare applications.


Advancing oncology with federated learning: transcending boundaries in breast, lung, and prostate cancer. A systematic review

Ankolekar, Anshu, Boie, Sebastian, Abdollahyan, Maryam, Gadaleta, Emanuela, Hasheminasab, Seyed Alireza, Yang, Guang, Beauville, Charles, Dikaios, Nikolaos, Kastis, George Anthony, Bussmann, Michael, Khalid, Sara, Kruger, Hagen, Lambin, Philippe, Papanastasiou, Giorgos

arXiv.org Artificial Intelligence

Federated Learning (FL) has emerged as a promising solution to address the limitations of centralised machine learning (ML) in oncology, particularly in overcoming privacy concerns and harnessing the power of diverse, multi-center data. This systematic review synthesises current knowledge on state-of-the-art FL in oncology, focusing on breast, lung, and prostate cancer. Distinct from previous surveys, our comprehensive review critically evaluates the real-world implementation and impact of FL on cancer care, demonstrating its effectiveness in enhancing ML generalisability, performance and data privacy across clinical settings and datasets. We evaluated state-of-the-art advances in FL, demonstrating its growing adoption amid tightening data privacy regulations. FL outperformed centralised ML in 15 out of the 25 studies reviewed, spanning diverse ML models and clinical applications, and facilitating integration of multi-modal information for precision medicine. Despite the current challenges identified in reproducibility, standardisation and methodology across studies, the demonstrable benefits of FL in harnessing real-world data and addressing clinical needs highlight its significant potential for advancing cancer research. We propose that future research should focus on addressing these limitations and investigating further advanced FL methods, to fully harness data diversity and realise the transformative power of cutting-edge FL in cancer care.


Hand tracking for clinical applications: validation of the Google MediaPipe Hand (GMH) and the depth-enhanced GMH-D frameworks

Amprimo, Gianluca, Masi, Giulia, Pettiti, Giuseppe, Olmo, Gabriella, Priano, Lorenzo, Ferraris, Claudia

arXiv.org Artificial Intelligence

Accurate 3D tracking of hand and finger movements poses significant challenges in computer vision. The potential applications span multiple domains, including human-computer interaction, virtual reality, industry, and medicine. While gesture recognition has achieved remarkable accuracy, quantifying fine movements remains a hurdle, particularly in clinical applications, where the assessment of hand dysfunctions and rehabilitation training outcomes necessitates precise measurements. Several novel, lightweight Deep Learning-based frameworks have emerged to address this issue; however, their performance in accurately and reliably measuring finger movements requires validation against well-established gold-standard systems. In this paper, we aim to validate the hand-tracking framework implemented by Google MediaPipe Hand (GMH) and an innovative enhanced version, GMH-D, which exploits the depth estimation of an RGB-Depth camera to achieve more accurate tracking of 3D movements. Three dynamic exercises commonly administered by clinicians to assess hand dysfunctions, namely Hand Opening-Closing, Single Finger Tapping and Multiple Finger Tapping, are considered. Results demonstrate high temporal and spectral consistency of both frameworks with the gold standard. However, the enhanced GMH-D framework exhibits superior accuracy in spatial measurements compared to the baseline GMH, for both slow and fast movements. Overall, our study contributes to the advancement of hand-tracking technology, establishes a validation procedure as good practice for proving the efficacy of deep-learning-based hand tracking, and demonstrates the effectiveness of GMH-D as a reliable framework for assessing 3D hand movements in clinical applications.
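The abstract does not detail how GMH-D fuses the RGB landmarks with the depth stream, but a standard ingredient of such depth-enhanced pipelines is back-projecting each 2D landmark to 3D camera coordinates using the depth reading and the pinhole camera model. A minimal sketch, assuming the camera intrinsics (fx, fy, cx, cy) are known from calibration; this is an illustration of the general technique, not the GMH-D implementation:

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift a 2D pixel landmark (u, v) with a measured depth value to 3D
    camera coordinates via the pinhole model. Depth and the returned
    coordinates share the same unit (e.g., meters).
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

For example, a landmark at the principal point maps to (0, 0, depth) regardless of the intrinsics, and landmarks farther from the image center map to proportionally larger lateral offsets at the same depth.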


Watch a film through the eyes of a MOUSE: Scientists use AI to reconstruct its brain signals

Daily Mail - Science & tech

Have you ever struggled to describe something to your friend that you watched on TV last night? Soon, you might be able to project your mental images onto the big screen, as scientists have been doing so with mice. A team from École Polytechnique Fédérale de Lausanne (EPFL) developed an artificial intelligence (AI) tool that can interpret the rodents' brain signals. The algorithm, named CEBRA, was trained to map neural activity to specific frames in videos, so it could then predict and reconstruct what a mouse is looking at. The news comes shortly after researchers at the University of Texas at Austin used AI to turn people's thoughts into text in real-time.