AITopics | medsam

Collaborating Authors

medsam

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

RAM-W600: AMulti-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis

Neural Information Processing SystemsJun-19-2026, 20:16:28 GMT

Rheumatoid arthritis (RA) is a common autoimmune disease that has been the focus of research in computer-aided diagnosis (CAD) and disease monitoring. In clinical settings, conventional radiography (CR) is widely used for the screening and evaluation of RA due to its low cost and accessibility. The wrist is a critical region for the diagnosis of RA. However, CAD research in this area remains limited, primarily due to the challenges in acquiring high-quality instance-level annotations.

artificial intelligence, machine learning, segmentation, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (0.67)
Asia > Japan > Honshū > Kantō (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Rheumatology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Not Quite Anything: Overcoming SAMs Limitations for 3D Medical Imaging

Moore, Keith

arXiv.org Artificial IntelligenceNov-26-2025

Foundation segmentation models such as SAM and SAM-2 perform well on natural images but struggle with brain MRIs where structures like the caudate and thalamus lack sharp boundaries and have low contrast. Rather than fine tune these models (for example MedSAM), we propose a compositional alternative where the foundation model output is treated as an additional input channel and passed alongside the MRI to highlight regions of interest. We generate SAM-2 prompts by using a lightweight 3D U-Net that was previously trained on MRI segmentation. The U-Net may have been trained on a different dataset, so its guesses are often imprecise but usually in the correct region. The edges of the resulting foundation model guesses are smoothed to improve alignment with the MRI. We also test prompt free segmentation using DINO attention maps in the same framework. This has-a architecture avoids modifying foundation weights and adapts to domain shift without retraining the foundation model. It reaches about 96 percent volume accuracy on basal ganglia segmentation, which is sufficient for our study of longitudinal volume change. The approach is fast, label efficient, and robust to out of distribution scans. We apply it to study inflammation linked changes in sudden onset pediatric OCD.

artificial intelligence, machine learning, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2511.19471

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.70)
Information Technology > Artificial Intelligence > Vision (0.64)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.35)

Add feedback

Segment Anything in Pathology Images with Natural Language

Chen, Zhixuan, Hou, Junlin, Lin, Liqi, Wang, Yihui, Bie, Yequan, Wang, Xi, Zhou, Yanning, Chan, Ronald Cheong Kin, Chen, Hao

arXiv.org Artificial IntelligenceAug-20-2025

However, current segmentation methods encounter significant challenges in clinical applications, primarily due to the scarcity of high-quality, large-scale annotated pathology data and the constraints of fixed, narrowly defined object categories. To address these issues, this work aims to develop a segmentation foundation model capable of segmenting anything in pathology images using natural language. First, we establish PathSeg, the largest and most comprehensive dataset for pathology image semantic segmentation, derived from 21 publicly available datasets and comprising 275k image-mask-label triples. Our PathSeg dataset features a wide variety of 160 segmentation categories organized in a three-level hierarchy that covers 20 anatomical regions, 3 histological structures, and 61 object types. Next, we introduce PathSegmentor, a text-prompted foundation model tailored for pathology image segmentation. With PathSegmentor, users can achieve semantic segmentation simply by providing a descriptive text prompt for the target category, thus eliminating the need to laboriously provide numerous spatial prompts like boxes or points for each instance. Extensive experiments on both internal and external datasets demonstrate the superior segmentation performance of PathSegmentor. It outperforms the group of specialized models, effectively handling a broader range of segmentation categories while maintaining a more compact model size.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2506.20988

Country: Asia > China (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.69)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

MRI-CORE: A Foundation Model for Magnetic Resonance Imaging

Dong, Haoyu, Chen, Yuwen, Gu, Hanxue, Konz, Nicholas, Chen, Yaqian, Li, Qihang, Mazurowski, Maciej A.

arXiv.org Artificial IntelligenceJul-24-2025

The widespread use of Magnetic Resonance Imaging (MRI) in combination with deep learning shows promise for many high-impact automated diagnostic and prognostic tools. However, training new models requires large amounts of labeled data, a challenge due to high cost of precise annotations and data privacy. To address this issue, we introduce the MRI-CORE, a vision foundation model trained using more than 6 million slices from over 110 thousand MRI volumes across 18 body locations. Our experiments show notable improvements in performance over state-of-the-art methods in 13 data-restricted segmentation tasks, as well as in image classification, and zero-shot segmentation, showing the strong potential of MRI-CORE to enable data-efficient development of artificial intelligence models. We also present data on which strategies yield most useful foundation models and a novel analysis relating similarity between pre-training and downstream task data with transfer learning performance. Our model is publicly available with a permissive license. Magnetic Resonance Imaging (MRI) is one of the most widely used imaging modalities in medical diagnostics, with around 100-150 million scans performed annually worldwide (Papanicolas et al. 2018). MRI supports a wide range of clinical tasks, including lesion detection, tissue classification, and disease monitoring. Among these tasks, segmentation plays a particularly important role, as it enables precise delineation of anatomical structures and pathological regions, directly impacting diagnosis, treatment planning, and longitudinal studies (Mazurowski et al. 2023; Ma et al. 2024; Azad et al. 2024; Xu et al. 2024). Recent advances in deep learning have significantly improved the automation and accuracy of MRI-based analyses across a variety of tasks. However, deep learning-based methods typically require large amounts of manually annotated data and lack task transferability, making them difficult to scale across new tasks, anatomies, or patient populations.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.12186

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Test-time Adaptation for Foundation Medical Segmentation Model without Parametric Updates

Chen, Kecheng, Luo, Xinyu, Qin, Tiexin, Liu, Jie, Liu, Hui, Lee, Victor Ho Fun, Yan, Hong, Li, Haoliang

arXiv.org Artificial IntelligenceJul-16-2025

Foundation medical segmentation models, with MedSAM being the most popular, have achieved promising performance across organs and lesions. However, MedSAM still suffers from compromised performance on specific lesions with intricate structures and appearance, as well as bounding box prompt-induced perturbations. Although current test-time adaptation (TTA) methods for medical image segmentation may tackle this issue, partial (e.g., batch normalization) or whole parametric updates restrict their effectiveness due to limited update signals or catastrophic forgetting in large models. Meanwhile, these approaches ignore the computational complexity during adaptation, which is particularly significant for modern foundation models. To this end, our theoretical analyses reveal that directly refining image embeddings is feasible to approach the same goal as parametric updates under the MedSAM architecture, which enables us to realize high computational efficiency and segmentation performance without the risk of catastrophic forgetting. Under this framework, we propose to encourage maximizing factorized conditional probabilities of the posterior prediction probability using a proposed distribution-approximated latent conditional random field loss combined with an entropy minimization loss. Experiments show that we achieve about 3\% Dice score improvements across three datasets while reducing computational complexity by over 7 times.

machine learning, natural language, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2504.02008

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.70)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
(2 more...)

Add feedback

Reinforced Correlation Between Vision and Language for Precise Medical AI Assistant

Wang, Haonan, Mao, Jiaji, Wang, Lehan, Zhang, Qixiang, Elbatel, Marawan, Qin, Yi, Hu, Huijun, Li, Baoxun, Deng, Wenhui, Qin, Weifeng, Li, Hongrui, Liang, Jialin, Shen, Jun, Li, Xiaomeng

arXiv.org Artificial IntelligenceMay-7-2025

Medical AI assistants support doctors in disease diagnosis, medical image analysis, and report generation. However, they still face significant challenges in clinical use, including limited accuracy with multimodal content and insufficient validation in real-world settings. We propose RCMed, a full-stack AI assistant that improves multimodal alignment in both input and output, enabling precise anatomical delineation, accurate localization, and reliable diagnosis through hierarchical vision-language grounding. A self-reinforcing correlation mechanism allows visual features to inform language context, while language semantics guide pixel-wise attention, forming a closed loop that refines both modalities. This correlation is enhanced by a color region description strategy, translating anatomical structures into semantically rich text to learn shape-location-text relationships across scales. Trained on 20 million image-mask-description triplets, RCMed achieves state-of-the-art precision in contextualizing irregular lesions and subtle anatomical boundaries, excelling in 165 clinical tasks across 9 modalities. It achieved a 23.5% relative improvement in cell segmentation from microscopy images over prior methods. RCMed's strong vision-language alignment enables exceptional generalization, with state-of-the-art performance in external validation across 20 clinically significant cancer types, including novel tasks. This work demonstrates how integrated multimodal models capture fine-grained patterns, enabling human-level interpretation in complex scenarios and advancing human-centric AI healthcare.

artificial intelligence, machine learning, segmentation, (20 more...)

arXiv.org Artificial Intelligence

2505.0338

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (0.87)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.91)

Add feedback

RadSAM: Segmenting 3D radiological images with a 2D promptable model

Khlaut, Julien, Ferreres, Elodie, Tordjman, Daniel, Philippe, Hélène, Boeken, Tom, Manceron, Pierre, Dancette, Corentin

arXiv.org Artificial IntelligenceApr-30-2025

Medical image segmentation is a crucial and time-consuming task in clinical care, where mask precision is extremely important. The Segment Anything Model (SAM) offers a promising approach, as it provides an interactive interface based on visual prompting and edition to refine an initial segmentation. This model has strong generalization capabilities, does not rely on predefined classes, and adapts to diverse objects; however, it is pre-trained on natural images and lacks the ability to process medical data effectively. In addition, this model is built for 2D images, whereas a whole medical domain is based on 3D images, such as CT and MRI. Recent adaptations of SAM for medical imaging are based on 2D models, thus requiring one prompt per slice to segment 3D objects, making the segmentation process tedious. They also lack important features such as editing. To bridge this gap, we propose RadSAM, a novel method for segmenting 3D objects with a 2D model from a single prompt. In practice, we train a 2D model using noisy masks as initial prompts, in addition to bounding boxes and points. We then use this novel prompt type with an iterative inference pipeline to reconstruct the 3D mask slice-by-slice. We introduce a benchmark to evaluate the model's ability to segment 3D objects in CT images from a single prompt and evaluate the models' out-of-domain transfer and edition capabilities. We demonstrate the effectiveness of our approach against state-of-the-art models on this benchmark using the AMOS abdominal organ segmentation dataset.

artificial intelligence, machine learning, segmentation, (15 more...)

arXiv.org Artificial Intelligence

2504.20837

Country: Europe (0.46)

Genre: Research Report > Promising Solution (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Organ-aware Multi-scale Medical Image Segmentation Using Text Prompt Engineering

Zhang, Wenjie, Zhang, Ziyang, He, Mengnan, Ye, Jiancheng

arXiv.org Artificial IntelligenceMar-17-2025

Accurate segmentation is essential for effective treatment planning and disease monitoring. Existing medical image segmentation methods predominantly rely on uni-modal visual inputs, such as images or videos, requiring labor-intensive manual annotations. Additionally, medical imaging techniques capture multiple intertwined organs within a single scan, further complicating segmentation accuracy. To address these challenges, MedSAM, a large-scale medical segmentation model based on the Segment Anything Model (SAM), was developed to enhance segmentation accuracy by integrating image features with user-provided prompts. While MedSAM has demonstrated strong performance across various medical segmentation tasks, it primarily relies on geometric prompts (e.g., points and bounding boxes) and lacks support for text-based prompts, which could help specify subtle or ambiguous anatomical structures. To overcome these limitations, we propose the Organ-aware Multi-scale Text-guided Medical Image Segmentation Model (OMT-SAM) for multi-organ segmentation. Our approach introduces CLIP encoders as a novel image-text prompt encoder, operating with the geometric prompt encoder to provide informative contextual guidance. We pair descriptive textual prompts with corresponding images, processing them through pre-trained CLIP encoders and a cross-attention mechanism to generate fused image-text embeddings. Additionally, we extract multi-scale visual features from MedSAM, capturing fine-grained anatomical details at different levels of granularity. We evaluate OMT-SAM on the FLARE 2021 dataset, benchmarking its performance against existing segmentation methods. Empirical results demonstrate that OMT-SAM achieves a mean Dice Similarity Coefficient of 0.937, outperforming MedSAM (0.893) and other segmentation models, highlighting its superior capability in handling complex medical image segmentation tasks.

artificial intelligence, machine learning, segmentation, (15 more...)

arXiv.org Artificial Intelligence

2503.13806

Country:

North America > United States > New York (0.04)
Europe > Switzerland (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (0.35)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Cinepro: Robust Training of Foundation Models for Cancer Detection in Prostate Ultrasound Cineloops

Harmanani, Mohamed, Jamzad, Amoon, To, Minh Nguyen Nhat, Wilson, Paul F. R., Guo, Zhuoxin, Fooladgar, Fahimeh, Sojoudi, Samira, Gilany, Mahdi, Chang, Silvia, Black, Peter, Leveridge, Michael, Siemens, Robert, Abolmaesumi, Purang, Mousavi, Parvin

arXiv.org Artificial IntelligenceJan-21-2025

Prostate cancer (PCa) detection using deep learning (DL) models has shown potential for enhancing real-time guidance during biopsies. However, prostate ultrasound images lack pixel-level cancer annotations, introducing label noise. Current approaches often focus on limited regions of interest (ROIs), disregarding anatomical context necessary for accurate diagnosis. Foundation models can overcome this limitation by analyzing entire images to capture global spatial relationships; however, they still encounter challenges stemming from the weak labels associated with coarse pathology annotations in ultrasound data. We introduce Cinepro, a novel framework that strengthens foundation models' ability to localize PCa in ultrasound cineloops. Cinepro adapts robust training by integrating the proportion of cancer tissue reported by pathology in a biopsy core into its loss function to address label noise, providing a more nuanced supervision. Additionally, it leverages temporal data across multiple frames to apply robust augmentations, enhancing the model's ability to learn stable cancer-related features. Cinepro demonstrates superior performance on a multi-center prostate ultrasound dataset, achieving an AUROC of 77.1% and a balanced accuracy of 83.8%, surpassing current benchmarks. These findings underscore Cinepro's promise in advancing foundation models for weakly labeled ultrasound data.

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2501.12331

Country:

North America > United States (0.05)
North America > Canada > Ontario > Kingston (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology > Prostate Cancer (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

ScarNet: A Novel Foundation Model for Automated Myocardial Scar Quantification from LGE in Cardiac MRI

Tavakoli, Neda, Rahsepar, Amir Ali, Benefield, Brandon C., Shen, Daming, López-Tapia, Santiago, Schiffers, Florian, Goldberger, Jeffrey J., Albert, Christine M., Wu, Edwin, Katsaggelos, Aggelos K., Lee, Daniel C., Kim, Daniel

arXiv.org Artificial IntelligenceJan-2-2025

Background: Late Gadolinium Enhancement (LGE) imaging is the gold standard for assessing myocardial fibrosis and scarring, with left ventricular (LV) LGE extent predicting major adverse cardiac events (MACE). Despite its importance, routine LGE-based LV scar quantification is hindered by labor-intensive manual segmentation and inter-observer variability. Methods: We propose ScarNet, a hybrid model combining a transformer-based encoder from the Medical Segment Anything Model (MedSAM) with a convolution-based U-Net decoder, enhanced by tailored attention blocks. ScarNet was trained on 552 ischemic cardiomyopathy patients with expert segmentations of myocardial and scar boundaries and tested on 184 separate patients. Results: ScarNet achieved robust scar segmentation in 184 test patients, yielding a median Dice score of 0.912 (IQR: 0.863--0.944), significantly outperforming MedSAM (median Dice = 0.046, IQR: 0.043--0.047) and nnU-Net (median Dice = 0.638, IQR: 0.604--0.661). ScarNet demonstrated lower bias (-0.63%) and coefficient of variation (4.3%) compared to MedSAM (bias: -13.31%, CoV: 130.3%) and nnU-Net (bias: -2.46%, CoV: 20.3%). In Monte Carlo simulations with noise perturbations, ScarNet achieved significantly higher scar Dice (0.892 \pm 0.053, CoV = 5.9%) than MedSAM (0.048 \pm 0.112, CoV = 233.3%) and nnU-Net (0.615 \pm 0.537, CoV = 28.7%). Conclusion: ScarNet outperformed MedSAM and nnU-Net in accurately segmenting myocardial and scar boundaries in LGE images. The model exhibited robust performance across diverse image qualities and scar patterns.

scar segmentation, scarnet, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2501.01372

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Illinois > Cook County > Evanston (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)

Add feedback