AITopics

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Indiana (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.93)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Neural Information Processing SystemsFeb-11-2026, 09:05:45 GMT

d61e9e58ae1058322bc169943b39f1d8-Supplemental.pdf

lsp, pneumothorax, prediction, (16 more...)

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.67)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing SystemsFeb-11-2026, 09:05:41 GMT

d61e9e58ae1058322bc169943b39f1d8-Paper.pdf

Setprediction tasksrequire thematching between predicted setandground truth set in order to propagate the gradient signal. Recent works have performed this matching in the original feature space thus requiring predefined distance functions.

artificial intelligence, etal, machine learning, (16 more...)

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Michigan (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial IntelligenceDec-1-2025

Closing the Performance Gap Between AI and Radiologists in Chest X-Ray Reporting

Sharma, Harshita, Reynolds, Maxwell C., Salvatelli, Valentina, Sykes, Anne-Marie G., Horst, Kelly K., Schwaighofer, Anton, Ilse, Maximilian, Melnichenko, Olesya, Bond-Taylor, Sam, Pérez-García, Fernando, Mugu, Vamshi K., Chan, Alex, Colak, Ceylan, Swartz, Shelby A., Nashawaty, Motassem B., Gonzalez, Austin J., Ouellette, Heather A., Erdal, Selnur B., Schueler, Beth A., Wetscherek, Maria T., Codella, Noel, Jain, Mohit, Bannur, Shruthi, Bouzid, Kenza, Castro, Daniel C., Hyland, Stephanie, Korfiatis, Panos, Khandelwal, Ashish, Alvarez-Valle, Javier

AI-assisted report generation offers the opportunity to reduce radiologists' workload stemming from expanded screening guidelines, complex cases and workforce shortages, while maintaining diagnostic accuracy. In addition to describing pathological findings in chest X-ray reports, interpreting lines and tubes (L&T) is demanding and repetitive for radiologists, especially with high patient volumes. We introduce MAIRA-X, a clinically evaluated multimodal AI model for longitudinal chest X-ray (CXR) report generation, that encompasses both clinical findings and L&T reporting. Developed using a large-scale, multi-site, longitudinal dataset of 3.1 million studies (comprising 6 million images from 806k patients) from Mayo Clinic, MAIRA-X was evaluated on three holdout datasets and the public MIMIC-CXR dataset, where it significantly improved AI-generated reports over the state of the art on lexical quality, clinical correctness, and L&T-related elements. A novel L&T-specific metrics framework was developed to assess accuracy in reporting attributes such as type, longitudinal change and placement. A first-of-its-kind retrospective user evaluation study was conducted with nine radiologists of varying experience, who blindly reviewed 600 studies from distinct subjects. The user study found comparable rates of critical errors (3.0% for original vs. 4.6% for AI-generated reports) and a similar rate of acceptable sentences (97.8% for original vs. 97.4% for AI-generated reports), marking a significant improvement over prior user studies with larger gaps and higher error rates. Our results suggest that MAIRA-X can effectively assist radiologists, particularly in high-volume clinical settings.

large language model, machine learning, natural language, (22 more...)

2511.21735

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.92)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.67)

Neural Information Processing SystemsNov-20-2025, 23:42:35 GMT

Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation

Yuan Li, Xiaodan Liang, Zhiting Hu, Eric P. Xing

HRGR-Agent employs a hierarchical decision-making procedure.

artificial intelligence, machine learning, natural language, (16 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Indiana (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.93)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Sensing and Signal Processing > Image Processing (0.66)

arXiv.org Artificial IntelligenceNov-14-2025

Feature Quality and Adaptability of Medical Foundation Models: A Comparative Evaluation for Radiographic Classification and Segmentation

Li, Frank, Dapamede, Theo, Chavoshi, Mohammadreza, Jeon, Young Seok, Khosravi, Bardia, Dere, Abdulhameed, Brown-Mulry, Beatrice, Isaac, Rohan Satya, Mansuri, Aawez, Sanyika, Chiratidzo, Newsome, Janice, Purkayastha, Saptarshi, Banerjee, Imon, Trivedi, Hari, Gichoya, Judy

Foundation models (FMs) promise to generalize medical imaging, but their effectiveness varies. It remains unclear how pre-training domain (medical vs. general), paradigm (e.g., text-guided), and architecture influence embedding quality, hindering the selection of optimal encoders for specific radiology tasks. To address this, we evaluate vision encoders from eight medical and general-domain FMs for chest X-ray analysis. We benchmark classification (pneumothorax, cardiomegaly) and segmentation (pneumothorax, cardiac boundary) using linear probing and fine-tuning. Our results show that domain-specific pre-training provides a significant advantage; medical FMs consistently outperformed general-domain models in linear probing, establishing superior initial feature quality. However, feature utility is highly task-dependent. Pre-trained embeddings were strong for global classification and segmenting salient anatomy (e.g., heart). In contrast, for segmenting complex, subtle pathologies (e.g., pneumothorax), all FMs performed poorly without significant fine-tuning, revealing a critical gap in localizing subtle disease. Subgroup analysis showed FMs use confounding shortcuts (e.g., chest tubes for pneumothorax) for classification, a strategy that fails for precise segmentation. We also found that expensive text-image alignment is not a prerequisite; image-only (RAD-DINO) and label-supervised (Ark+) FMs were among top performers. Notably, a supervised, end-to-end baseline remained highly competitive, matching or exceeding the best FMs on segmentation tasks. These findings show that while medical pre-training is beneficial, architectural choices (e.g., multi-scale) are critical, and pre-trained features are not universally effective, especially for complex localization tasks where supervised models remain a strong alternative.

artificial intelligence, machine learning, natural language, (19 more...)

2511.09742

Country: North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Roque, Alvaro Aranibar, Sebastian, Helga

Chest X-ray Pneumothorax Segmentation Using EfficientNet-B4 Transfer Learning in a U-Net Architecture

arXiv.org Artificial IntelligenceSep-5-2025

Ab s tract -- Pneumothorax, the abnormal accumulation of air in the pleural space, can be life - threatening if undetected. Chest X - rays are the first - line diagnostic tool, but small cases may be subtle. We propose an automated deep - learning pipeline using a U - Net with an EfficientNet - B4 encoder to segment pneumothorax regions. Trained on the SIIM - ACR dataset with data augmentation and a combined binary cross - entropy plus Dice loss, the model achieved an IoU of 0.7008 a nd Dice score of 0.8241 on the independent PTX - 498 dataset. These results demonstrate that the model can accurately localize pneumothoraces and support radiologists . Pneumothorax is the abnormal accumulation of air in the pleural space, which can arise spontaneously or due to trauma or medical procedures. Early detection is critical, as even small pneumothoraces may rapidly progress to life - threatening conditions. Clin ical examination alone may miss subtle cases [1], making chest X - rays the standard diagnostic tool.

artificial intelligence, deep learning, machine learning, (18 more...)

2509.0395

Country: Europe > Germany (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Nuclear Medicine (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Neural Information Processing SystemsAug-17-2025, 16:06:33 GMT

Set Prediction in the Latent Space

Recent works have performed this matching in the original feature space thus requiring predefined distance functions.

machine learning, natural language, prediction, (20 more...)

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.67)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Bouzid, Kenza, Bannur, Shruthi, Meissen, Felix, de Castro, Daniel Coelho, Schwaighofer, Anton, Alvarez-Valle, Javier, Hyland, Stephanie L.

Insights into a radiology-specialised multimodal large language model with sparse autoencoders

arXiv.org Artificial IntelligenceJul-21-2025

Interpretability can improve the safety, transparency and trust of AI models, which is especially important in healthcare applications where decisions often carry significant consequences. Mechanistic interpretability, particularly through the use of sparse autoencoders (SAEs), offers a promising approach for uncovering human-interpretable features within large transformer-based models. In this study, we apply Matryoshka-SAE to the radiology-specialised multimodal large language model, MAIRA-2, to interpret its internal representations. Using large-scale automated interpretability of the SAE features, we identify a range of clinically relevant concepts - including medical devices (e.g., line and tube placements, pacemaker presence), pathologies such as pleural effusion and cardiomegaly, longitudinal changes and textual features. We further examine the influence of these features on model behaviour through steering, demonstrating directional control over generations with mixed success. Our results reveal practical and methodological challenges, yet they offer initial insights into the internal concepts learned by MAIRA-2 - marking a step toward deeper mechanistic understanding and interpretability of a radiology-adapted multimodal large language model, and paving the way for improved model transparency. We release the trained SAEs and interpretations: https://huggingface.co/microsoft/maira-2-sae.

large language model, machine learning, natural language, (16 more...)

2507.1295

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Jahangir, Md. Zihad Bin, Kabir, Muhammad Ashad, Akter, Sumaiya, Jahan, Israt, Chau, Minh

LLaMA-XR: A Novel Framework for Radiology Report Generation using LLaMA and QLoRA Fine Tuning

arXiv.org Artificial IntelligenceJun-5-2025

Automated radiology report generation holds significant potential to reduce radiologists' workload and enhance diagnostic accuracy. However, generating precise and clinically meaningful reports from chest radiographs remains challenging due to the complexity of medical language and the need for contextual understanding. Existing models often struggle with maintaining both accuracy and contextual relevance. In this paper, we present LLaMA-XR, a novel framework that integrates LLaMA 3.1 with DenseNet-121-based image embeddings and Quantized Low-Rank Adaptation (QLoRA) fine-tuning. LLaMA-XR achieves improved coherence and clinical accuracy while maintaining computational efficiency. This efficiency is driven by an optimization strategy that enhances parameter utilization and reduces memory overhead, enabling faster report generation with lower computational resource demands. Extensive experiments conducted on the IU X-ray benchmark dataset demonstrate that LLaMA-XR outperforms a range of state-of-the-art methods. Our model achieves a ROUGE-L score of 0.433 and a METEOR score of 0.336, establishing new performance benchmarks in the domain. These results underscore LLaMA-XR's potential as an effective and efficient AI system for automated radiology reporting, offering enhanced clinical utility and reliability.

large language model, machine learning, natural language, (20 more...)