medical image report generation
Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation
Generating long and coherent reports to describe medical images poses challenges in bridging visual patterns with informative human linguistic descriptions. We propose a novel Hybrid Retrieval-Generation Reinforced Agent (HRGR-Agent) which reconciles traditional retrieval-based approaches, populated with human prior knowledge, with modern learning-based approaches to achieve structured, robust, and diverse report generation. HRGR-Agent employs a hierarchical decision-making procedure. For each sentence, a high-level retrieval policy module chooses to either retrieve a template sentence from an off-the-shelf template database, or invoke a low-level generation module to generate a new sentence. HRGR-Agent is updated via reinforcement learning, guided by sentence-level and word-level rewards. Experiments show that our approach achieves state-of-the-art results on two medical report datasets, generating well-balanced structured sentences with robust coverage of heterogeneous medical report contents. In addition, our model achieves the highest detection precision of medical abnormality terminologies, and improved human evaluation performance.
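The hierarchical retrieve-or-generate decision described in the abstract can be sketched in miniature. Everything below is an illustrative assumption, not the paper's learned modules: the template sentences are hypothetical, and the policy and generator are deterministic stand-ins for what would be learned networks updated by sentence-level and word-level rewards.

```python
# Hypothetical template database; the real HRGR-Agent mines templates
# from frequently occurring sentences in the training corpus.
TEMPLATES = [
    "The lungs are clear.",
    "No pleural effusion is seen.",
    "Heart size is within normal limits.",
]
GENERATE = len(TEMPLATES)  # extra action index meaning "invoke the generator"

def retrieval_policy(hidden_state: int) -> int:
    """High-level policy: picks a template index or the GENERATE action.

    Stand-in for a learned softmax over len(TEMPLATES) + 1 actions,
    conditioned on a per-sentence hidden state.
    """
    return hidden_state % (len(TEMPLATES) + 1)

def generation_module(hidden_state: int) -> str:
    """Low-level generator: stand-in for word-by-word decoding."""
    return f"Generated sentence for state {hidden_state}."

def compose_report(sentence_states):
    """One retrieve-or-generate decision per sentence hidden state."""
    report = []
    for h in sentence_states:
        action = retrieval_policy(h)
        if action == GENERATE:
            report.append(generation_module(h))  # word-level reward applies here
        else:
            report.append(TEMPLATES[action])     # sentence-level reward applies here
    return report

print(compose_report([0, 3, 2]))
```

The key design point this illustrates is that retrieval and generation compete as actions within one policy, so the agent can fall back on reliable templates for routine findings and generate free text only for unusual ones.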
Interview with Flávia Carvalhido: Responsible multimodal AI
In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. In this latest interview, we hear from Flávia Carvalhido, a PhD student at the University of Porto. We find out about her work on responsible multimodal AI, what inspired her to study AI, and how she found the Doctoral Consortium experience. My PhD programme is in Informatics Engineering at the Faculty of Engineering of the University of Porto, where I also obtained both my Bachelor's and Master's degrees in the same field. My thesis research project is focused on responsible multimodal AI, titled "Stress Testing of Image-Text Multimodal Models in Medical Image Report Generation", supervised by Professor Henrique Lopes Cardoso and Professor Vítor Cerqueira and developed in the LIACC research laboratory.
ViT3D Alignment of LLaMA3: 3D Medical Image Report Generation
Li, Siyou, Xu, Beining, Luo, Yihao, Nie, Dong, Zhang, Le
Automatic medical report generation (MRG), which aims to produce detailed text reports from medical images, has emerged as a critical task in medical imaging. MRG systems can enhance radiological workflows by reducing the time and effort required for report writing, thereby improving diagnostic efficiency. In this work, we present a novel approach for automatic MRG utilizing a multimodal large language model. Specifically, we employed the 3D Vision Transformer (ViT3D) image encoder introduced in M3D-CLIP to process 3D scans, and used Asclepius-Llama3-8B as the language model to generate the text reports by auto-regressive decoding. Experiments show that our model achieved an average Green score of 0.3 on the MRG task validation set and an average accuracy of 0.61 on the visual question answering (VQA) task validation set, outperforming the baseline model. Our approach demonstrates the effectiveness of the ViT3D alignment of LLaMA3 for automatic MRG and VQA tasks by tuning the model on a small dataset.
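The encode-align-decode pipeline described above can be sketched as follows. All shapes, the toy patchify step, and the function names are illustrative assumptions, not the authors' implementation: the real system uses a pretrained ViT3D encoder, a learned alignment projection, and Asclepius-Llama3-8B for auto-regressive decoding.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 16  # toy width; the real models use thousands of dimensions

def vit3d_encode(volume):
    """Stand-in for the ViT3D image encoder: maps a 3D scan to patch tokens.

    Toy patchify: split the volume into 2x2x2 patches and mean-pool each,
    then tile each patch value into a fake token embedding.
    """
    d, h, w = volume.shape
    patches = volume.reshape(d // 2, 2, h // 2, 2, w // 2, 2).mean(axis=(1, 3, 5))
    return patches.reshape(-1, 1) * np.ones((1, EMBED_DIM))  # one token per patch

def project_to_llm_space(image_tokens, W):
    """Alignment layer mapping vision tokens into the LLM's embedding space."""
    return image_tokens @ W

def build_llm_input(image_tokens, text_embeddings):
    """Prepend projected image tokens to the text prompt embeddings, so the
    language model can attend to the scan while decoding the report."""
    return np.concatenate([image_tokens, text_embeddings], axis=0)

volume = rng.random((4, 4, 4))           # tiny fake 3D scan
W = rng.random((EMBED_DIM, EMBED_DIM))   # learned projection (random here)
prompt = rng.random((3, EMBED_DIM))      # fake prompt token embeddings

tokens = project_to_llm_space(vit3d_encode(volume), W)
llm_input = build_llm_input(tokens, prompt)
print(llm_input.shape)  # (num_patches + prompt_len, EMBED_DIM)
```

The design choice this illustrates is that only the small alignment projection must bridge the two pretrained models, which is why tuning on a small dataset can suffice.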
Reviews: Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation
Summary: The paper presents an approach that uses hierarchical reinforcement learning to automatically generate medical reports from diagnostic images. The approach first predicts a sequence of hidden states, one per sentence, and decides when to stop; a low-level module then takes each hidden state and either retrieves a sentence to use as output, or passes control to a generator that produces a new sentence. The overall system is trained with rewards at both the sentence level and, for generation, the word level. The proposed approach shows promise over ablations of the proposed model as well as some sensible baseline CNN-RNN approaches for image captioning.
Strengths: The paper describes the experimental setup quite thoroughly and clearly states the hyperparameters used for training.
Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation
Li, Yuan, Liang, Xiaodan, Hu, Zhiting, Xing, Eric P.