M3T: Multi-Modal Medical Transformer to bridge Clinical Context with Visual Insights for Retinal Image Medical Description Generation

Shaik, Nagur Shareef; Cherukuri, Teja Krishna; Ye, Dong Hye

Jun-18-2024–arXiv.org Artificial Intelligence

Automated retinal image medical description generation is crucial for streamlining medical diagnosis and treatment planning. Existing challenges include the reliance on learned retinal image representations, difficulties in handling multiple imaging modalities, and the lack of clinical context in visual representations. Addressing these issues, we propose the Multi-Modal Medical Transformer (M3T), a novel deep learning architecture that integrates visual representations with diagnostic keywords.

The scarcity of labeled data poses challenges in both image classification and caption generation tasks in medical image analysis. Researchers address this by employing Transfer Learning, leveraging models pre-trained on ImageNet for medical image tasks [7, 8]. Pre-training on natural images and fine-tuning on medical datasets enhances feature learning, especially in medical image classification [9]. Semi-supervised and self-supervised learning in medical representation explores unlabeled data, benefiting subsequent ...

artificial intelligence, keyword, machine learning

Country:
- North America > United States (0.14)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine
  - Diagnostic Medicine > Imaging (1.00)
  - Therapeutic Area > Ophthalmology/Optometry (1.00)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning
    - Neural Networks > Deep Learning (0.89)
  - Sensing and Signal Processing > Image Processing (1.00)
