Vasa, Vamsi Krishna
RetinalGPT: A Retinal Clinical Preference Conversational Assistant Powered by Large Vision-Language Models
Zhu, Wenhui, Li, Xin, Chen, Xiwen, Qiu, Peijie, Vasa, Vamsi Krishna, Dong, Xuanzhao, Chen, Yanxi, Lepore, Natasha, Dumitrascu, Oana, Su, Yi, Wang, Yalin
Recently, Multimodal Large Language Models (MLLMs) have gained significant attention for their remarkable ability to process and analyze non-textual data, such as images, videos, and audio. Notably, several adaptations of general-domain MLLMs to the medical field have been explored, including LLaVA-Med. However, these medical adaptations remain insufficiently advanced in understanding and interpreting retinal images. In contrast, medical experts emphasize the importance of quantitative analyses for disease detection and interpretation. This underscores a gap between general-domain and medical-domain MLLMs: while general-domain MLLMs excel in broad applications, they lack the specialized knowledge necessary for precise diagnostic and interpretative tasks in the medical field. To address these challenges, we introduce \textit{RetinalGPT}, a multimodal conversational assistant for clinically preferred quantitative analysis of retinal images. Specifically, we achieve this by compiling a large retinal image dataset, developing a novel data pipeline, and employing customized visual instruction tuning to enhance retinal analysis and enrich medical knowledge. In particular, RetinalGPT outperforms generic-domain MLLMs by a large margin in diagnosing retinal diseases across 8 benchmark retinal datasets. Beyond disease diagnosis, RetinalGPT features quantitative analyses and lesion localization, representing a pioneering step toward leveraging LLMs in an interpretable, end-to-end clinical research framework. The code is available at https://github.com/Retinal-Research/RetinalGPT
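The abstract mentions customized visual instruction tuning. As a point of reference only, the sketch below shows what a LLaVA-style visual instruction-tuning record for a fundus image could look like; every field name and value here is a hypothetical placeholder, not RetinalGPT's actual data pipeline or format.

```python
# Hypothetical sketch of a LLaVA-style visual instruction-tuning record for a
# retinal image. All paths, ids, and answer text are illustrative placeholders,
# not RetinalGPT's actual data format.
import json

record = {
    "id": "fundus_000123",                       # placeholder sample id
    "image": "images/fundus_000123.png",         # path to the fundus photograph
    "conversations": [
        {
            "from": "human",
            "value": "<image>\nDescribe any visible lesions and estimate the vessel density."
        },
        {
            "from": "gpt",
            "value": ("Several microaneurysms are visible in the temporal quadrant; "
                      "estimated vessel density is 0.12 (illustrative value only).")
        }
    ]
}

print(json.dumps(record, indent=2))
```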
EyeBench: A Call for More Rigorous Evaluation of Retinal Image Enhancement
Zhu, Wenhui, Dong, Xuanzhao, Li, Xin, Xiong, Yujian, Chen, Xiwen, Qiu, Peijie, Vasa, Vamsi Krishna, Yang, Zhangsihao, Su, Yi, Dumitrascu, Oana, Wang, Yalin
Over the past decade, generative models have achieved significant success in enhancing fundus images. However, evaluating these models still presents a considerable challenge. A comprehensive evaluation benchmark for fundus image enhancement is indispensable for three main reasons: 1) Existing denoising metrics (e.g., PSNR, SSIM) are hard to extend to downstream real-world clinical research (e.g., vessel morphology consistency). 2) There is a lack of comprehensive evaluation for both paired and unpaired enhancement methods, along with a need for expert protocols to accurately assess clinical value. 3) An ideal evaluation system should provide insights to inform future development of fundus image enhancement. To this end, we propose a novel comprehensive benchmark, EyeBench, to provide insights that align enhancement models with clinical needs, offering a foundation for future work to improve the clinical relevance and applicability of generative models for fundus image enhancement. EyeBench has three appealing properties: 1) Multi-dimensional, clinically aligned downstream evaluation: in addition to evaluating the enhancement task itself, we provide several clinically significant downstream tasks for fundus images, including vessel segmentation, DR grading, denoising generalization, and lesion segmentation. 2) Medical expert-guided evaluation design: we introduce a novel dataset that promotes comprehensive and fair comparisons between paired and unpaired methods and includes a manual evaluation protocol by medical experts. 3) Valuable insights: our benchmark study provides a comprehensive and rigorous evaluation of existing methods across different downstream tasks, assisting medical experts in making informed choices. Additionally, we offer further analysis of the challenges faced by existing methods. The code is available at \url{https://github.com/Retinal-Research/EyeBench}
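To illustrate the gap the abstract describes between pixel-level denoising metrics and downstream clinical evaluation, here is a minimal sketch that computes PSNR/SSIM alongside a Dice overlap of vessel masks. The images and masks are random placeholders (in EyeBench the masks would come from a segmentation model applied to the original and enhanced images); this is not the benchmark's evaluation code.

```python
# Minimal sketch contrasting denoising metrics (PSNR/SSIM) with a downstream,
# clinically oriented score (Dice overlap of vessel masks). All inputs are
# random placeholders, not EyeBench data or models.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
clean = rng.random((256, 256)).astype(np.float32)          # stand-in "ground truth" fundus image
enhanced = np.clip(clean + 0.05 * rng.standard_normal(clean.shape), 0, 1).astype(np.float32)

# Classic denoising metrics: measure pixel-level fidelity only.
psnr = peak_signal_noise_ratio(clean, enhanced, data_range=1.0)
ssim = structural_similarity(clean, enhanced, data_range=1.0)

# Downstream consistency: Dice overlap between vessel masks obtained from the
# original and the enhanced image (placeholder random masks here).
mask_orig = rng.random(clean.shape) > 0.9
mask_enh = rng.random(clean.shape) > 0.9

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice coefficient between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum() + 1e-8)

print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}, vessel Dice: {dice(mask_orig, mask_enh):.4f}")
```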
STA-Unet: Rethink the semantic redundant for Medical Imaging Segmentation
Vasa, Vamsi Krishna, Zhu, Wenhui, Chen, Xiwen, Qiu, Peijie, Dong, Xuanzhao, Wang, Yalin
In recent years, significant progress has been made in medical image analysis using convolutional neural networks (CNNs). In particular, deep neural networks based on a U-shaped architecture (UNet) with skip connections have been adopted for several medical imaging tasks, including organ segmentation. Despite their great success, CNNs struggle to learn global or semantic features, especially those that require human-like reasoning to understand context. Many UNet variants have attempted to compensate by introducing Transformer-based self-attention mechanisms, with notable gains in performance. However, transformers suffer from inherent redundancy at shallow layers, where much of the attention computation is spent on nearby pixels that offer limited information. The recently introduced Super Token Attention (STA) mechanism adapts the concept of superpixels from pixel space to token space, using super tokens as compact visual representations. This approach tackles the redundancy by learning efficient global representations in vision transformers, especially at the shallow layers. In this work, we introduce the STA module into the UNet architecture (STA-UNet) to limit redundancy without losing rich information. Experimental results on four publicly available datasets demonstrate the superiority of STA-UNet over existing state-of-the-art architectures in terms of Dice score and IoU for organ segmentation tasks. The code is available at \url{https://github.com/Retinal-Research/STA-UNet}.
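To convey the super-token intuition referenced in the abstract, the sketch below pools an encoder feature map into a coarse grid of "super tokens", applies self-attention only among those, and upsamples the result back to full resolution. This is an illustrative approximation under assumed shapes and a residual fusion, not the STA module from the paper (which associates tokens to super tokens in a superpixel-like fashion).

```python
# Illustrative approximation of attention over pooled "super tokens";
# not the actual STA-UNet implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SuperTokenAttentionSketch(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, grid: int = 8):
        super().__init__()
        self.grid = grid                                   # super-token grid size (assumed)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from a UNet encoder stage
        B, C, H, W = x.shape
        # Pool pixels into a grid x grid set of super tokens.
        st = F.adaptive_avg_pool2d(x, self.grid)           # (B, C, g, g)
        st = st.flatten(2).transpose(1, 2)                 # (B, g*g, C)
        st = self.norm(st)
        st, _ = self.attn(st, st, st)                      # attention among super tokens only
        # Upsample attended super tokens back to full resolution and fuse.
        st = st.transpose(1, 2).reshape(B, C, self.grid, self.grid)
        st = F.interpolate(st, size=(H, W), mode="bilinear", align_corners=False)
        return x + st                                      # residual fusion with the input features

if __name__ == "__main__":
    feat = torch.randn(2, 64, 56, 56)                      # dummy encoder features
    out = SuperTokenAttentionSketch(dim=64)(feat)
    print(out.shape)                                       # torch.Size([2, 64, 56, 56])
```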