Collaborating Authors

 Zhao, Linxi


Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance

arXiv.org Artificial Intelligence

The advancement of Large Vision-Language Models (LVLMs) has increasingly highlighted the critical issue of their tendency to hallucinate objects that do not exist in the images. To address this issue, previous works focused on using specially curated datasets or powerful LLMs (e.g., GPT-3.5) to rectify the outputs of LVLMs. However, these approaches require either expensive training/fine-tuning or API access to advanced LLMs to correct the model's output post-generation. In this paper, we tackle this challenge by introducing a framework called Mitigating hallucinAtion via classifieR-Free guIdaNcE (MARINE), which is both training-free and API-free, and can effectively and efficiently reduce object hallucinations during the generation process. Specifically, MARINE enriches the visual context of LVLMs by integrating existing open-source vision models, and employs classifier-free guidance to incorporate the additional object grounding features to improve the precision of LVLMs' generations. Through comprehensive evaluations across six popular LVLMs with diverse evaluation metrics, we demonstrate the effectiveness of MARINE, which even outperforms existing fine-tuning-based methods. Remarkably, it not only reduces hallucinations but also improves the detailedness of LVLMs' generations, as assessed by GPT-4V.
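
As a rough illustration of what classifier-free guidance at decoding time can look like, the sketch below mixes two next-token distributions in log space: one conditioned on extra grounding context and one without it. This is a generic formulation and not necessarily the exact rule used by MARINE. The function names, the toy logits, and the guidance strength `gamma` are all assumptions made for the example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def guided_log_probs(cond_logits, uncond_logits, gamma):
    """Generic classifier-free guidance over next-token logits (illustrative).

    cond_logits:   logits computed WITH the extra grounding context (assumed input)
    uncond_logits: logits computed WITHOUT it (assumed input)
    gamma:         guidance strength; 1.0 recovers the conditional distribution,
                   0.0 the unconditional one, and values above 1.0 amplify
                   whatever the extra context favours.
    """
    cond = log_softmax(cond_logits)
    uncond = log_softmax(uncond_logits)
    mixed = [gamma * c + (1.0 - gamma) * u for c, u in zip(cond, uncond)]
    return log_softmax(mixed)  # renormalise so the result is a distribution


# Toy vocabulary of three tokens; the grounded pass prefers token 0.
cond_logits = [2.0, 1.0, 0.0]
uncond_logits = [1.0, 1.0, 1.0]
plain = guided_log_probs(cond_logits, uncond_logits, gamma=1.0)
guided = guided_log_probs(cond_logits, uncond_logits, gamma=2.0)
```

With `gamma` above 1, the token favoured by the grounded pass gains probability mass relative to plain conditional decoding, which is the intuition behind using guidance to keep generations closer to what the vision models actually detected.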


A Comprehensive Dataset and Automated Pipeline for Nailfold Capillary Analysis

arXiv.org Artificial Intelligence

Nailfold capillaroscopy is a well-established method for assessing health conditions, but the untapped potential of automated medical image analysis using machine learning remains despite recent advancements. In this groundbreaking study, we present a pioneering effort in constructing a comprehensive dataset--321 images, 219 videos, 68 clinic reports, with expert annotations--that serves as a crucial resource for training deep-learning models. Leveraging this dataset, we propose an end-to-end nailfold capillary analysis pipeline capable of automatically detecting and measuring diverse ...

The introduction of machine learning marks a pivotal shift, presenting automated medical image analysis as a promising alternative due to its higher accuracy compared to traditional image processing algorithms [5]. Recent studies have attempted to use single deep-learning models for tasks such as nailfold capillary segmentation [4, 8], measurement of capillary size and density [5], and white cell counting [9]. Despite notable achievements, the untapped potential of automated medical image analysis persists due to the urgent need for annotated and extensive datasets essential for effective training and fine-tuning deep neural networks.