Collaborating Authors: Zhu, Younan


Mitigating Object Hallucinations in Large Vision-Language Models via Attention Calibration

arXiv.org Artificial Intelligence

Large Vision-Language Models (LVLMs) exhibit impressive multimodal reasoning capabilities but remain highly susceptible to object hallucination, where models generate responses that are not factually aligned with the visual content. Recent works attribute this issue to an inherent bias in LVLMs whereby the attention paid to vision tokens has a fixed correlation with spatial position, and propose to mitigate it by reordering visual tokens. However, we find that different LVLMs exhibit different correlations between attention and spatial position, which makes the existing solution difficult to generalize to other LVLMs. To address this issue, we first introduce a training-free solution, Uniform Attention Calibration (UAC), which estimates the bias from a single meaningless input image and applies a calibration matrix to rectify attention imbalances. To further alleviate the bias, we relax UAC's single-meaningless-input assumption and introduce a fine-tuning solution, Dynamic Attention Calibration (DAC), which enforces consistent outputs regardless of where the object is located in the image via a plug-and-play module. Comprehensive experiments across multiple benchmarks demonstrate that UAC and DAC significantly reduce object hallucination while improving general multimodal alignment. Our methods achieve state-of-the-art performance across diverse LVLM architectures on various metrics.
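The abstract describes UAC only at a high level; the sketch below illustrates the general idea under stated assumptions, not the paper's actual formulation. It assumes a HuggingFace-style LVLM that returns per-layer attentions and a known `vision_slice` marking where vision tokens sit in the token sequence; the function names, the meaningless (uniform-gray) calibration input, and the divide-and-renormalize rule are all illustrative.

```python
import torch

@torch.no_grad()
def estimate_positional_bias(model, blank_pixels, prompt_ids, vision_slice):
    """Run the model on a meaningless image (e.g., uniform gray) and average
    the attention each position pays to each vision token. With no visual
    content present, any structure left in this profile is positional bias."""
    out = model(pixel_values=blank_pixels, input_ids=prompt_ids,
                output_attentions=True)
    # Tuple of (batch, heads, queries, keys) per layer ->
    # (layers, batch, heads, queries, keys) -> mean attention per key position.
    per_key = torch.stack(out.attentions).mean(dim=(0, 1, 2, 3))
    bias = per_key[vision_slice]          # attention received by vision tokens
    return bias / bias.mean()             # relative over-/under-attention

def calibrate_attention(attn_weights, bias, vision_slice):
    """Rectify one attention map: divide the weights on vision tokens by the
    estimated positional bias, then renormalize rows to sum to 1."""
    attn_weights = attn_weights.clone()
    attn_weights[..., vision_slice] /= bias
    return attn_weights / attn_weights.sum(dim=-1, keepdim=True)
```

In this reading, the calibration matrix is estimated once per model and then applied uniformly at inference time, which is what makes the approach training-free.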


A Benchmark Study on Calibration

arXiv.org Machine Learning

Deep neural networks are increasingly used across machine learning tasks. However, as these models grow in complexity, they often suffer from calibration issues despite improved prediction accuracy. Many studies have sought to improve calibration through specific loss functions, data preprocessing, and training frameworks, yet the calibration properties of the models themselves have been largely overlooked. To fill this gap, we create a model calibration dataset that evaluates 90 bin-based and 12 additional calibration measurements across 117,702 unique neural networks within the widely used NATS-Bench search space. Using this dataset, our analysis aims to answer several longstanding questions in the field, among them: can model calibration be generalized across different datasets? By providing this dataset, we enable further research into calibration in neural architecture search (NAS). To the best of our knowledge, our research represents the first large-scale investigation into calibration properties and the first study of calibration issues within NAS.

Despite their widespread success across various domains, deep neural networks (DNNs) are not immune to producing miscalibrated predictions, which can be either overconfident or underconfident. This concern is particularly salient for safety-critical applications such as autonomous driving (Feng et al., 2019) and medical diagnosis (Thiagarajan et al., 2022), where reliance on accurate prediction probabilities is paramount. In these contexts, miscalibrated predictions may lead to potentially catastrophic consequences.
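The benchmark's core objects are bin-based calibration metrics. To make that concrete, here is a short reference implementation of the most common one, Expected Calibration Error (ECE); this is the standard textbook formula, not the paper's specific 90-metric suite, and the bin count `n_bins` is a free parameter.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Bin-based ECE: partition predictions by confidence, then average the
    |accuracy - confidence| gap per bin, weighted by the fraction of samples
    falling in that bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # bin weight * calibration gap
    return ece

# Example: three predictions at confidences 0.9, 0.8, 0.6; the first two
# are correct. Each lands in its own bin, so ECE = (0.1 + 0.2 + 0.6) / 3.
print(expected_calibration_error([0.9, 0.8, 0.6], [1, 1, 0]))  # ~0.3
```

The sensitivity of such estimates to `n_bins` is exactly the kind of measurement question (e.g., the impact of bin size) that a large-scale benchmark of this sort can probe empirically.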