Goto

Collaborating Authors

 vcd


On the Recursive Teaching Dimension of VC Classes

Neural Information Processing Systems

The recursive teaching dimension (RTD) of a concept class $C \subseteq \{0, 1\}^n$, introduced by Zilles et al. [ZLHZ11], is a complexity parameter measured by the worst-case number of labeled examples needed to learn any target concept of $C$ in the recursive teaching model. In this paper, we study the quantitative relation between RTD and the well-known learning complexity measure VC dimension (VCD), and improve the best known upper and (worst-case) lower bounds on the recursive teaching dimension with respect to the VC dimension. Given a concept class $C \subseteq \{0, 1\}^n$ with $VCD(C) = d$, we first show that $RTD(C)$ is at most $d 2^{d+1}$. This is the first upper bound for $RTD(C)$ that depends only on $VCD(C)$, independent of the size of the concept class $|C|$ and its~domain size $n$.



On Batch Teaching with Sample Complexity Bounded by VCD

Neural Information Processing Systems

In machine teaching, a concept is represented by (and inferred from) a small number of labeled examples. Various teaching models in the literature cast the interaction between teacher and learner in a way to obtain a small complexity (in terms of the number of examples required for teaching a concept) while obeying certain constraints that are meant to prevent unfair collusion between teacher and learner. In recent years, one major research goal has been to show interesting relationships between teaching complexity and the VC-dimension (VCD). So far, the only interesting relationship known from batch teaching settings is an upper bound quadratic in the VCD, on a parameter called recursive teaching dimension. The only known upper bound on teaching complexity that is linear in VCD was obtained in a model of teaching with sequences rather than batches.This paper is the first to provide an upper bound of VCD on a batch teaching complexity parameter. This parameter, called STDmin, is introduced here as a model of teaching that intuitively incorporates a notion of ``importance'' of an example for a concept. In designing the STDmin teaching model, we argue that the standard notion of collusion-freeness from the literature may be inadequate for certain applications; we hence propose three desirable properties of teaching complexity and demonstrate that they are satisfied by STDmin.


On the Recursive Teaching Dimension of VC Classes

Neural Information Processing Systems

The recursive teaching dimension (RTD) of a concept class $C \subseteq \{0, 1\}^n$, introduced by Zilles et al. [ZLHZ11], is a complexity parameter measured by the worst-case number of labeled examples needed to learn any target concept of $C$ in the recursive teaching model. In this paper, we study the quantitative relation between RTD and the well-known learning complexity measure VC dimension (VCD), and improve the best known upper and (worst-case) lower bounds on the recursive teaching dimension with respect to the VC dimension. Given a concept class $C \subseteq \{0, 1\}^n$ with $VCD(C) = d$, we first show that $RTD(C)$ is at most $d 2^{d+1}$. This is the first upper bound for $RTD(C)$ that depends only on $VCD(C)$, independent of the size of the concept class $|C|$ and its~domain size $n$.




CLAIM: Mitigating Multilingual Object Hallucination in Large Vision-Language Models with Cross-Lingual Attention Intervention

Ye, Zekai, Li, Qiming, Feng, Xiaocheng, Qin, Libo, Huang, Yichong, Li, Baohang, Jiang, Kui, Xiang, Yang, Zhang, Zhirui, Lu, Yunfei, Tang, Duyu, Tu, Dandan, Qin, Bing

arXiv.org Artificial Intelligence

Large Vision-Language Models (LVLMs) have demonstrated impressive multimodal abilities but remain prone to multilingual object hallucination, with a higher likelihood of generating responses inconsistent with the visual input when utilizing queries in non-English languages compared to English. Most existing approaches to address these rely on pretraining or fine-tuning, which are resource-intensive. In this paper, inspired by observing the disparities in cross-modal attention patterns across languages, we propose Cross-Lingual Attention Intervention for Mitigating multilingual object hallucination (CLAIM) in LVLMs, a novel near training-free method by aligning attention patterns. CLAIM first identifies language-specific cross-modal attention heads, then estimates language shift vectors from English to the target language, and finally intervenes in the attention outputs during inference to facilitate cross-lingual visual perception capability alignment. Extensive experiments demonstrate that CLAIM achieves an average improvement of 13.56% (up to 30% in Spanish) on the POPE and 21.75% on the hallucination subsets of the MME benchmark across various languages. Further analysis reveals that multilingual attention divergence is most prominent in intermediate layers, highlighting their critical role in multilingual scenarios.


Reviews: On the Recursive Teaching Dimension of VC Classes

Neural Information Processing Systems

The paper is very insightful - the authors quite nicely explain the approach they took for proving their results. The questions addressed, while interesting only for a fairly small subcommunity of the machine learning community, are really important in that subcommunity, and the authors have achieved a substantial breakthrough on an open problem posed in COLT 2015. I quite liked the idea to formulate the problem of finding a concept class with RTD 3/2 VCD as a SAT problem. In my eyes, the results should definitely be published, and they are important enough to deserve publication in a leading venue like NIPS. The paper is generally well written and easy to read, but there are a few minor (easy to fix issues (mostly just typos etc).


Magnifier Prompt: Tackling Multimodal Hallucination via Extremely Simple Instructions

Fu, Yuhan, Xie, Ruobing, Liu, Jiazhen, Lan, Bangxiang, Sun, Xingwu, Kang, Zhanhui, Li, Xirong

arXiv.org Artificial Intelligence

Hallucinations in multimodal large language models (MLLMs) hinder their practical applications. To address this, we propose a Magnifier Prompt (MagPrompt), a simple yet effective method to tackle hallucinations in MLLMs via extremely simple instructions. MagPrompt is based on the following two key principles, which guide the design of various effective prompts, demonstrating robustness: (1) MLLMs should focus more on the image. (2) When there are conflicts between the image and the model's inner knowledge, MLLMs should prioritize the image. MagPrompt is training-free and can be applied to open-source and closed-source models, such as GPT-4o and Gemini-pro. It performs well across many datasets and its effectiveness is comparable or even better than more complex methods like VCD. Furthermore, our prompt design principles and experimental analyses provide valuable insights into multimodal hallucination.


On Batch Teaching with Sample Complexity Bounded by VCD

Neural Information Processing Systems

In machine teaching, a concept is represented by (and inferred from) a small number of labeled examples. Various teaching models in the literature cast the interaction between teacher and learner in a way to obtain a small complexity (in terms of the number of examples required for teaching a concept) while obeying certain constraints that are meant to prevent unfair collusion between teacher and learner. In recent years, one major research goal has been to show interesting relationships between teaching complexity and the VC-dimension (VCD). So far, the only interesting relationship known from batch teaching settings is an upper bound quadratic in the VCD, on a parameter called recursive teaching dimension. The only known upper bound on teaching complexity that is linear in VCD was obtained in a model of teaching with sequences rather than batches.This paper is the first to provide an upper bound of VCD on a batch teaching complexity parameter.