Calibrating MLLM-as-a-judge via Multimodal Bayesian Prompt Ensembles
Slyman, Eric, Tanjim, Mehrab, Kafle, Kushal, Lee, Stefan
Multimodal large language models (MLLMs) are increasingly used to evaluate text-to-image (TTI) generation systems, providing automated judgments based on visual and textual context. However, these "judge" models often suffer from biases, overconfidence, and inconsistent performance across diverse image domains. While prompt ensembling has shown promise for mitigating these issues in unimodal, text-only settings, our experiments reveal that standard ensembling methods fail to generalize effectively for TTI tasks. To address these limitations, we propose a new multimodal-aware method called Multimodal Mixture-of-Bayesian Prompt Ensembles (MMB). Our method uses a Bayesian prompt ensemble approach augmented by image clustering, allowing the judge to dynamically assign prompt weights based on the visual characteristics of each sample. We show that MMB improves accuracy in pairwise preference judgments and greatly enhances calibration, making it easier to gauge the judge's true uncertainty. In evaluations on two TTI benchmarks, HPSv2 and MJBench, MMB outperforms existing baselines in alignment with human annotations and calibration across varied image content. Our findings highlight the importance of multimodal-specific strategies for judge calibration and suggest a promising path forward for reliable large-scale TTI evaluation.
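The core idea of cluster-conditional prompt weighting can be sketched as follows. This is a simplified, count-based illustration under assumed inputs (a binary "judgment matched the human label" matrix and precomputed image-cluster assignments), not the paper's exact Bayesian estimator; the Dirichlet-style smoothing parameter `alpha` is an assumption.

```python
import numpy as np

def cluster_prompt_weights(correct, clusters, n_clusters, alpha=1.0):
    """Per-cluster weights over judge prompts from validation outcomes.

    correct:  (n_samples, n_prompts) 0/1 array; 1 means that prompt's
              judgment agreed with the human annotation (hypothetical data).
    clusters: (n_samples,) image-cluster id for each sample.
    Returns:  (n_clusters, n_prompts) rows summing to 1.
    """
    n_prompts = correct.shape[1]
    weights = np.zeros((n_clusters, n_prompts))
    for k in range(n_clusters):
        mask = clusters == k
        # Smoothed per-prompt agreement counts within this visual cluster
        weights[k] = correct[mask].sum(axis=0) + alpha
        weights[k] /= weights[k].sum()
    return weights
```

At inference time, a new image would be assigned to its nearest cluster and the judge's per-prompt scores combined with that cluster's weight row.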
Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables
Li, Zheng, Xie, Feng, Zeng, Yan, Geng, Zhi
Estimating causal effects from nonexperimental data is a fundamental problem in many fields of science. A key component of this task is selecting an appropriate set of covariates for confounding adjustment to avoid bias. Most existing methods for covariate selection often assume the absence of latent variables and rely on learning the global network structure among variables. However, identifying the global structure can be unnecessary and inefficient, especially when our primary interest lies in estimating the effect of a treatment variable on an outcome variable. To address this limitation, we propose a novel local learning approach for covariate selection in nonparametric causal effect estimation, which accounts for the presence of latent variables. Our approach leverages testable independence and dependence relationships among observed variables to identify a valid adjustment set for a target causal relationship, ensuring both soundness and completeness under standard assumptions. We validate the effectiveness of our algorithm through extensive experiments on both synthetic and real-world data.
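Once a valid adjustment set has been identified, the downstream estimation step it enables is standard backdoor adjustment. The following sketch illustrates that step only (it is not the paper's local learning algorithm), assuming a discrete covariate `z` that is already known to be a valid adjustment set:

```python
import numpy as np

def backdoor_effect(t, y, z):
    """Estimate E[Y | do(T=1)] - E[Y | do(T=0)] by adjusting for a
    discrete covariate z, assumed to be a valid adjustment set.
    t: (n,) binary treatment; y: (n,) outcome; z: (n,) covariate values."""
    effect = 0.0
    for zval in np.unique(z):
        m = z == zval
        pz = m.mean()  # P(Z = zval)
        # Stratum-specific treated-vs-control contrast, weighted by P(Z)
        effect += pz * (y[m & (t == 1)].mean() - y[m & (t == 0)].mean())
    return effect
```

The value of a sound and complete selection procedure is precisely that it certifies which `z` makes this simple plug-in estimator unbiased.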
Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning
Si, Qingyi, Liu, Yuanxin, Meng, Fandong, Lin, Zheng, Fu, Peng, Cao, Yanan, Wang, Weiping, Zhou, Jie
Models for Visual Question Answering (VQA) often rely on spurious correlations, i.e., the language priors, that appear in the biased samples of the training set, which makes them brittle against out-of-distribution (OOD) test data. Recent methods have achieved promising progress in overcoming this problem by reducing the impact of biased samples on model training. However, these models reveal a trade-off in that the improvements on OOD data severely sacrifice performance on the in-distribution (ID) data (which is dominated by the biased samples). Therefore, we propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples. Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples and explore several strategies to use the constructed positive samples for training. Instead of undermining the importance of biased samples in model training, our approach precisely exploits the biased samples for unbiased information that contributes to reasoning. The proposed method is compatible with various VQA backbones. We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
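The contrastive objective underlying this kind of approach is typically an InfoNCE-style loss that pulls an anchor representation toward its constructed positive and away from negatives. The sketch below is a generic InfoNCE implementation over precomputed embeddings, not the paper's exact loss; the temperature `tau` is an assumed hyperparameter.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, tau=0.07):
    """InfoNCE loss for one anchor, its positive, and a list of negatives,
    using cosine similarity as the score. All inputs are 1-D vectors."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()  # numerical stability
    # Cross-entropy with the positive pair at index 0
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))
```

In the MMBS setting, the positive would be the embedding of a training sample with the spurious (language-prior) information removed.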
Adaptation of a Mixture of Multivariate Bernoulli Distributions
Kamthe, Ankur (University of California, Merced) | Carreira-Perpinan, Miguel Angel (University of California, Merced) | Cerpa, Alberto E. (University of California, Merced)
The mixture of multivariate Bernoulli distributions (MMB) is a statistical model for high-dimensional binary data in widespread use. Recently, the MMB has been used to model the sequence of packet receptions and losses of wireless links in sensor networks. Given an MMB trained on long data traces recorded from links of a deployed network, one can then use samples from the MMB to test different routing algorithms for as long as desired. However, learning an accurate model for a new link requires collecting from it long traces over periods of hours, a costly process in practice (e.g. limited battery life). We propose an algorithm that can adapt a preexisting MMB trained with extensive data to a new link from which very limited data is available. Our approach constrains the new MMB's parameters through a nonlinear transformation of the existing MMB's parameters. The transformation has a small number of parameters that are estimated using a generalized EM algorithm with an inner loop of BFGS iterations. We demonstrate the efficacy of the approach using the MNIST dataset of handwritten digits, and wireless link data from a sensor network. We show we can learn accurate models from data traces of about 1 minute, about 10 times shorter than needed if training an MMB from scratch.
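The MMB density named in the abstract is standard: p(x) = Σ_k π_k Π_d θ_kd^{x_d} (1−θ_kd)^{1−x_d}. A minimal log-likelihood computation, written from that definition (parameter names are illustrative):

```python
import numpy as np

def mmb_log_likelihood(X, pi, theta):
    """Log-likelihood of binary data under a mixture of multivariate
    Bernoullis.
    X: (n, d) array in {0, 1}; pi: (K,) mixing weights summing to 1;
    theta: (K, d) per-component Bernoulli means in (0, 1)."""
    # log p(x | component k) for every sample/component pair -> (n, K)
    log_pk = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
    # log-sum-exp over components, summed over samples
    return float(np.logaddexp.reduce(np.log(pi) + log_pk, axis=1).sum())
```

The adaptation method described above would keep this likelihood fixed in form and instead reparameterize `theta` as a low-parameter nonlinear transformation of a pretrained model's means.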
Mixed Membership Stochastic Blockmodels
Airoldi, Edo M., Blei, David M., Fienberg, Stephen E., Xing, Eric P.
In many settings, such as protein interactions and gene regulatory networks, collections of author-recipient email, and social networks, the data consist of pairwise measurements, e.g., presence or absence of links between pairs of objects. Analyzing such data with probabilistic models requires nonstandard assumptions, since the usual independence or exchangeability assumptions no longer hold. In this paper, we introduce a class of latent variable models for pairwise measurements: mixed membership stochastic blockmodels. Models in this class combine a global model of dense patches of connectivity (blockmodel) with a local model to instantiate node-specific variability in the connections (mixed membership). We develop a general variational inference algorithm for fast approximate posterior inference. We demonstrate the advantages of mixed membership stochastic blockmodels with applications to social networks and protein interaction networks.
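The generative process for the standard MMSB can be sketched directly from the description above: each node draws a Dirichlet membership vector, and each directed pair draws sender/receiver roles before linking with a role-pair probability from a blockmodel matrix B. This is the usual formulation of the model, written as a small simulator:

```python
import numpy as np

def sample_mmsb(alpha, B, n, rng=None):
    """Sample a directed adjacency matrix from a mixed membership
    stochastic blockmodel.
    alpha: (K,) Dirichlet prior over node memberships;
    B: (K, K) block interaction probabilities; n: number of nodes."""
    rng = rng or np.random.default_rng(0)
    K = len(alpha)
    theta = rng.dirichlet(alpha, size=n)  # per-node membership vectors
    Y = np.zeros((n, n), dtype=int)
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            zp = rng.choice(K, p=theta[p])   # sender role for pair (p, q)
            zq = rng.choice(K, p=theta[q])   # receiver role
            Y[p, q] = rng.random() < B[zp, zq]
    return Y
```

The variational inference algorithm the paper develops inverts this process, approximating the posterior over the memberships `theta` and the per-pair roles.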