
A.1 Deriving the Optimum of the KL-Constrained Reward Maximization Objective

Neural Information Processing Systems

In this appendix, we will derive Eq. 4. Analogously to Eq. 3, we optimize the following objective:

\max_{\pi}\; \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi(y \mid x)}\big[r(x, y)\big] \;-\; \beta\, \mathbb{D}_{\mathrm{KL}}\big[\pi(y \mid x)\,\big\|\,\pi_{\mathrm{ref}}(y \mid x)\big]
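The snippet breaks off at this objective; its well-known closed-form optimum (stated here as standard background, not text recovered from this entry) is

\pi^{*}(y \mid x) \;=\; \frac{1}{Z(x)}\,\pi_{\mathrm{ref}}(y \mid x)\,\exp\!\Big(\tfrac{1}{\beta}\, r(x, y)\Big),
\qquad
Z(x) \;=\; \sum_{y} \pi_{\mathrm{ref}}(y \mid x)\,\exp\!\Big(\tfrac{1}{\beta}\, r(x, y)\Big),

which follows by absorbing the reward into the KL term and noting that the partition function Z(x) does not depend on \pi.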


VSE: Variational state estimation of complex model-free process

Norén, Gustav, Ghosh, Anubhab, Cumlin, Fredrik, Chatterjee, Saikat

arXiv.org Machine Learning

We design a variational state estimation (VSE) method that provides a closed-form Gaussian posterior of an underlying complex dynamical process from (noisy) nonlinear measurements. The complex process is model-free; that is, we do not have a suitable physics-based model characterizing the temporal evolution of the process state. The closed-form Gaussian posterior is provided by a recurrent neural network (RNN), which makes the inference phase computationally simple. To train this RNN, an additional RNN is used during the learning phase, and the two RNNs help each other learn based on variational inference principles. VSE is demonstrated on a tracking application: state estimation of a stochastic Lorenz system (a benchmark process) observed through a 2-D camera measurement model. VSE is shown to be competitive with a particle filter that knows the Lorenz system model and with a recently proposed data-driven state estimation method that does not.
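As a concrete illustration of the inference-phase setup described above, here is a minimal sketch of an RNN that maps a measurement sequence to the parameters of a Gaussian posterior over the state; all names, dimensions, and architectural choices are assumptions for illustration, not the authors' implementation.

```python
# Sketch: an RNN emitting per-step Gaussian posterior parameters.
import torch
import torch.nn as nn

class GaussianPosteriorRNN(nn.Module):
    def __init__(self, meas_dim: int, state_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.rnn = nn.GRU(meas_dim, hidden_dim, batch_first=True)
        self.mean_head = nn.Linear(hidden_dim, state_dim)
        # Predict log-variances so the diagonal covariance stays positive.
        self.logvar_head = nn.Linear(hidden_dim, state_dim)

    def forward(self, measurements: torch.Tensor):
        # measurements: (batch, time, meas_dim)
        h, _ = self.rnn(measurements)
        mean = self.mean_head(h)          # posterior mean at each time step
        var = self.logvar_head(h).exp()   # diagonal posterior variances
        return mean, var

# Usage: a batch of 2-D camera measurements of a 3-D Lorenz state.
y = torch.randn(8, 100, 2)
model = GaussianPosteriorRNN(meas_dim=2, state_dim=3)
mean, var = model(y)  # each of shape (8, 100, 3)
```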


Disentangled representations via score-based variational autoencoders

Lyo, Benjamin S. H., Simoncelli, Eero P., Savin, Cristina

arXiv.org Machine Learning

We present the Score-based Autoencoder for Multiscale Inference (SAMI), a method for unsupervised representation learning that combines the theoretical frameworks of diffusion models and VAEs. By unifying their respective evidence lower bounds, SAMI formulates a principled objective that learns representations through score-based guidance of the underlying diffusion process. The resulting representations automatically capture meaningful structure in the data: SAMI recovers ground-truth generative factors in our synthetic dataset, learns factorized, semantic latent dimensions from complex natural images, and encodes video sequences into latent trajectories that are straighter than those of alternative encoders, despite training exclusively on static images. Furthermore, SAMI can extract useful representations from pre-trained diffusion models with minimal additional training. Finally, the explicitly probabilistic formulation provides new ways to identify semantically meaningful axes in the absence of supervised labels, and its mathematical exactness allows us to make formal statements about the nature of the learned representation. Overall, these results indicate that implicit structural information in diffusion models can be made explicit and interpretable through synergistic combination with a variational autoencoder.
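As a rough sketch of what "unifying their respective evidence lower bounds" can look like, a latent-variable score-matching objective typically combines a latent-conditioned denoising score matching term with a VAE-style prior KL. The generic form below is an assumption for illustration; the abstract does not state SAMI's exact objective:

\mathcal{L}(\theta, \phi) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\, \mathbb{E}_{t,\, x_t \sim q(x_t \mid x)} \Big[ \lambda(t)\, \big\| s_\theta(x_t, t, z) - \nabla_{x_t} \log q(x_t \mid x) \big\|_2^2 \Big] \;+\; \mathbb{D}_{\mathrm{KL}}\big[ q_\phi(z \mid x) \,\|\, p(z) \big],

where the encoder q_\phi supplies the latent z that guides the score network s_\theta along the diffusion process.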


Latent Nonlinear Denoising Score Matching for Enhanced Learning of Structured Distributions

Shen, Kaichen, Zhu, Wei

arXiv.org Machine Learning

We present latent nonlinear denoising score matching (LNDSM), a novel training objective for score-based generative models that integrates nonlinear forward dynamics with the VAE-based latent SGM framework. This combination is achieved by reformulating the cross-entropy term using the approximate Gaussian transition induced by the Euler-Maruyama scheme. To ensure numerical stability, we identify and remove two zero-mean but variance-exploding terms that arise at small time steps. Experiments on variants of the MNIST dataset demonstrate that the proposed method achieves faster synthesis and enhanced learning of inherently structured distributions. Compared to benchmark structure-agnostic latent SGMs, LNDSM consistently attains superior sample quality and variability.
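The "approximate Gaussian transition induced by the Euler-Maruyama scheme" that the abstract leans on is a standard fact (stated here as background): for a nonlinear forward SDE dx_t = f(x_t, t)\,dt + g(t)\,dw_t, one discretization step of size \Delta t gives

p(x_{t+\Delta t} \mid x_t) \;\approx\; \mathcal{N}\!\big( x_{t+\Delta t};\; x_t + f(x_t, t)\,\Delta t,\; g(t)^2\, \Delta t\, \mathbf{I} \big),

which is what keeps the cross-entropy term tractable even when f is nonlinear.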


Fusing Foveal Fixations Using Linear Retinal Transformations and Bayesian Experimental Design

Williams, Christopher K. I.

arXiv.org Artificial Intelligence

Humans (and many vertebrates) face the problem of fusing together multiple fixations of a scene in order to obtain a representation of the whole, where each fixation uses a high-resolution fovea and decreasing resolution in the periphery. In this paper we explicitly represent the retinal transformation of a fixation as a linear downsampling of a high-resolution latent image of the scene, exploiting the known geometry. This linear transformation allows us to carry out exact inference for the latent variables in factor analysis (FA) and mixtures of FA models of the scene. Further, this allows us to formulate and solve the choice of "where to look next" as a Bayesian experimental design problem using the Expected Information Gain criterion. Experiments on the Frey faces and MNIST datasets demonstrate the effectiveness of our models.
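A minimal sketch of the exact inference the abstract describes, assuming a factor analysis model x = W z + mu + eps over a high-resolution latent image x and a known linear retinal transform R, so that a fixation is y = R x + noise. All shapes and the isotropic measurement-noise assumption are illustrative, not the paper's exact setup:

```python
# Exact Gaussian posterior over FA latents from one linearly downsampled fixation.
import numpy as np

def fa_posterior(y, R, W, mu, psi, sigma2):
    """p(z | y) for y = R(W z + mu + eps_x) + eps_y, with z ~ N(0, I).

    W: (D, K) factor loadings, mu: (D,) mean image,
    psi: (D,) diagonal FA noise variances, sigma2: measurement noise variance.
    """
    A = R @ W                                        # effective (M, K) loading
    # Total observation noise: projected FA noise plus measurement noise.
    C = R @ np.diag(psi) @ R.T + sigma2 * np.eye(R.shape[0])
    C_inv = np.linalg.inv(C)
    prec = np.eye(W.shape[1]) + A.T @ C_inv @ A      # posterior precision
    cov = np.linalg.inv(prec)
    mean = cov @ A.T @ C_inv @ (y - R @ mu)
    return mean, cov
```

Because the posterior stays Gaussian, quantities such as the Expected Information Gain for a candidate next fixation reduce to log-determinant comparisons of posterior covariances.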


Proximal Approximate Inference in State-Space Models

Abdulsamad, Hany, García-Fernández, Ángel F., Särkkä, Simo

arXiv.org Artificial Intelligence

We present a class of algorithms for state estimation in nonlinear, non-Gaussian state-space models. Our approach is based on a variational Lagrangian formulation that casts Bayesian inference as a sequence of entropic trust-region updates subject to dynamic constraints. This framework gives rise to a family of forward-backward algorithms, whose structure is determined by the chosen factorization of the variational posterior. By focusing on Gauss–Markov approximations, we derive recursive schemes with favorable computational complexity. For general nonlinear, non-Gaussian models we close the recursions using generalized statistical linear regression and Fourier–Hermite moment matching.
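The "entropic trust-region updates" can be illustrated with a generic KL-proximal step from proximal variational inference (an illustrative form; the paper's Lagrangian and its dynamic constraints are more specific):

q_{k+1} \;=\; \operatorname*{arg\,max}_{q}\; \mathcal{L}(q) \;-\; \frac{1}{\eta_k}\, \mathbb{D}_{\mathrm{KL}}\big[\, q \,\|\, q_{k} \,\big],
\qquad
\mathcal{L}(q) \;=\; \mathbb{E}_{q}\big[\log p(x_{0:T}, y_{1:T}) - \log q(x_{0:T})\big],

where \eta_k controls the trust-region size and restricting q to the Gauss–Markov family yields the forward-backward recursions mentioned above.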


Discovering Preference Optimization Algorithms with and for Large Language Models

Chris Lu

Neural Information Processing Systems

Typically, preference optimization is approached as an offline supervised learning task using manually crafted convex loss functions. While these methods are based on theoretical insights, they are inherently constrained by human creativity, so the large search space of possible loss functions remains under-explored.
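To make "manually crafted convex loss functions" concrete, here is the widely used DPO log-sigmoid preference loss as one standard example; it is illustrative background, not the objective discovered in the paper:

```python
# One hand-designed convex preference loss: DPO's log-sigmoid loss over the
# policy/reference log-ratio margin between chosen (w) and rejected (l) responses.
import torch
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta: float = 0.1):
    """logp_*: summed token log-probs of chosen/rejected responses."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -F.logsigmoid(beta * margin).mean()
```

Searching over functions of this margin, rather than fixing one by hand, is the under-explored space the abstract refers to.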


Accommodate Knowledge Conflicts in Retrieval-augmented LLMs: Towards Robust Response Generation in the Wild

Wang, Jiatai, Xu, Zhiwei, Jin, Di, Yang, Xuewen, Li, Tao

arXiv.org Artificial Intelligence

The proliferation of large language models (LLMs) has significantly advanced intelligent systems. Unfortunately, LLMs often face knowledge conflicts between internal memory and retrieved external information, arising from misinformation, biases, or outdated knowledge. These conflicts undermine response reliability and introduce uncertainty into decision-making. In this work, we analyze how LLMs navigate knowledge conflicts from an information-theoretic perspective and reveal that when conflicting and supplementary information differ markedly, LLMs confidently resolve their preference and the uncertainty in their response generation is alleviated; when the difference is ambiguous, LLMs experience considerable uncertainty about their generation. Based on this insight, we propose Swin-VIB, a novel framework that integrates a pipeline of variational information bottleneck models to adapt to differences in the retrieved information, facilitating robust response generation by LLMs even in conflicting contexts. Extensive experiments confirm our theoretical analysis and demonstrate the performance of Swin-VIB. Notably, Swin-VIB outperforms all competitive baselines on multiple-choice accuracy, while improving EM values on the open-ended QA task by at least 11.14%.
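For reference, the generic variational information bottleneck bound that such models optimize (the standard VIB form; Swin-VIB's exact instantiation is not given in the abstract):

I(Z; Y) \;-\; \beta\, I(Z; X) \;\;\ge\;\; \mathbb{E}_{p(x, y)}\, \mathbb{E}_{q_\theta(z \mid x)}\big[\log q_\theta(y \mid z)\big] \;-\; \beta\, \mathbb{E}_{p(x)}\, \mathbb{D}_{\mathrm{KL}}\big[q_\theta(z \mid x) \,\|\, r(z)\big] \;+\; \mathrm{const},

where r(z) is a variational prior and \beta trades compression of the (retrieved) input against predictive sufficiency for the answer.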


f9b9f0fef2274a6b7009b5d52f44a3b6-AuthorFeedback.pdf

Neural Information Processing Systems

The fundamental difference is between "many to one" [...] Figure 1 shows example generations from the model trained on ChEMBL. We actually did run an RL baseline (Eq. [...]). We discuss the work of Norouzi et al. [2016] in detail in Section 3.3. They also do not use the entropy term in training, only to motivate derivations.


A Algorithmic details A.1 Problem formulation for model repairment

Neural Information Processing Systems

This section extends the discussion on the problem formulation of model repairment. There are further considerations for executing and evaluating model repairment in practice. Generalisation and efficiency of model repairment: in practice the number of failure cases might be large or even infinite; for example, a model that fails on an input corrupted by Gaussian noise may also fail on all other inputs with that level of noise. One type of mistake might incur more costs than others (e.g., in medical applications, false [...]). Instead we present a "predictive" approach that removes this computational burden.