Belief Revision
On Uncertainty in Deep State Space Models for Model-Based Reinforcement Learning
Becker, Philipp, Neumann, Gerhard
Improved state space models, such as Recurrent State Space Models (RSSMs), are a key factor behind recent advances in model-based reinforcement learning (RL). Yet, despite their empirical success, many of the underlying design choices are not well understood. We show that RSSMs use a suboptimal inference scheme and that models trained using this inference overestimate the aleatoric uncertainty of the ground truth system. We find this overestimation implicitly regularizes RSSMs and allows them to succeed in model-based RL. We postulate that this implicit regularization fulfills the same functionality as explicitly modeling epistemic uncertainty, which is crucial for many other model-based RL approaches. Yet, overestimating aleatoric uncertainty can also impair performance in cases where accurately estimating it matters, e.g., when we have to deal with occlusions, missing observations, or fusing sensor modalities at different frequencies. Moreover, the implicit regularization is a side-effect of the inference scheme and not the result of a rigorous, principled formulation, which renders analyzing or improving RSSMs difficult. Thus, we propose an alternative approach building on well-understood components for modeling aleatoric and epistemic uncertainty, dubbed Variational Recurrent Kalman Network (VRKN). This approach uses Kalman updates for exact smoothing inference in a latent space and Monte Carlo Dropout to model epistemic uncertainty. Due to the Kalman updates, the VRKN can naturally handle missing observations or sensor fusion problems with varying numbers of observations per time step. Our experiments show that using the VRKN instead of the RSSM improves performance in tasks where appropriately capturing aleatoric uncertainty is crucial while matching it in the deterministic standard benchmarks.
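The claim that Kalman updates handle missing observations and a varying number of observations per time step can be illustrated with a minimal scalar sketch. This is a textbook Kalman measurement update, not the VRKN itself: the function name `kalman_update`, the scalar latent state, and the identity observation model are assumptions made purely for illustration.

```python
def kalman_update(mean, var, observations):
    """Fuse zero or more noisy observations into a scalar Gaussian belief.

    `observations` is a list of (value, noise_variance) pairs. The pairs are
    assumed to be independent measurements of the same latent scalar
    (identity observation model); this is an illustrative assumption.
    """
    for value, noise_var in observations:
        gain = var / (var + noise_var)        # Kalman gain for this measurement
        mean = mean + gain * (value - mean)   # shift the mean toward the measurement
        var = (1.0 - gain) * var              # each measurement reduces uncertainty
    return mean, var


# Missing observation: the belief is returned unchanged.
prior_only = kalman_update(0.0, 1.0, [])

# Two sensors reporting at the same time step are fused one after the other.
fused = kalman_update(0.0, 1.0, [(2.0, 1.0), (2.0, 1.0)])
```

With an empty list the loop body never runs, so a missing observation needs no special case. Extra sensors simply add iterations, which is the property the abstract attributes to the Kalman updates.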
Probabilistic modeling of rational communication with conditionals
Grusdt, Britta, Lassiter, Daniel, Franke, Michael
While a large body of work has scrutinized the meaning of conditional sentences, considerably less attention has been paid to formal models of their pragmatic use and interpretation. Here, we take a probabilistic approach to pragmatic reasoning about indicative conditionals which flexibly integrates gradient beliefs about richly structured world states. We model listeners' update of their prior beliefs about the causal structure of the world and the joint probabilities of the consequent and antecedent based on assumptions about the speaker's utterance production protocol. We show that, when supplied with natural contextual assumptions, our model uniformly explains a number of inferences attested in the literature, including epistemic inferences, conditional perfection and the dependency between antecedent and consequent of a conditional. We argue that this approach also helps explain three puzzles introduced by Douven (2012) about updating with conditionals: depending on the utterance context, the listener's belief in the antecedent may increase, decrease or remain unchanged.
Belief propagation generalizes backpropagation
The two most important algorithms in artificial intelligence are backpropagation and belief propagation. In spite of their importance, the connection between them is poorly characterized. We show that when an input to backpropagation is converted into an input to belief propagation so that (loopy) belief propagation can be run on it, then the result of belief propagation encodes the result of backpropagation; thus backpropagation is recovered as a special case of belief propagation. In other words, we prove for apparently the first time that belief propagation generalizes backpropagation. Our analysis is a theoretical contribution, which we motivate with the expectation that it might reconcile our understandings of each of these algorithms, and serve as a guide to engineering researchers seeking to improve the behavior of systems that use one or the other.
A Doxastic Characterisation of Autonomous Decisive Systems
A highly autonomous system (HAS) has to assess the situation it is in and derive beliefs, based on which it decides what to do next. The beliefs are not solely based on the observations the HAS has made so far, but also on general insights about the world in which the HAS operates. These insights have either been built into the HAS during design or are provided by trusted sources during its mission. Although its beliefs may be imprecise and might bear flaws, the HAS will have to extrapolate the possible futures in order to evaluate the consequences of its actions and then take its decisions autonomously. In this paper, we formalize an autonomous decisive system as a system that always chooses actions that it currently believes are the best. We show that it can be checked whether an autonomous decisive system can be built given an application domain, the dynamically changing knowledge base and a list of LTL mission goals. Moreover, we can synthesize a belief formation for an autonomous decisive system. For the formal characterization, we use a doxastic framework for safety-critical HASs where the belief formation supports the HAS's extrapolation.
Deep Attentive Belief Propagation: Integrating Reasoning and Learning for Solving Constraint Optimization Problems
Deng, Yanchen, Kong, Shufeng, Liu, Caihua, An, Bo
Belief Propagation (BP) is an important message-passing algorithm for various reasoning tasks over graphical models, including solving the Constraint Optimization Problems (COPs). It has been shown that BP can achieve state-of-the-art performance on various benchmarks by mixing old and new messages before sending the new one, i.e., damping. However, existing methods of tuning a static damping factor for BP not only are laborious but also harm their performance. Moreover, existing BP algorithms treat each variable node's neighbors equally when composing a new message, which also limits their exploration ability. To address these issues, we seamlessly integrate BP, Gated Recurrent Units (GRUs), and Graph Attention Networks (GATs) within the message-passing framework to reason about dynamic weights and damping factors for composing new BP messages. Our model, Deep Attentive Belief Propagation (DABP), takes the factor graph and the BP messages in each iteration as the input and infers the optimal weights and damping factors through GRUs and GATs, followed by a multi-head attention layer. Furthermore, unlike existing neural-based BP variants, we propose a novel self-supervised learning algorithm for DABP with a smoothed solution cost, which does not require expensive training labels and also avoids the common out-of-distribution issue through efficient online learning. Extensive experiments show that our model significantly outperforms state-of-the-art baselines.
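The damping step described in the abstract above, mixing old and new messages before sending the new one, can be sketched as follows. This shows only the static damping that the abstract criticizes, not DABP's learned weights. The function name `damp_message` and the convention that the damping factor weights the old message are assumptions for illustration.

```python
def damp_message(old_msg, new_msg, damping):
    """Return a convex combination of the previous and freshly computed message.

    `damping` is assumed to lie in [0, 1] and to weight the OLD message:
    0.0 sends the new message unchanged, 1.0 keeps the old message forever.
    """
    if not 0.0 <= damping <= 1.0:
        raise ValueError("damping must be in [0, 1]")
    return [damping * o + (1.0 - damping) * n for o, n in zip(old_msg, new_msg)]


# One message over a binary variable, damped with a fixed factor.
sent = damp_message([0.0, 1.0], [1.0, 0.0], 0.25)
```

A single fixed `damping` value has to be tuned per problem. This is the limitation the abstract addresses by inferring damping factors dynamically.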
On resolving conflicts between arguments
Argument systems are based on the idea that one can construct arguments for propositions, i.e., structured reasons justifying the belief in a proposition. Because arguments are built from defeasible rules, they need not be valid in all circumstances; it might therefore be possible to construct an argument for a proposition as well as for its negation. When arguments support conflicting propositions, one of the arguments must be defeated, which raises the question: which (sub-)arguments can be subject to defeat? In legal argumentation, meta-rules determine the valid arguments by considering the last defeasible rule of each argument involved in a conflict. Since it is easier to evaluate arguments using their last rules, can a conflict be resolved by considering only the last defeasible rules of the arguments involved? We propose a new argument system where, instead of deriving a defeat relation between arguments, undercutting-arguments for the defeat of defeasible rules are constructed. This system allows us (i) to resolve conflicts (a generalization of rebutting arguments) using only the last rules of the arguments for inconsistencies, (ii) to determine a set of valid (undefeated) arguments in linear time using an algorithm based on a JTMS, (iii) to establish a relation with Default Logic, and (iv) to prove closure properties such as cumulativity. We also propose an extension of the argument system that enables reasoning by cases.
Belief Revision based Caption Re-ranker with Visual Semantic Information
Sabir, Ahmed, Moreno-Noguer, Francesc, Madhyastha, Pranava, Padró, Lluís
In this work, we focus on improving the captions generated by image-caption generation systems. We propose a novel re-ranking approach that leverages visual-semantic measures to identify the ideal caption that maximally captures the visual information in the image. Our re-ranker utilizes the Belief Revision framework (Blok et al., 2003) to calibrate the original likelihood of the top-n captions by explicitly exploiting the semantic relatedness between the depicted caption and the visual context. Our experiments demonstrate the utility of our approach, where we observe that our re-ranker can enhance the performance of a typical image-captioning system without the necessity of any additional training or fine-tuning.
MIntRec: A New Dataset for Multimodal Intent Recognition
Zhang, Hanlei, Xu, Hua, Wang, Xin, Zhou, Qianrui, Zhao, Shaojie, Teng, Jiayan
Multimodal intent recognition is a significant task for understanding human language in real-world multimodal scenes. Most existing intent recognition methods have limitations in leveraging the multimodal information due to the restrictions of the benchmark datasets with only text information. This paper introduces a novel dataset for multimodal intent recognition (MIntRec) to address this issue. It formulates coarse-grained and fine-grained intent taxonomies based on the data collected from the TV series Superstore. The dataset consists of 2,224 high-quality samples with text, video, and audio modalities and has multimodal annotations among twenty intent categories. Furthermore, we provide annotated bounding boxes of speakers in each video segment and develop an automatic process for speaker annotation. MIntRec helps researchers mine relationships between different modalities to enhance the capability of intent recognition. We extract features from each modality and model cross-modal interactions by adapting three powerful multimodal fusion methods to build baselines. Extensive experiments show that employing the non-verbal modalities achieves substantial improvements compared with the text-only modality, demonstrating the effectiveness of using multimodal information for intent recognition. The gap between the best-performing methods and humans indicates the challenge and importance of this task for the community. The full dataset and code are available at https://github.com/thuiar/MIntRec.
Understanding the Behavior of Belief Propagation
Probabilistic graphical models are a powerful concept for modeling high-dimensional distributions. Besides modeling distributions, probabilistic graphical models also provide an elegant framework for performing statistical inference; because of the high-dimensional nature, however, one must often use approximate methods for this purpose. Belief propagation performs approximate inference, is efficient, and has a long history of success. Yet, in most cases, belief propagation lacks any performance and convergence guarantees. Many realistic problems, however, are represented by graphical models with loops, in which case belief propagation is guaranteed neither to provide accurate estimates nor to converge at all. This thesis investigates how the model parameters influence the performance of belief propagation. We are particularly interested in their influence on (i) the number of fixed points, (ii) the convergence properties, and (iii) the approximation quality.
A taxonomy of surprise definitions
Modirshanechi, Alireza, Brea, Johanni, Gerstner, Wulfram
Surprising events trigger measurable brain activity and influence human behavior by affecting learning, memory, and decision-making. Currently there is, however, no consensus on the definition of surprise. Here we identify 18 mathematical definitions of surprise in a unifying framework. We first propose a technical classification of these definitions into three groups based on their dependence on an agent's belief, show how they relate to each other, and prove under what conditions they are indistinguishable. Going beyond this technical analysis, we propose a taxonomy of surprise definitions and classify them into four conceptual categories based on the quantity they measure: (i) 'prediction surprise' measures a mismatch between a prediction and an observation; (ii) 'change-point detection surprise' measures the probability of a change in the environment; (iii) 'confidence-corrected surprise' explicitly accounts for the effect of confidence; and (iv) 'information gain surprise' measures the belief-update upon a new observation. The taxonomy lays the foundation for principled studies of the functional roles and physiological signatures of surprise in the brain.
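Two of the categories named in the abstract above can be made concrete with a small sketch over a discrete belief. The abstract gives no formulas, so the choices here are assumptions. Negative log marginal likelihood (Shannon surprise) stands in for 'prediction surprise'. The KL divergence of the posterior from the prior stands in for 'information gain surprise', and the direction of the divergence is also an assumption. The function names are hypothetical.

```python
import math


def posterior(prior, likelihoods):
    """Bayes' rule over a finite hypothesis set; also returns the evidence."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    evidence = sum(joint)  # marginal probability of the observation
    return [j / evidence for j in joint], evidence


def shannon_surprise(prior, likelihoods):
    """Prediction-style surprise: how improbable the observation was."""
    _, evidence = posterior(prior, likelihoods)
    return -math.log(evidence)


def belief_update_kl(prior, likelihoods):
    """Information-gain-style surprise: KL(posterior || prior), in nats."""
    post, _ = posterior(prior, likelihoods)
    return sum(q * math.log(q / p) for q, p in zip(post, prior) if q > 0.0)


# Two hypotheses, equally likely a priori; the observation favors the first.
prior = [0.5, 0.5]
likelihoods = [0.9, 0.1]
s_pred = shannon_surprise(prior, likelihoods)
s_gain = belief_update_kl(prior, likelihoods)
```

The two quantities can diverge. An observation that is equally likely under every hypothesis leaves the belief unchanged, so `belief_update_kl` is zero, while `shannon_surprise` can still be large if that observation was improbable overall.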