Uncertainty
Multigrid with rough coefficients and Multiresolution operator decomposition from Hierarchical Information Games
We introduce a near-linear complexity (geometric and meshless/algebraic) multigrid/multiresolution method for PDEs with rough ($L^\infty$) coefficients with rigorous a-priori accuracy and performance estimates. The method is discovered through a decision/game theory formulation of the problems of (1) identifying restriction and interpolation operators, (2) recovering a signal from incomplete measurements based on norm constraints on its image under a linear operator, and (3) gambling on the value of the solution of the PDE based on a hierarchy of nested measurements of its solution or source term. The resulting elementary gambles form a hierarchy of (deterministic) basis functions of $H^1_0(\Omega)$ (gamblets) that (1) are orthogonal across subscales/subbands with respect to the scalar product induced by the energy norm of the PDE, (2) enable sparse compression of the solution space in $H^1_0(\Omega)$, and (3) induce an orthogonal multiresolution operator decomposition. The operating diagram of the multigrid method is that of an inverted pyramid in which gamblets are computed locally (by virtue of their exponential decay) and hierarchically (from fine to coarse scales), and the PDE is decomposed into a hierarchy of independent linear systems with uniformly bounded condition numbers. The resulting algorithm is parallelizable both in space (via localization) and in bandwidth/subscale (subscales can be computed independently from each other). Although the method is deterministic, it has a natural Bayesian interpretation under the measure of probability emerging (as a mixed strategy) from the information game formulation, and multiresolution approximations form a martingale with respect to the filtration induced by the hierarchy of nested measurements.
Network Maximal Correlation
Feizi, Soheil, Makhdoumi, Ali, Duffy, Ken, Medard, Muriel, Kellis, Manolis
We introduce Network Maximal Correlation (NMC) as a multivariate measure of nonlinear association among random variables. NMC is defined via an optimization that infers transformations of variables by maximizing aggregate inner products between transformed variables. For finite discrete and jointly Gaussian random variables, we characterize a solution of the NMC optimization using basis expansion of functions over appropriate basis functions. For finite discrete variables, we propose an algorithm based on alternating conditional expectation to determine NMC. Moreover, we propose a distributed algorithm to compute an approximation of NMC for large and dense graphs using graph partitioning. For finite discrete variables, we show that the probability of a discrepancy greater than any given level between NMC and NMC computed using empirical distributions decays exponentially fast as the sample size grows. For jointly Gaussian variables, we show that under some conditions the NMC optimization is an instance of the Max-Cut problem. We then illustrate an application of NMC in the inference of graphical models for bijective functions of jointly Gaussian variables. Finally, we show NMC's utility in a data application of learning nonlinear dependencies among genes in a cancer dataset.
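As an illustration of the alternating conditional expectation idea mentioned above, the following is a minimal sketch for the two-variable special case only (classical pairwise maximal correlation), not the NMC algorithm of the paper. The function name `maximal_correlation_ace`, the list-of-lists joint distribution format, and the fixed iteration count are assumptions made for this example.

```python
import math


def maximal_correlation_ace(joint, iters=200):
    """Estimate the maximal correlation of two finite discrete variables.

    joint[i][j] is P(X=i, Y=j); every row and column must have positive mass.
    This is a two-variable sketch, not the network-wide NMC optimization.
    """
    nx, ny = len(joint), len(joint[0])
    px = [sum(joint[i]) for i in range(nx)]
    py = [sum(joint[i][j] for i in range(nx)) for j in range(ny)]

    def standardize(values, probs):
        # Rescale to zero mean and unit variance under the given marginal.
        mean = sum(v * p for v, p in zip(values, probs))
        centered = [v - mean for v in values]
        std = math.sqrt(sum(c * c * p for c, p in zip(centered, probs)))
        if std < 1e-15:
            # Degenerate (constant) function, e.g. for independent variables.
            return [0.0] * len(values)
        return [c / std for c in centered]

    # Arbitrary non-constant starting transformation of Y.
    g = standardize([float(j) for j in range(ny)], py)
    f = [0.0] * nx
    for _ in range(iters):
        # f(x) = E[g(Y) | X = x], then standardize.
        f = standardize(
            [sum(joint[i][j] * g[j] for j in range(ny)) / px[i] for i in range(nx)],
            px,
        )
        # g(y) = E[f(X) | Y = y], then standardize.
        g = standardize(
            [sum(joint[i][j] * f[i] for i in range(nx)) / py[j] for j in range(ny)],
            py,
        )
    # Correlation between the transformed variables f(X) and g(Y).
    return sum(joint[i][j] * f[i] * g[j] for i in range(nx) for j in range(ny))
```

For a binary symmetric channel with crossover probability 0.1, i.e. `joint = [[0.45, 0.05], [0.05, 0.45]]`, the sketch returns 0.8, which matches the known closed form |1 - 2p| for that case.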
Cooperative Training of Descriptor and Generator Networks
Xie, Jianwen, Lu, Yang, Gao, Ruiqi, Zhu, Song-Chun, Wu, Ying Nian
This paper studies the cooperative training of two probabilistic models of signals such as images. Both models are parametrized by convolutional neural networks (ConvNets). The first network is a descriptor network, which is an exponential family model or an energy-based model, whose feature statistics or energy function are defined by a bottom-up ConvNet, which maps the observed signal to the feature statistics. The second network is a generator network, which is a non-linear version of factor analysis. It is defined by a top-down ConvNet, which maps the latent factors to the observed signal. The maximum likelihood training algorithms of both the descriptor net and the generator net are in the form of alternating back-propagation, and both algorithms involve Langevin sampling. We observe that the two training algorithms can cooperate with each other by jumpstarting each other's Langevin sampling, and they can be naturally and seamlessly interwoven into a CoopNets algorithm that can train both nets simultaneously.
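The Langevin sampling step that both training algorithms rely on can be sketched in one dimension as follows. This is a hedged illustration, not the CoopNets implementation: `langevin_chain`, the scalar state, and the quadratic energy in the usage note are assumptions chosen so the example stays self-contained. In the cooperative setting described above, the starting point `x0` would be supplied by the other network rather than chosen by hand.

```python
import random


def langevin_chain(grad_energy, x0, step, n_steps, seed=0):
    """Run unadjusted Langevin dynamics on a scalar state.

    grad_energy(x) returns the derivative of the energy U at x.
    The target density is proportional to exp(-U(x)).
    Returns the list of visited states (one per step).
    """
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_steps):
        # Gradient step on the energy plus Gaussian noise scaled by the step size.
        x = x - 0.5 * step * step * grad_energy(x) + step * rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain
```

With the hypothetical energy U(x) = x^2 / 2 (so `grad_energy = lambda x: x`), the chain drifts from any starting point toward a roughly standard normal distribution. A starting point close to the target shortens the burn-in, which is the intuition behind letting one network jumpstart the other's sampler.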
Bayesian inference is a way to get sharper predictions from your data. It's particularly useful when you don't have as much data as you would like and want to juice every last bit of predictive strength from it. Although it is sometimes described with reverence, Bayesian inference isn't magic or mystical. And even though the math under the hood can get dense, the concepts behind it are completely accessible. In brief, Bayesian inference lets you draw stronger conclusions from your data by folding in what you already know about the answer. Bayesian inference is based on the ideas of Thomas Bayes, a nonconformist Presbyterian minister in London about 300 years ago. He wrote two books, one on theology and one on probability. His work included his now-famous Bayes' Theorem in raw form, which has since been applied to the problem of inference, the technical term for educated guessing. The popularity of Bayes' ideas was aided immeasurably by another minister, Richard Price. He saw their significance, refined them and published them. It would be more accurate and historically just to call Bayes' Theorem the Bayes-Price Rule.
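To make "folding in what you already know" concrete, here is a small sketch of a single Bayes' Theorem update over a discrete set of hypotheses. The function name `posterior` and the coin example are assumptions made purely for illustration.

```python
def posterior(prior, likelihood):
    """Apply Bayes' Theorem to a discrete set of hypotheses.

    prior[h] is the belief in hypothesis h before seeing the data;
    likelihood[h] is the probability of the observed data under h.
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical example: a coin is probably fair (90% prior belief),
# but might be biased toward heads. We then observe a single head.
prior = {"fair": 0.9, "biased": 0.1}
likelihood_of_heads = {"fair": 0.5, "biased": 0.8}
updated = posterior(prior, likelihood_of_heads)
```

Here the belief that the coin is biased rises from 0.10 to about 0.151 after one head. The prior keeps a single observation from swinging the conclusion too far, which is why the approach helps most when data is scarce.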
Model-based Classification and Novelty Detection For Point Pattern Data
Vo, Ba-Ngu, Tran, Quang N., Phung, Dinh, Vo, Ba-Tuong
Point patterns are sets or multi-sets of unordered elements that can be found in numerous data sources. However, in data analysis tasks such as classification and novelty detection, appropriate statistical models for point pattern data have not received much attention. This paper proposes the modelling of point pattern data via random finite sets (RFS). In particular, we propose appropriate likelihood functions, and a maximum likelihood estimator for learning a tractable family of RFS models. In novelty detection, we propose novel ranking functions based on RFS models, which substantially improve performance.
Coresets for Scalable Bayesian Logistic Regression
Huggins, Jonathan H., Campbell, Trevor, Broderick, Tamara
The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. Standard Bayesian inference algorithms are computationally expensive, however, making their direct application to large datasets difficult or infeasible. Recent work on scaling Bayesian inference has focused on modifying the underlying algorithms to, for example, use only a random data subsample at each iteration. We leverage the insight that data is often redundant to instead obtain a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We can then use this small coreset in any number of existing posterior inference algorithms without modification. In this paper, we develop an efficient coreset construction algorithm for Bayesian logistic regression models. We provide theoretical guarantees on the size and approximation quality of the coreset -- both for fixed, known datasets, and in expectation for a wide class of data generative models. Crucially, the proposed approach also permits efficient construction of the coreset in both streaming and parallel settings, with minimal additional effort. We demonstrate the efficacy of our approach on a number of synthetic and real-world datasets, and find that, in practice, the size of the coreset is independent of the original dataset size. Furthermore, constructing the coreset takes a negligible amount of time compared to that required to run MCMC on it.
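The idea of replacing the full dataset with a small weighted subset can be sketched as follows. This is a generic importance-sampling illustration under stated assumptions, not the construction from the paper: the `scores` argument is a caller-supplied placeholder for per-point importance, whereas the paper derives specific bounds that this sketch does not reproduce. The function names and the label convention y in {-1, +1} are also assumptions.

```python
import math
import random


def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))


def weighted_log_likelihood(theta, points, weights):
    """Logistic-regression log-likelihood where each point carries a weight."""
    total = 0.0
    for (x, y), w in zip(points, weights):
        margin = y * sum(t * xi for t, xi in zip(theta, x))
        total += w * log_sigmoid(margin)
    return total


def sample_coreset(points, scores, m, seed=0):
    """Draw m points with probability proportional to `scores` (hypothetical
    importance values) and weight each draw by 1 / (m * p_i), so the weighted
    log-likelihood is an unbiased estimate of the full-data one."""
    rng = random.Random(seed)
    total = sum(scores)
    probs = [s / total for s in scores]
    draws = rng.choices(range(len(points)), weights=probs, k=m)
    weight_by_index = {}
    for i in draws:
        # Repeated draws of the same point accumulate weight.
        weight_by_index[i] = weight_by_index.get(i, 0.0) + 1.0 / (m * probs[i])
    indices = sorted(weight_by_index)
    return [points[i] for i in indices], [weight_by_index[i] for i in indices]
```

Because the output is just a list of points and weights, any inference routine that accepts a weighted likelihood could consume it unchanged, which is the property the abstract emphasizes.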
DMOZ - Computers: Artificial Intelligence: Companies
Includes profile, demo downloads, and job openings. Developer of software systems that solve resource optimization, planning, scheduling, and deployment problems for the air transportation, gaming, healthcare, hospitality, and security industries. Source for neural network based data modeling, prediction, forecasting and optimization solutions. Areas of focus include Banking and Finance, Manufacturing, Marketing, and Medical. Uses artificial-intelligence technologies to prevent fraud in transaction environments such as finance, e-commerce, telecommunications, and insurance.
Abstracting from Observation-Equivalent Entities in Human Behavior Modeling
Schröder, Max (University of Rostock) | Lüdtke, Stefan (University of Rostock) | Bader, Sebastian (University of Rostock) | Krüger, Frank (University of Rostock) | Kirste, Thomas (University of Rostock)
Recognizing human behavior from noisy and ambiguous sensor data is a prerequisite for many applications such as context-aware assistance. The sensor data, however, often do not make it possible to distinguish between multiple entities; e.g., a presence sensor cannot distinguish between two persons, i.e., both are observation-equivalent. Conventional algorithms, however, consider each of these entities separately during the inference of human behavior, leading to a high computational burden in scenarios where a large number of entities have to be considered. Therefore, these algorithms can only be applied to very limited scenarios. We analyzed the challenges appearing in these scenarios and found that considering observation-equivalent entities separately is one reason for the huge computational effort. Thus, we propose to exploit observation-equivalence by representing entities as a group and performing inference over these groups of entities. We sketch a mechanism that exploits observation-equivalences, which we call lifted probabilistic inference. To compare this approach with conventional inference approaches, we adapted an office scenario from the literature so that it parametrizes observation-equivalent entities and simulated a corresponding dataset. This dataset can be used as a benchmark for the evaluation of different inference approaches with respect to observation-equivalence. We compare the number of states that this approach and a conventional inference algorithm consider during inference on this benchmark dataset. On average, the conventional approach uses almost 200,000 states to cover the situations of the scenario during inference, whereas our lifted probabilistic inference approach uses fewer than 100 states. Thus, an observation-equivalent approach seems promising for more efficient inference in scenarios with many observation-equivalent entities.
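The source of the savings can be illustrated with a toy counting argument. The sketch below is an assumption-laden simplification, not the paper's mechanism or benchmark: it models a state as nothing more than the location of each person, and compares tracking every person individually against tracking only how many persons are in each location.

```python
from math import comb


def ground_state_count(n_entities, n_locations):
    # Each entity is tracked individually, so every assignment is distinct.
    return n_locations ** n_entities


def lifted_state_count(n_entities, n_locations):
    # Observation-equivalent entities are interchangeable, so only the
    # number of entities per location matters (a multiset count).
    return comb(n_entities + n_locations - 1, n_locations - 1)


# Hypothetical toy setting: 8 indistinguishable persons in 4 rooms.
ground = ground_state_count(8, 4)   # 65536
lifted = lifted_state_count(8, 4)   # 165
```

In this toy setting the grouped representation needs 165 states instead of 65,536. The figures reported in the abstract come from the authors' own office scenario and are not reproduced by this calculation.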
Initial State Prediction in Planning
Krivic, Senka (University of Innsbruck) | Cashmore, Michael (King's College London) | Ridder, Bram (King's College London) | Magazzeni, Daniele (King's College London) | Szedmak, Sandor (Aalto University) | Piater, Justus (University of Innsbruck)
While recent advances in offline reasoning techniques and online execution strategies have made planning under uncertainty more robust, the application of plans in partially-known environments is still a difficult and important topic. In this paper we present an approach for predicting new information about a partially-known initial state, represented as a multigraph utilizing Maximum-Margin Multi-Valued Regression. We evaluate this approach in four different domains, demonstrating high recall and accuracy.
What Does That ?-Block Do? Learning Latent Causal Affordances From Mario Play Traces
Summerville, Adam (University of California, Santa Cruz) | Behrooz, Morteza (University of California, Santa Cruz) | Mateas, Michael (University of California, Santa Cruz) | Jhala, Arnav (North Carolina State University)
Procedural content generation (PCG) for videogames relies on a commitment to the semantics of the game. Concepts such as enemies or solidity are required for the creation of levels for platformer games. As humans, we can instantly identify the underlying semantics of a game from brief snippets of game play video or from playing the game. Previous PCG systems have needed humans to identify the semantic properties of objects in the game, either implicitly or explicitly. We propose a system that can automatically learn the semantic properties of game objects by observation of events in the game via a causal learning framework. We apply this learning approach to play traces from the Super Mario Bros. series.