Asia
Learning High-Density Regions for a Generalized Kolmogorov-Smirnov Test in High-Dimensional Data
Glazer, Assaf, Lindenbaum, Michael, Markovitch, Shaul
We propose an efficient, generalized, nonparametric, statistical Kolmogorov-Smirnov test for detecting distributional change in high-dimensional data. To implement the test, we introduce a novel, hierarchical, minimum-volume sets estimator to represent the distributions to be tested. Our work is motivated by the need to detect changes in data streams, and the test is especially efficient in this context. We provide the theoretical foundations of our test and show its superiority over existing methods.
Decision Making in Complex Multiagent Contexts: A Tale of Two Frameworks
Doshi, Prashant J. (University of Georgia)
Decision making is a key feature of autonomous systems. It involves choosing optimally between different lines of action in various information contexts that range from perfectly knowing all aspects of the decision problem to having just partial knowledge about it. The physical context often includes other interacting autonomous systems, typically called agents. In this article, I focus on decision making in a multiagent context with partial information about the problem. Relevant research in this complex but realistic setting has converged around two complementary, general frameworks and also introduced myriad specializations on its way. I put the two frameworks, decentralized partially observable Markov decision process (Dec-POMDP) and the interactive partially observable Markov decision process (I-POMDP), in context and review the foundational algorithms for these frameworks, while briefly discussing the advances in their specializations. I conclude by examining the avenues that research pertaining to these frameworks is pursuing.
The Answer Set Programming Competition
Calimeri, Francesco (Universita') | Ianni, Giovambattista (della Calabria) | Krennwallner, Thomas (Universita') | Ricca, Francesco (della Calabria)
The Answer Set Programming (ASP) Competition is a biannual event for evaluating declarative knowledge representation systems on hard and demanding AI problems. The competition consists of two main tracks: the ASP system track and the model and solve track. The traditional system track compares dedicated answer set solvers on ASP benchmarks, while the model and solve track invites any researcher and developer of declarative knowledge representation systems to participate in an open challenge for solving sophisticated AI problems with their tools of choice. This article provides an overview of the ASP competition series, reviews its origins and history, giving insights on organizing and running such an elaborate event, and briefly discusses about the lessons learned so far.
Tractable Set Constraints
Many fundamental problems in artificial intelligence, knowledge representation, and verification involve reasoning about sets and relations between sets and can be modeled as set constraint satisfaction problems (set CSPs). Such problems are frequently intractable, but there are several important set CSPs that are known to be polynomial-time tractable. We introduce a large class of set CSPs that can be solved in quadratic time. Our class, which we call EI, contains all previously known tractable set CSPs, but also some new ones that are of crucial importance for example in description logics. The class of EI set constraints has an elegant universal-algebraic characterization, which we use to show that every set constraint language that properly contains all EI set constraints already has a finite sublanguage with an NP-hard constraint satisfaction problem.
Evaluating Indirect Strategies for Chinese-Spanish Statistical Machine Translation
Costa-jussà, M. R., Henríquez, C. A., Banchs, R. E.
Although, Chinese and Spanish are two of the most spoken languages in the world, not much research has been done in machine translation for this language pair. This paper focuses on investigating the state-of-the-art of Chinese-to-Spanish statistical machine translation (Smt), which nowadays is one of the most popular approaches to machine translation. For this purpose, we report details of the available parallel corpus which are Basic Traveller Expressions Corpus (Btec), Holy Bible and United Nations (Un). Additionally, we conduct experimental work with the largest of these three corpora to explore alternative Smt strategies by means of using a pivot language. Three alternatives are considered for pivoting: cascading, pseudo-corpus and triangulation. As pivot language, we use either English, Arabic or French. Results show that, for a phrase-based Smt system, English is the best pivot language between Chinese and Spanish. We propose a system output combination using the pivot strategies which is capable of outperforming the direct translation strategy. The main objective of this work is motivating and involving the research community to work in this important pair of languages given their demographic impact.
Density Propagation and Improved Bounds on the Partition Function
Ermon, Stefano, Sabharwal, Ashish, Selman, Bart, Gomes, Carla P.
Given a probabilistic graphical model, its density of states is a distribution that, for any likelihood value, gives the number of configurations with that probability. Weintroduce a novel message-passing algorithm called Density Propagation (DP) for estimating this distribution. We show that DP is exact for tree-structured graphical models and is, in general, a strict generalization of both sum-product and max-product algorithms. Further, we use density of states and tree decomposition to introduce a new family of upper and lower bounds on the partition function. For any tree decomposition, the new upper bound based on finer-grained density of state information is provably at least as tight as previously known bounds based on convexity of the log-partition function, and strictly stronger if a general condition holds.We conclude with empirical evidence of improvement over convex relaxations and mean-field based bounds.
Discriminative Learning of Sum-Product Networks
Sum-product networks are a new deep architecture that can perform fast, exact inference on high-treewidth models. Only generative methods for training SPNs have been proposed to date. In this paper, we present the first discriminative training algorithms for SPNs, combining the high accuracy of the former with the representational power and tractability of the latter. We show that the class of tractable discriminative SPNs is broader than the class of tractable generative ones, and propose an efficient backpropagation-style algorithm for computing the gradient of the conditional log likelihood. Standard gradient descent suffers from the diffusion problem, but networks with many layers can be learned reliably using "hard" gradient descent, where marginal inference is replaced by MPE inference (i.e., inferring the most probable state of the non-evidence variables). The resulting updates have a simple and intuitive form. We test discriminative SPNs on standard image classification tasks. We obtain the best results to date on the CIFAR-10 dataset, using fewer features than prior methods with an SPN architecture that learns local image structure discriminatively. We also report the highest published test accuracy on STL-10 even though we only use the labeled portion of the dataset.
Risk Aversion in Markov Decision Processes via Near Optimal Chernoff Bounds
Moldovan, Teodor M., Abbeel, Pieter
The expected return is a widely used objective in decision making under uncertainty. Many algorithms, such as value iteration, have been proposed to optimize it. In risk-aware settings, however, the expected return is often not an appropriate objective to optimize. We propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties. We also draw connections to previously proposed objectives for risk-aware planing: minmax, exponential utility, percentile and mean minus variance. Our method applies to an extended class of Markov decision processes: we allow costs to be stochastic as long as they are bounded. Additionally, we present an efficient algorithm for optimizing the proposed objective. Synthetic and real-world experiments illustrate the effectiveness of our method, at scale.
Submodular-Bregman and the Lovász-Bregman Divergences with Applications
Iyer, Rishabh, Bilmes, Jeff A.
We introduce a class of discrete divergences on sets (equivalently binary vectors) that we call the submodular-Bregman divergences. We consider two kinds, defined either from tight modular upper or tight modular lower bounds of a submodular function. We show that the properties of these divergences are analogous to the (standard continuous) Bregman divergence. We demonstrate how they generalize many useful divergences, including the weighted Hamming distance, squared weighted Hamming, weighted precision, recall, conditional mutual information, and a generalized KL-divergence on sets. We also show that the generalized Bregman divergence on the Lovász extension of a submodular function, which we call the Lovász-Bregman divergence, is a continuous extension of a submodular Bregman divergence. We point out a number of applications, and in particular show that a proximal algorithm defined through the submodular Bregman divergence provides a framework for many mirror-descent style algorithms related to submodular function optimization. We also show that a generalization of the k-means algorithm using the Lovász Bregman divergence is natural in clustering scenarios where ordering is important. A unique property of this algorithm is that computing the mean ordering is extremely efficient unlike other order based distance measures.
Why MCA? Nonlinear sparse coding with spike-and-slab prior for neurally plausible image encoding
Sterne, Philip, Bornschein, Joerg, Sheikh, Abdul-saboor, Lücke, Jörg, Shelton, Jacquelyn A.
Modelling natural images with sparse coding (SC) has faced two main challenges: flexibly representing varying pixel intensities and realistically representing lowlevel image components. This paper proposes a novel multiple-cause generative model of low-level image statistics that generalizes the standard SC model in two crucial points: (1) it uses a spike-and-slab prior distribution for a more realistic representation of component absence/intensity, and (2) the model uses the highly nonlinear combination rule of maximal causes analysis (MCA) instead of a linear combination. The major challenge is parameter optimization because a model with either (1) or (2) results in strongly multimodal posteriors. We show for the first time that a model combining both improvements can be trained efficiently while retaining the rich structure of the posteriors. We design an exact piecewise Gibbs sampling method and combine this with a variational method based on preselection of latent dimensions. This combined training scheme tackles both analytical and computational intractability and enables application of the model to a large number of observed and hidden dimensions.