South America
Regularizing Flat Latent Variables with Hierarchical Structures
Lin, Rongcheng (University of North Carolina at Charlotte) | Li, Huayu (University of North Carolina at Charlotte) | Quan, Xiaojun (Institute for Infocomm Research) | Hong, Richang (Hefei University of Technology) | Wu, Zhiang (Nanjing University of Finance and Economics) | Ge, Yong (University of North Carolina at Charlotte)
In this paper, we propose a stratified topic model (STM). Instead of directly modeling and inferring flat topics or hierarchically structured topics, we use the stratified relationships in topic hierarchies to regularize the flat topics. The topic structures are captured by a hierarchical clustering method and play as constraints during the learning process. We propose two theoretically sound and practical inference methods to solve the model. Experimental results with two real world data sets and various evaluation metrics demonstrate the effectiveness of the proposed model.
A New Simplex Sparse Learning Model to Measure Data Similarity for Clustering
Huang, Jin (University of Texas at Arlington) | Nie, Feiping (University of Texas at Arlington) | Huang, Heng (University of Texas at Arlington)
The Laplacian matrix of a graph can be used in many areas of mathematical research and has a physical interpretation in various theories. However, there are a few open issues in the Laplacian graph construction: (i) Selecting the appropriate scale of analysis, (ii) Selecting the appropriate number of neighbors, (iii) Handling multiscale data, and, (iv) Dealing with noise and outliers. In this paper, we propose that the affinity between pairs of samples could be computed using sparse representation with proper constraints. This parameter free setting automatically produces the Laplacian graph, leads to significant reduction in computation cost and robustness to the outliers and noise. We further provide an efficient algorithm to solve the difficult optimization problem based on improvement of existing algorithms. To demonstrate our motivation, we conduct spectral clustering experiments with benchmark methods. Empirical experiments on 9 data sets demonstrate the effectiveness of our method.
Multi-Label Structure Learning with Ising Model Selection
Goncalves, Andre R. (University of Campinas) | Zuben, Fernando J. Von (University of Campinas) | Banerjee, Arindam (University of Minnesota, Twin Cities)
A common way of attacking multi-label classification problems is by splitting it into a set of binary classification problems, then solving each problem independently using traditional single-label methods. Nevertheless, by learning classifiers separately the information about the relationship between labels tends to be neglected. Built on recent advances in structure learning in Ising Markov Random Fields (I-MRF), we propose a multi-label classification algorithm that explicitly estimate and incorporate label dependence into the classifiers learning process by means of a sparse convex multi-task learning formulation.Extensive experiments considering several existing multi-label algorithms indicate that the proposed method, while conceptually simple, outperforms the contenders in several datasets and performance metrics. Besides that, the conditional dependence graph encoded in the I-MRF provides a useful information that can be used in a posterior investigation regarding the reasons behind the relationship between labels.
Multi-View Matrix Decomposition: A New Scheme for Exploring Discriminative Information
Deng, Cheng (Xidian University) | Lv, Zongting (Xidian University) | Liu, Wei (IBM T. J. Watson Research Center) | Huang, Junzhou (University of Texas at Arlington) | Tao, Dacheng (University of Technology, Sydney) | Gao, Xinbo (Xidian University)
Recent studies have demonstrated the advantages of fusing information from multiple views for various machine learning applications. However, most existing approaches assumed the shared component common to all views and ignored the private components of individual views, which thereby restricts the learning performance. In this paper, we propose a new multi-view, low-rank, and sparse matrix decomposition scheme to seamlessly integrate diverse yet complementary information stemming from multiple views. Unlike previous approaches, our approach decomposes an input data matrix concatenated from multiple views as the sum of low-rank, sparse, and noisy parts. Then a unified optimization framework is established, where the low-rankness and group-structured sparsity constraints are imposed to simultaneously capture the shared and private components in both instance and view levels. A proven optimization algorithm is developed to solve the optimization, yielding the learned augmented representation which is used as features for classification tasks. Extensive experiments conducted on six benchmark image datasets show that our approach enjoys superior performance over the state-of-the-art approaches.
An Expectation-Maximization Algorithm to Compute a Stochastic Factorization From Data
Barreto, Andre M. S. (National Laboratory for Scientific Computing (LNCC)) | Beirigo, Rafael L. (National Laboratory for Scientific Computing (LNCC)) | Pineau, Joelle (McGill University) | Precup, Doina (McGill University)
When a transition probability matrix is represented as the product of two stochastic matrices, swapping the factors of the multiplication yields another transition matrix that retains some fundamental characteristics of the original. Since the new matrix can be much smaller than its precursor, replacing the former for the latter can lead to significant savings in terms of computational effort. This strategy, dubbed the "stochastic-factorization trick," can be used to compute the stationary distribution of a Markov chain, to determine the fundamental matrix of an absorbing chain, and to compute a decision policy via dynamic programming or reinforcement learning. In this paper we show that the stochastic-factorization trick can also provide benefits in terms of the number of samples needed to estimate a transition matrix. We introduce a probabilistic interpretation of a stochastic factorization and build on the resulting model to develop an algorithm to compute the factorization directly from data. If the transition matrix can be well approximated by a low-order stochastic factorization, estimating its factors instead of the original matrix reduces significantly the number of parameters to be estimated. Thus, when compared to estimating the transition matrix directly via maximum likelihood, the proposed method is able to compute approximations of roughly the same quality using less data. We illustrate the effectiveness of the proposed algorithm by using it to help a reinforcement learning agent learn how to play the game of blackjack.
First-Order Disjunctive Logic Programming vs Normal Logic Programming
Zhou, Yi (University of Western Sydney)
In this paper, we study the expressive power of first-order disjunctive logic programming (DLP) and normal logic programming (NLP) under the stable model semantics. We show that, unlike the propositional case, first-order DLP is strictly more expressive than NLP. This result still holds even if auxiliary predicates are allowed, assuming that NP not equals to coNP. On the other side, we propose a partial translation from first-order DLP to NLP via unfolding and shifting, which suggests a sound yet incomplete approach to implement DLP via NLP solvers. We also identify some NLP definable subclasses, and conjecture to exactly capture NLP definability by unfolding and shifting.
Towards Fully Observable Non-Deterministic Planning as Assumption-based Automatic Synthesis
Sardina, Sebastian (RMIT University) | D' (Universidad de Buenos Aires) | Ippolito, Nicolas
Whereas previous work on non-deterministic planning has focused on characterizing (and computing) "loopy" but "closed" plans, we look here at the kind of environments that these plans are to be executed in. In particular, we provide a logical characterization of the standard "fairness'' assumption used, and show that strong cyclic plans are correct solution concepts for fair environments. We argue then that such logical characterization allows us to recast non-deterministic planning as a reactive synthesis task, and show that for a special case, recent efficient synthesis techniques can be applied.
Kernel Contraction and Base Dependence: Redundancy in the Base Resulting in Different Types of Dependence
Oveisi, Mehrdad (Simon Fraser University) | Delgrande, James P. (Simon Fraser University) | Popowich, Fred (Simon Fraser University) | Pelletier, Francis Jeffry (University of Alberta)
The AGM paradigm of belief change studies the dynamics of belief states in light of new information. Finding, or even approximating, dependent or relevant beliefs to a change is valuable because, for example, it can narrow the set of beliefs considered during belief change operations. Gärdenfors' preservation criterion (GPC) suggests that formulas independent of a belief change should remain intact. GPC allows to build dependence relations that are theoretically linked with belief change. Such dependence relations can in turn be used as a theoretical benchmark against which to evaluate other approximate dependence or relevance relations. There are already some studies, based on GPC, on the parallelism between belief change and dependence. One study offers a dependence relation parallel to AGM contraction for belief sets. Another study links base dependence relation to a more general belief base contraction, saturated kernel contraction. Here we offer yet a more general parallelism between kernel contraction and base dependence. At this level of generalization, different types of base dependence emerge. We prove that this differentiation of base dependence types is a result of possible redundancy in the base. This provides a theoretical means to distinguish between redundant and informative parts of a belief base.
On the Entailment Problem for a Logic of Typicality
Booth, Richard (Mahasarakham University) | Casini, Giovanni (University of Pretoria and University of Luxembourg) | Meyer, Thomas Andreas (University of Cape Town) | Varzinczak, Ivan José (Universidade Federal de Rio de Janeiro)
Propositional Typicality Logic (PTL) is a recently proposed logic, obtained by enriching classical propositional logic with a typicality operator. In spite of the non-monotonic features introduced by the semantics adopted for the typicality operator, the obvious Tarskian definition of entailment for PTL remains monotonic and is therefore not appropriate. We investigate different (semantic) versions of entailment for PTL, based on the notion of Rational Closure as defined by Lehmann and Magidor for KLM-style conditionals, and constructed using minimality. Our first important result is an impossibility theorem showing that a set of proposed postulates that at first all seem appropriate for a notion of entailment with regard to typicality cannot be satis- fied simultaneously. Closer inspection reveals that this result is best interpreted as an argument for advocating the development of more than one type of PTL entailment. In the spirit of this interpretation, we define two primary forms of entailment for PTL and discuss their advantages and disadvantages
Policies that Generalize: Solving Many Planning Problems with the Same Policy
Bonet, Blai (Universidad Simon Bolivar) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)
We establish conditions under which memoryless policies and finite-state controllers that solve one partially observable non-deterministic problem (PONDP) generalize to other problems; namely, problems that have a similar structure and share the same action and observation space. This is relevant to generalized planning where plans that work for many problems are sought, and to transfer learning where knowledge gained in the solution of one problem is to be used on related problems. We use a logical setting where uncertainty is represented by sets of states and the goal is to be achieved with certainty. While this gives us crisp notions of solution policies and generalization, the account also applies to probabilistic PONDs, i.e., Goal POMDPs.