Understanding Boltzmann Machine and Deep Learning via A Confident Information First Principle

arXiv.org Machine Learning

Typical dimensionality reduction methods focus on directly reducing the number of random variables while retaining maximal variation in the data. In this paper, we consider dimensionality reduction in the parameter spaces of binary multivariate distributions. We propose a general Confident-Information-First (CIF) principle to maximally preserve parameters with confident estimates and rule out unreliable or noisy parameters. Formally, the confidence of a parameter can be assessed by its Fisher information, which establishes a connection with the inverse variance of any unbiased estimate for the parameter via the Cramér-Rao bound. We then revisit Boltzmann machines (BM) and theoretically show that both the single-layer BM without hidden units (SBM) and the restricted BM (RBM) can be solidly derived using the CIF principle. This not only helps us uncover and formalize the essential parts of the target density that SBM and RBM capture, but also suggests that a deep neural network consisting of several layers of RBMs can be seen as the layer-wise application of CIF. Guided by the theoretical analysis, we develop a sample-specific CIF-based contrastive divergence (CD-CIF) algorithm for SBM and a CIF-based iterative projection procedure (IP) for RBM. Both CD-CIF and IP are studied in a series of density estimation experiments.
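
For reference, the link between Fisher information and estimator variance mentioned above is the standard Cramér-Rao inequality. The scalar form below is a sketch under the usual regularity conditions; the symbols (a parameter \theta, an unbiased estimator \hat{\theta} built from a single observation x, and the density p(x;\theta)) are notation assumed here, not taken from the paper.

```latex
\operatorname{Var}(\hat{\theta}) \;\ge\; \frac{1}{I(\theta)},
\qquad
I(\theta) \;=\; \mathbb{E}_{x \sim p(x;\theta)}
\left[ \left( \frac{\partial}{\partial \theta} \log p(x;\theta) \right)^{2} \right]
```

A parameter with large Fisher information therefore admits a low-variance (confident) unbiased estimate, which is the sense in which the abstract ranks parameters.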


Learning-Based Procedural Content Generation

arXiv.org Artificial Intelligence

Procedural content generation (PCG) has recently become one of the hottest topics in computational intelligence and AI game research. Among a variety of PCG techniques, search-based PCG (SBPCG) approaches overwhelmingly dominate PCG development at present. While SBPCG leads to promising results and successful applications, it poses a number of challenges ranging from representation to evaluation of the content being generated. In this paper, we present an alternative yet generic PCG framework, named learning-based procedural content generation (LBPCG), to provide potential solutions to several challenging problems in existing PCG techniques. By exploring and exploiting information gained in game development and public beta testing via data-driven learning, our framework can generate robust content adaptable to end users or target players on-line with minimal interruption to their experience. Furthermore, we develop enabling techniques to implement the various models required in our framework. For a proof of concept, we have developed a prototype based on the classic open source first-person shooter game, Quake. Simulation results suggest that our framework is promising in generating quality content.


Towards common-sense reasoning via conditional simulation: legacies of Turing in Artificial Intelligence

arXiv.org Artificial Intelligence

The problem of replicating the flexibility of human common-sense reasoning has captured the imagination of computer scientists since the early days of Alan Turing's foundational work on computation and the philosophy of artificial intelligence. In the intervening years, the idea of cognition as computation has emerged as a fundamental tenet of Artificial Intelligence (AI) and cognitive science. But what kind of computation is cognition? We describe a computational formalism centered around a probabilistic Turing machine called QUERY, which captures the operation of probabilistic conditioning via conditional simulation. Through several examples and analyses, we demonstrate how the QUERY abstraction can be used to cast common-sense reasoning as probabilistic inference in a statistical model of our observations and the uncertain structure of the world that generated that experience. This formulation is a recent synthesis of several research programs in AI and cognitive science, but it also represents a surprising convergence of several of Turing's pioneering insights in AI, the foundations of computation, and statistics.


The Generalized Traveling Salesman Problem solved with Ant Algorithms

arXiv.org Artificial Intelligence

A well-known NP-hard problem called the Generalized Traveling Salesman Problem (GTSP) is considered. In GTSP the nodes of a complete undirected graph are partitioned into clusters. The objective is to find a minimum cost tour passing through exactly one node from each cluster. An exact exponential time algorithm and an effective meta-heuristic algorithm for the problem are presented. The meta-heuristic proposed is a modified Ant Colony System (ACS) algorithm called Reinforcing Ant Colony System (RACS) which introduces new correction rules in the ACS algorithm. Computational results are reported for many standard test problems. The proposed algorithm is competitive with the other already proposed heuristics for the GTSP in both solution quality and computational time.
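
To make the GTSP objective concrete, here is a minimal brute-force sketch in Python. It is not the paper's exact algorithm and not RACS; it simply enumerates every cluster ordering and every choice of one node per cluster, so it is only usable on tiny instances. The function name `gtsp_brute_force` and the input layout (a square cost matrix plus a list of node-index lists) are assumptions made for illustration.

```python
from itertools import permutations, product


def gtsp_brute_force(cost, clusters):
    """Return (best_cost, best_tour) for a tiny GTSP instance.

    cost     -- square matrix, cost[u][v] is the edge cost between nodes u and v
    clusters -- list of lists of node indices; each node belongs to one cluster
    The tour is a closed cycle visiting exactly one node from every cluster.
    """
    first, rest = clusters[0], clusters[1:]
    best_cost, best_tour = float("inf"), None
    # Fix the first cluster as the starting point so rotations are not repeated.
    for order in permutations(rest):
        # Choose one representative node from each cluster, in visiting order.
        for nodes in product(first, *order):
            k = len(nodes)
            # Sum edge costs around the cycle, including the closing edge.
            tour_cost = sum(cost[nodes[i]][nodes[(i + 1) % k]] for i in range(k))
            if tour_cost < best_cost:
                best_cost, best_tour = tour_cost, nodes
    return best_cost, best_tour
```

The running time grows factorially in the number of clusters and multiplicatively in the cluster sizes, which is why heuristics such as the ant-based method described above are used on realistic instances.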


Empowerment -- an Introduction

arXiv.org Artificial Intelligence

This book chapter is an introduction to and an overview of the information-theoretic, task-independent utility function "Empowerment", which is defined as the channel capacity between an agent's actions and an agent's sensors. It quantifies how much influence and control an agent has over the world it can perceive. This book chapter discusses the general idea behind empowerment as an intrinsic motivation and showcases several previous applications of empowerment to demonstrate how empowerment can be applied to different sensor-motor configurations, and how the same formalism can lead to different observed behaviors. Furthermore, we also present a fast approximation for empowerment in the continuous domain.
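
Since empowerment is defined above as a channel capacity, a small discrete example may help. The sketch below uses the standard Blahut-Arimoto iteration to estimate the capacity of a discrete action-to-sensor channel; it is not the chapter's continuous-domain approximation. The function name `empowerment_bits`, the fixed iteration count, and the row-stochastic matrix layout (`channel[a][s]` is the probability of sensor state `s` after action `a`) are assumptions for illustration.

```python
import math


def empowerment_bits(channel, iters=200):
    """Estimate channel capacity (in bits) with Blahut-Arimoto.

    channel -- list of rows; channel[a][s] = P(sensor state s | action a),
               each row summing to 1.
    """
    n_actions, n_states = len(channel), len(channel[0])
    p = [1.0 / n_actions] * n_actions  # start from a uniform action distribution

    def marginal(p):
        # Distribution over sensor states induced by the action distribution p.
        return [sum(p[a] * channel[a][s] for a in range(n_actions))
                for s in range(n_states)]

    for _ in range(iters):
        q = marginal(p)
        # c[a] = exp(KL divergence between row a and the marginal q), in nats.
        c = [math.exp(sum(channel[a][s] * math.log(channel[a][s] / q[s])
                          for s in range(n_states)
                          if channel[a][s] > 0 and q[s] > 0))
             for a in range(n_actions)]
        z = sum(p[a] * c[a] for a in range(n_actions))
        p = [p[a] * c[a] / z for a in range(n_actions)]  # reweight actions

    q = marginal(p)
    # Mutual information between actions and sensor states under the final p.
    return sum(p[a] * channel[a][s] * math.log2(channel[a][s] / q[s])
               for a in range(n_actions) for s in range(n_states)
               if p[a] > 0 and channel[a][s] > 0 and q[s] > 0)
```

For instance, two actions that always lead to two distinct sensor states give one bit of empowerment, while two actions that lead to the same sensor state give none.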


Distributed Coordinate Descent Method for Learning with Big Data

arXiv.org Machine Learning

In this paper we develop and analyze Hydra: HYbriD cooRdinAte descent method for solving loss minimization problems with big data. We initially partition the coordinates (features) and assign each partition to a different node of a cluster. At every iteration, each node picks a random subset of the coordinates from those it owns, independently from the other computers, and in parallel computes and applies updates to the selected coordinates based on a simple closed-form formula. We give bounds on the number of iterations sufficient to approximately solve the problem with high probability, and show how it depends on the data and on the partitioning. We perform numerical experiments with a LASSO instance described by a 3TB matrix.
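
The partition-then-sample structure described above can be sketched for LASSO as follows. This is a serial simulation under stated assumptions, not Hydra itself: the "nodes" are visited one after another, each selected coordinate is updated with the textbook soft-thresholding formula for the objective 0.5 * ||Ax - b||^2 + lam * ||x||_1, and the paper's step-size rule for truly parallel updates is not reproduced. The names `partitioned_cd_lasso`, `n_nodes`, and `per_node` are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal operator of t * |x|)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def partitioned_cd_lasso(A, b, lam, n_nodes=2, per_node=1, iters=200, seed=0):
    """Randomized coordinate descent for LASSO with partitioned coordinates.

    A is a list of rows, b a list of targets, lam the L1 weight.
    Coordinates are split round-robin across n_nodes simulated nodes.
    """
    rng = random.Random(seed)
    m, d = len(A), len(A[0])
    partitions = [list(range(k, d, n_nodes)) for k in range(n_nodes)]
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(d)]
    x = [0.0] * d
    r = list(b)  # residual b - A x, kept up to date after every update
    for _ in range(iters):
        for owned in partitions:  # each simulated node in turn
            # The node picks a random subset of the coordinates it owns.
            for j in rng.sample(owned, min(per_node, len(owned))):
                if col_sq[j] == 0.0:
                    continue  # an all-zero column never changes the fit
                # Closed-form minimizer along coordinate j.
                rho = sum(A[i][j] * r[i] for i in range(m)) + col_sq[j] * x[j]
                new_xj = soft_threshold(rho, lam) / col_sq[j]
                delta = new_xj - x[j]
                if delta != 0.0:
                    for i in range(m):
                        r[i] -= A[i][j] * delta
                    x[j] = new_xj
    return x
```

Applying the updates one at a time keeps this sketch convergent without any damping; in a genuinely parallel setting, updates computed from the same stale residual generally need a more conservative step, which is the kind of dependence on data and partitioning the abstract refers to.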


Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

arXiv.org Machine Learning

We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve state-of-the-art results for various key machine learning optimization problems including SVM, logistic regression, ridge regression, Lasso, and multiclass SVM. Experiments validate our theoretical findings.


A short note on the axiomatic requirements of uncertainty measure

arXiv.org Artificial Intelligence

In this note, we argue that the axiomatic requirement of range for the measure of aggregated total uncertainty (ATU) in Dempster-Shafer theory is not reasonable.

Keywords: Dempster-Shafer theory, Uncertainty measure

Dempster-Shafer theory [1, 2] is widely applied to uncertainty modeling [3, 4]. Two types of uncertainty, namely nonspecificity and discord, coexist in Dempster-Shafer theory [5, 6]. A justifiable measure of these uncertainties is necessary to describe the essential characteristics of the basic probability assignment (BPA) function. To be justifiable, a measure of aggregated total uncertainty (ATU) must satisfy certain requirements.


Double four-bar crank-slider mechanism dynamic balancing by meta-heuristic algorithms

arXiv.org Artificial Intelligence

In this paper, a new method for dynamic balancing of a double four-bar crank-slider mechanism by meta-heuristic-based optimization algorithms is proposed. For this purpose, a proper objective function, which is necessary for balancing this mechanism, and the corresponding constraints have been obtained by dynamic modeling of the mechanism. Then the PSO, ABC, BGA and HGAPSO algorithms have been applied to minimize the defined cost function in the optimization step. The optimization results have been studied thoroughly by extracting the cost function, fitness, convergence speed and runtime values of the applied algorithms. It has been shown that PSO and ABC are more efficient than BGA and HGAPSO in terms of convergence speed and result quality. Also, a laboratory-scale experimental double four-bar crank-slider mechanism was provided to validate the proposed balancing method in practice.


Discriminative Features via Generalized Eigenvectors

arXiv.org Machine Learning

Representing examples in a way that is compatible with the underlying classifier can greatly enhance the performance of a learning system. In this paper we investigate scalable techniques for inducing discriminative features by taking advantage of simple second-order structure in the data. We focus on multiclass classification and show that features extracted from the generalized eigenvectors of the class-conditional second moments lead to classifiers with excellent empirical performance. Moreover, these features have attractive theoretical properties, such as inducing representations that are invariant to linear transformations of the input. We evaluate classifiers built from these features on three different tasks, obtaining state-of-the-art results.