AITopics

1210.2085

Country: North America > United States > California (0.27)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Machine LearningOct-9-2013

Improved Bayesian Logistic Supervised Topic Models with Data Augmentation

Zhu, Jun, Zheng, Xun, Zhang, Bo

Supervised topic models with a logistic likelihood have two issues that potentially limit their practical use: 1) response variables are usually over-weighted by document word counts; and 2) existing variational inference methods make strict mean-field assumptions. We address these issues by: 1) introducing a regularization constant to better balance the two parts based on an optimization formulation of Bayesian inference; and 2) developing a simple Gibbs sampling algorithm by introducing auxiliary Polya-Gamma variables and collapsing out Dirichlet variables. Our augment-and-collapse sampling algorithm has analytical forms of each conditional distribution without making any restricting assumptions and can be easily parallelized. Empirical results demonstrate significant improvements on prediction performance and time efficiency.

artificial intelligence, machine learning, topic model, (18 more...)

1310.2408

Country: Asia (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.91)

arXiv.org Machine LearningOct-9-2013

Understanding Boltzmann Machine and Deep Learning via A Confident Information First Principle

Zhao, Xiaozhao, Hou, Yuexian, Yu, Qian, Song, Dawei, Li, Wenjie

Typical dimensionality reduction methods focus on directly reducing the number of random variables while retaining maximal variations in the data. In this paper, we consider the dimensionality reduction in parameter spaces of binary multivariate distributions. We propose a general Confident-Information-First (CIF) principle to maximally preserve parameters with confident estimates and rule out unreliable or noisy parameters. Formally, the confidence of a parameter can be assessed by its Fisher information, which establishes a connection with the inverse variance of any unbiased estimate for the parameter via the Cram\'{e}r-Rao bound. We then revisit Boltzmann machines (BM) and theoretically show that both single-layer BM without hidden units (SBM) and restricted BM (RBM) can be solidly derived using the CIF principle. This can not only help us uncover and formalize the essential parts of the target density that SBM and RBM capture, but also suggest that the deep neural network consisting of several layers of RBM can be seen as the layer-wise application of CIF. Guided by the theoretical analysis, we develop a sample-specific CIF-based contrastive divergence (CD-CIF) algorithm for SBM and a CIF-based iterative projection procedure (IP) for RBM. Both CD-CIF and IP are studied in a series of density estimation experiments.

artificial intelligence, information, machine learning, (18 more...)

1302.3931

Country:

North America (0.67)
Asia > China (0.46)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Roberts, Jonathan, Chen, Ke

Learning-Based Procedural Content Generation

arXiv.org Artificial IntelligenceOct-9-2013

Procedural content generation (PCG) has recently become one of the hottest topics in computational intelligence and AI game researches. Among a variety of PCG techniques, search-based approaches overwhelmingly dominate PCG development at present. While SBPCG leads to promising results and successful applications, it poses a number of challenges ranging from representation to evaluation of the content being generated. In this paper, we present an alternative yet generic PCG framework, named learning-based procedure content generation (LBPCG), to provide potential solutions to several challenging problems in existing PCG techniques. By exploring and exploiting information gained in game development and public beta test via data-driven learning, our framework can generate robust content adaptable to end-user or target players on-line with minimal interruption to their experience. Furthermore, we develop enabling techniques to implement the various models required in our framework. For a proof of concept, we have developed a prototype based on the classic open source first-person shooter game, Quake. Simulation results suggest that our framework is promising in generating quality content.

artificial intelligence, machine learning, target player, (17 more...)

arXiv.org Artificial Intelligence

1308.6415

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)

Richtárik, Peter, Takáč, Martin

Distributed Coordinate Descent Method for Learning with Big Data

arXiv.org Machine LearningOct-8-2013

In this paper we develop and analyze Hydra: HYbriD cooRdinAte descent method for solving loss minimization problems with big data. We initially partition the coordinates (features) and assign each partition to a different node of a cluster. At every iteration, each node picks a random subset of the coordinates from those it owns, independently from the other computers, and in parallel computes and applies updates to the selected coordinates based on a simple closed-form formula. We give bounds on the number of iterations sufficient to approximately solve the problem with high probability, and show how it depends on the data and on the partitioning. We perform numerical experiments with a LASSO instance described by a 3TB matrix.

artificial intelligence, data mining, machine learning, (17 more...)

1310.2059

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Shalev-Shwartz, Shai, Zhang, Tong

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

arXiv.org Machine LearningOct-8-2013

We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve state-of-the-art results for various key machine learning optimization problems including SVM, logistic regression, ridge regression, Lasso, and multiclass SVM. Experiments validate our theoretical findings.

artificial intelligence, machine learning, runtime, (18 more...)

1309.2375

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Karampatziakis, Nikos, Mineiro, Paul

Discriminative Features via Generalized Eigenvectors

Representing examples in a way that is compatible with the underlying classifier can greatly enhance the performance of a learning system. In this paper we investigate scalable techniques for inducing discriminative features by taking advantage of simple second order structure in the data. We focus on multiclass classification and show that features extracted from the generalized eigenvectors of the class conditional second moments lead to classifiers with excellent empirical performance. Moreover, these features have attractive theoretical properties, such as inducing representations that are invariant to linear transformations of the input. We evaluate classifiers built from these features on three different tasks, obtaining state of the art results.

eigenvector, machine learning, natural language, (17 more...)

1310.1934

Country: North America > United States (0.46)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Generalized Negative Binomial Processes and the Representation of Cluster Structures

Zhou, Mingyuan

The paper introduces the concept of a cluster structure to define a joint distribution of the sample size and its exchangeable random partitions. The cluster structure allows the probability distribution of the random partitions of a subset of the sample to be dependent on the sample size, a feature not presented in a partition structure. A generalized negative binomial process count-mixture model is proposed to generate a cluster structure, where in the prior the number of clusters is finite and Poisson distributed and the cluster sizes follow a truncated negative binomial distribution. The number and sizes of clusters can be controlled to exhibit distinct asymptotic behaviors. Unique model properties are illustrated with example clustering results using a generalized Polya urn sampling scheme. The paper provides new methods to generate exchangeable random partitions and to control both the cluster-number and cluster-size distributions.

artificial intelligence, machine learning, reparameterized gnbp, (17 more...)

1310.18

Country: North America > United States > Texas (0.28)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.36)

Gillis, Nicolas, Vavasis, Stephen A.

Fast and Robust Recursive Algorithms for Separable Nonnegative Matrix Factorization

A hyperspectral image consists of a set of images taken at different wavelengths. It is acquired by measuring the spectral signature of each pixel present in the scene, that is, by measuring the reflectance (the fraction of the incident electromagnetic power that is reflected by a surface at a given wavelength) of each pixel at different wavelengths. One of the most important tasks in hyperspectral imaging is called unmixing. It requires the identification of the constitutive materials present in the image and estimation of their abundances in each pixel. The most widely used model is the linear mixing model: the spectral signature of each pixel results from the additive linear combination of the spectral signatures of the constitutive materials, called endmembers, where the weights of the linear combination correspond to the abundances of the different endmembers in that pixel.

artificial intelligence, data mining, machine learning, (16 more...)

doi: 10.1109/TPAMI.2013.226

1208.1237

Country:

North America > Canada (0.28)
North America > United States (0.28)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Riihimäki, Jaakko, Vehtari, Aki

Laplace approximation for logistic Gaussian process density estimation and regression

Logistic Gaussian process (LGP) priors provide a flexible alternative for modelling unknown densities. The smoothness properties of the density estimates can be controlled through the prior covariance structure of the LGP, but the challenge is the analytically intractable inference. In this paper, we present approximate Bayesian inference for LGP density estimation in a grid using Laplace's method to integrate over the non-Gaussian posterior distribution of latent function values and to determine the covariance function parameters with type-II maximum a posteriori (MAP) estimation. We demonstrate that Laplace's method with MAP is sufficiently fast for practical interactive visualisation of 1D and 2D densities. Our experiments with simulated and real 1D data sets show that the estimation accuracy is close to a Markov chain Monte Carlo approximation and state-of-the-art hierarchical infinite Gaussian mixture models. We also construct a reduced-rank approximation to speed up the computations for dense 2D grids, and demonstrate density regression with the proposed Laplace approach.

approximation, artificial intelligence, machine learning, (16 more...)

1211.0174

Country: Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)