AITopics

1301.6324

Country: North America > United States > California > Orange County > Irvine (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Armagan, Artin, Dunson, David, Lee, Jaeyong

Generalized double Pareto shrinkage

arXiv.org Machine LearningJan-26-2013

We propose a generalized double Pareto prior for Bayesian shrinkage estimation and inferences in linear models. The prior can be obtained via a scale mixture of Laplace or normal distributions, forming a bridge between the Laplace and Normal-Jeffreys' priors. While it has a spike at zero like the Laplace density, it also has a Student's $t$-like tail behavior. Bayesian computation is straightforward via a simple Gibbs sampling algorithm. We investigate the properties of the maximum a posteriori estimator, as sparse estimation plays an important role in many problems, reveal connections with some well-established regularization procedures, and show some asymptotic results. The performance of the prior is tested through simulations and an application.

artificial intelligence, gdp, machine learning, (18 more...)

1104.0861

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Blythe, Duncan A. J., Meinecke, Frank C., von Buenau, Paul, Mueller, Klaus-Robert

Explorative Data Analysis for Changes in Neural Activity

arXiv.org Machine LearningJan-25-2013

Neural recordings are nonstationary time series, i.e. their properties typically change over time. Identifying specific changes, e.g. those induced by a learning task, can shed light on the underlying neural processes. However, such changes of interest are often masked by strong unrelated changes, which can be of physiological origin or due to measurement artifacts. We propose a novel algorithm for disentangling such different causes of non-stationarity and in this manner enable better neurophysiological interpretation for a wider set of experimental paradigms. A key ingredient is the repeated application of Stationary Subspace Analysis (SSA) using different temporal scales. The usefulness of our explorative approach is demonstrated in simulations, theory and EEG experiments with 80 Brain-Computer-Interfacing (BCI) subjects.

artificial intelligence, machine learning, matrix, (18 more...)

1301.6027

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Platanios, Emmanouil A., Chatzis, Sotirios P.

Mixture Gaussian Process Conditional Heteroscedasticity

arXiv.org Machine LearningJan-25-2013

Generalized autoregressive conditional heteroscedasticity (GARCH) models have long been considered as one of the most successful families of approaches for volatility modeling in financial return series. In this paper, we propose an alternative approach based on methodologies widely used in the field of statistical machine learning. Specifically, we propose a novel nonparametric Bayesian mixture of Gaussian process regression models, each component of which models the noise variance process that contaminates the observed data as a separate latent Gaussian process driven by the observed data. This way, we essentially obtain a mixture Gaussian process conditional heteroscedasticity (MGPCH) model for volatility modeling in financial return series. We impose a nonparametric prior with power-law nature over the distribution of the model mixture components, namely the Pitman-Yor process prior, to allow for better capturing modeled data distributions with heavy tails and skewness. Finally, we provide a copula- based approach for obtaining a predictive posterior for the covariances over the asset returns modeled by means of a postulated MGPCH model. We evaluate the efficacy of our approach in a number of benchmark scenarios, and compare its performance to state-of-the-art methodologies.

artificial intelligence, machine learning, modeling & simulation, (15 more...)

1211.441

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Teófilo, Luís Filipe, Reis, Luis Paulo

Identifying Player\'s Strategies in No Limit Texas Hold\'em Poker through the Analysis of Individual Moves

arXiv.org Artificial IntelligenceJan-24-2013

The development of competitive artificial Poker playing agents has proven to be a challenge, because agents must deal with unreliable information and deception which make it essential to model the opponents in order to achieve good results. This paper presents a methodology to develop opponent modeling techniques for Poker agents. The approach is based on applying clustering algorithms to a Poker game database in order to identify player types based on their actions. First, common game moves were identified by clustering all players\' moves. Then, player types were defined by calculating the frequency with which the players perform each type of movement. With the given dataset, 7 different types of players were identified with each one having at least one tactic that characterizes him. The identification of player types may improve the overall performance of Poker agents, because it helps the agents to predict the opponent\'s moves, by associating each opponent to a distinct cluster.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1301.5943

Country:

Europe (0.94)
North America > United States > Texas (0.42)
North America > Canada > Alberta (0.29)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Poker (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.92)
Information Technology > Artificial Intelligence > Games > Poker (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.52)

Azoury, Katy S., Warmuth, Manfred K.

Relative Loss Bounds for On-line Density Estimation with the Exponential Family of Distributions

We consider on-line density estimation with a parameterized density from the exponential family. The on-line algorithm receives one example at a time and maintains a parameter that is essentially an average of the past examples. After receiving an example the algorithm incurs a loss which is the negative log-likelihood of the example w.r.t. the past parameter of the algorithm. An off-line algorithm can choose the best parameter based on all the examples. We prove bounds on the additional total loss of the on-line algorithm over the total loss of the off-line algorithm. These relative loss bounds hold for an arbitrary sequence of examples. The goal is to design algorithms with the best possible relative loss bounds. We use a certain divergence to derive and analyze the algorithms. This divergence is a relative entropy between two exponential distributions.

algorithm, artificial intelligence, machine learning, (15 more...)

1301.6677

Country: North America > United States > California (0.68)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.63)

Kontkanen, Petri, Myllymaki, Petri, Silander, Tomi, Tirri, Henry

On Supervised Selection of Bayesian Networks

Given a set of possible models (e.g., Bayesian network structures) and a data sample, in the unsupervised model selection problem the task is to choose the most accurate model with respect to the domain joint probability distribution. In contrast to this, in supervised model selection it is a priori known that the chosen model will be used in the future for prediction tasks involving more "focused" predictive distributions. Although focused predictive distributions can be produced from the joint probability distribution by marginalization, in practice the best model in the unsupervised sense does not necessarily perform well in supervised domains. In particular, the standard marginal likelihood score is a criterion for the unsupervised task, and, although frequently used for supervised model selection also, does not perform well in such tasks. In this paper we study the performance of the marginal likelihood score empirically in supervised Bayesian network selection tasks by using a large number of publicly available classification data sets, and compare the results to those obtained by alternative model selection criteria, including empirical crossvalidation methods, an approximation of a supervised marginal likelihood measure, and a supervised version of Dawid's prequential (predictive sequential) principle. The results demonstrate that the marginal likelihood score does not perform well for supervised model selection, while the best results are obtained by using Dawid's prequential approach.

artificial intelligence, bayesian inference, machine learning, (14 more...)

1301.671

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kwok, Tsz Chiu, Lau, Lap Chi, Lee, Yin Tat, Gharan, Shayan Oveis, Trevisan, Luca

Improved Cheeger's Inequality: Analysis of Spectral Partitioning Algorithms through Higher Order Spectral Gap

Let \phi(G) be the minimum conductance of an undirected graph G, and let 0=\lambda_1 <= \lambda_2 <=... <= \lambda_n <= 2 be the eigenvalues of the normalized Laplacian matrix of G. We prove that for any graph G and any k >= 2, \phi(G) = O(k) \lambda_2 / \sqrt{\lambda_k}, and this performance guarantee is achieved by the spectral partitioning algorithm. This improves Cheeger's inequality, and the bound is optimal up to a constant factor for any k. Our result shows that the spectral partitioning algorithm is a constant factor approximation algorithm for finding a sparse cut if \lambda_k$ is a constant for some constant k. This provides some theoretical justification to its empirical performance in image segmentation and clustering problems. We extend the analysis to other graph partitioning problems, including multi-way partition, balanced separator, and maximum cut.

algorithm, artificial intelligence, machine learning, (19 more...)

1301.5584

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Probabilistic Latent Semantic Analysis

Hofmann, Thomas

Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas. Compared to standard Latent Semantic Analysis which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed method is based on a mixture decomposition derived from a latent class model. This results. in a more principled approach which has a solid foundation in statistics. In order to avoid overfitting, we propose a widely applicable generalization of maximum likelihood model fitting by tempered EM.

artificial intelligence, machine learning, natural language, (19 more...)

1301.6705

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)

Inferring Parameters and Structure of Latent Variable Models by Variational Bayes

Attias, Hagai

Current methods for learning graphical models with latent variables and a fixed structure estimate optimal values for the model parameters. Whereas this approach usually produces overfitting and suboptimal generalization performance, carrying out the Bayesian program of computing the full posterior distributions over the parameters remains a difficult problem. Moreover, learning the structure of models with latent variables, for which the Bayesian approach is crucial, is yet a harder problem. In this paper I present the Variational Bayes framework, which provides a solution to these problems. This approach approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner without resorting to sampling methods. Unlike in the Laplace approximation, these posteriors are generally non-Gaussian and no Hessian needs to be computed. The resulting algorithm generalizes the standard Expectation Maximization algorithm, and its convergence is guaranteed. I demonstrate that this algorithm can be applied to a large class of models in several domains, including unsupervised clustering and blind source separation.

artificial intelligence, machine learning, posterior, (20 more...)

1301.6676

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)