AITopics

We present a novel Bayesian model for semi-supervised paIt-of-speech tagging.

ambiguity class, machine learning, natural language, (20 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.97)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.86)

The Infinite Gamma-Poisson Feature Model

Titsias, Michalis K.

We address the problem of factorial learning which associates a set of latent causes or features with the observed data. Factorial models usually assume that each feature has a single occurrence in a given data point. However, there are data such as images where latent features have multiple occurrences, e.g. a visual object class can have multiple instances shown in the same image. To deal with such cases, we present a probability model over non-negative integer valued matrices with possibly unbounded number of columns. This model can play the role of the prior in an nonparametric Bayesian learning scenario where both the latent features and the number of their occurrences are unknown. We use this prior together with a likelihood model for unsupervised learning from images using a Markov Chain Monte Carlo inference algorithm.

artificial intelligence, feature occurrence, machine learning, (19 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Sumer, Ozgur, Acar, Umut, Ihler, Alexander T., Mettu, Ramgopal R.

Efficient Bayesian Inference for Dynamically Changing Graphs

Motivated by stochastic systems in which observed evidence and conditional dependencies betweenstates of the network change over time, and certain quantities of interest (marginal distributions, likelihood estimates etc.) must be updated, we study the problem of adaptive inference in tree-structured Bayesian networks. We describe an algorithm for adaptive inference that handles a broad range of changes to the network and is able to maintain marginal distributions, MAP estimates, and data likelihoods in all expected logarithmic time. We give an implementation of our algorithm and provide experiments that show that the algorithm can yield up to two orders of magnitude speedups on answering queries and responding to dynamic changesover the sum-product algorithm.

artificial intelligence, cluster tree, machine learning, (18 more...)

Country: North America > United States > Massachusetts (0.28)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sugiyama, Masashi, Nakajima, Shinichi, Kashima, Hisashi, Buenau, Paul V., Kawanabe, Motoaki

Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation

When training and test samples follow different input distributions (i.e., the situation called \emph{covariate shift}), the maximum likelihood estimator is known to lose its consistency. For regaining consistency, the log-likelihood terms need to be weighted according to the \emph{importance} (i.e., the ratio of test and training input densities). Thus, accurately estimating the importance is one of the key tasks in covariate shift adaptation. A naive approach is to first estimate training and test input densities and then estimate the importance by the ratio of the density estimates. However, since density estimation is a hard problem, this approach tends to perform poorly especially in high dimensional cases. In this paper, we propose a direct importance estimation method that does not require the input density estimates. Our method is equipped with a natural model selection procedure so tuning parameters such as the kernel width can be objectively optimized. This is an advantage over a recently developed method of direct importance estimation. Simulations illustrate the usefulness of our approach.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Stocker, Alan A., Simoncelli, Eero P.

A Bayesian Model of Conditioned Perception

We propose an extended probabilistic model for human perception. We argue that in many circumstances, human observers simultaneously evaluate sensory evidence under different hypotheses regarding the underlying physical process that might have generated the sensory information. Within this context, inference can be optimal if the observer weighs each hypothesis according to the correct belief in that hypothesis. But if the observer commits to a particular hypothesis, the belief in that hypothesis is converted into subjective certainty, and subsequent perceptual behavior is suboptimal, conditioned only on the chosen hypothesis. We demonstrate that this framework can explain psychophysical data of a recently reported decision-estimation experiment. The model well accounts for the data, predicting the same estimation bias as a consequence of the preceding decision step. The power of the framework is that it has no free parameters except the degree of the observer's uncertainty about its internal sensory representation. All other parameters are defined by the particular experiment which allows us to make quantitative predictions of human perception to two modifications of the original experiment.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country: North America > United States (0.46)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)

Discriminative Log-Linear Grammars with Latent Variables

Petrov, Slav, Klein, Dan

We demonstrate that log-linear grammars with latent variables can be practically trained using discriminative methods. Central to efficient discriminative training is a hierarchical pruning procedure which allows feature expectations to be efficiently approximatedin a gradient-based procedure.

artificial intelligence, machine learning, natural language, (20 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Naish-guzman, Andrew, Holden, Sean

The Generalized FITC Approximation

We present an efficient generalization of the sparse pseudo-input Gaussian process (SPGP)model developed by Snelson and Ghahramani [1], applying it to binary classification problems.

approximation, artificial intelligence, machine learning, (20 more...)

Country: Europe > United Kingdom (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Mudigonda, Pawan, Kolmogorov, Vladimir, Torr, Philip

An Analysis of Convex Relaxations for MAP Estimation

The problem of obtaining the maximum a posteriori estimate of a general discrete randomfield (i.e. a random field defined using a finite and discrete set of labels) is known to be

artificial intelligence, machine learning, relaxation, (17 more...)

Country:

Asia (0.28)
North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Lee, Honglak, Ekanadham, Chaitanya, Ng, Andrew Y.

Sparse deep belief net model for visual area V2

Motivated in part by the hierarchical organization of cortex, a number of algorithms have recently been proposed that try to learn hierarchical, or ``deep,'' structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed in visual area V1 (and the cochlea), little attempt has been made thus far to evaluate these algorithms in terms of their fidelity for mimicking computations at deeper levels in the cortical hierarchy. This paper presents an unsupervised learning model that faithfully mimics certain properties of visual area V2. Specifically, we develop a sparse variant of the deep belief networks of Hinton et al. (2006). We learn two layers of nodes in the network, and demonstrate that the first layer, similar to prior work on sparse coding and ICA, results in localized, oriented, edge filters, similar to the Gabor functions known to model V1 cell receptive fields. Further, the second layer in our model encodes correlations of the first layer responses in the data. Specifically, it picks up both collinear (``contour'') features as well as corners and junctions. More interestingly, in a quantitative comparison, the encoding of these more complex ``corner'' features matches well with the results from the Ito & Komatsu's study of biological V2 responses. This suggests that our sparse variant of deep belief networks holds promise for modeling more higher-order features.

algorithm, artificial intelligence, machine learning, (18 more...)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kulesza, Alex, Pereira, Fernando

Structured Learning with Approximate Inference

In many structured prediction problems, the highest-scoring labeling is hard to compute exactly, leading to the use of approximate inference methods. However, when inference is used in a learning algorithm, a good approximation of the score may not be sufficient. We show in particular that learning can fail even with an approximate inference method with rigorous approximation guarantees. There are two reasons for this. First, approximate methods can effectively reduce the expressivity ofan underlying model by making it impossible to choose parameters that reliably give good predictions. Second, approximations can respond to parameter changes in such a way that standard learning algorithms are misled. In contrast, we give two positive results in the form of learning bounds for the use of LPrelaxed inference in structured perceptron and empirical risk minimization settings. We argue that without understanding combinations of inference and learning, such as these, that are appropriately compatible, learning performance under approximate inference cannot be guaranteed.

artificial intelligence, inference, machine learning, (17 more...)

Country: North America > United States (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)