AITopics

PCA can be smarter and makes more sensible projections. In this paper, we propose smart PCA, an extension to standard PCA to regularize and incorporate external knowledge into model estimation. Based on the probabilistic interpretation of PCA, the inverse Wishart distribution can be used as the informative conjugate prior for the population covariance, and useful knowledge is carried by the prior hyperparameters. We design the hyperparameters to smoothly combine the information from both the domain knowledge and the data itself. The Bayesian point estimation of principal components is in closed form. In empirical studies, smart PCA shows clear improvement on three different criteria: image reconstruction errors, the perceptual quality of the reconstructed images, and the pattern recognition performance.

pca, principal component, smart pca, (13 more...)

Twenty-First International Joint Conference on Artificial Intelligence

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Multi-Relational Learning with Gaussian Processes

Xu, Zhao (Fraunhofer IAIS) | Kersting, Kristian (Fraunhofer IAIS) | Tresp, Volker (Siemens Corporate Technology)

Due to their flexible nonparametric nature, Gaussian process models are very effective at solving hard machine learning problems. While existing Gaussian process models focus on modeling one single relation, we present a generalized GP model, named multi-relational Gaussian process model, that is able to deal with an arbitrary number of relations in a domain of interest. The proposed model is analyzed in the context of bipartite, directed, and undirected univariate relations. Experimental results on real-world datasets show that exploiting the correlations among different entity types and relations can indeed improve prediction performance.

gaussian process, latent variable, relation, (14 more...)

Twenty-First International Joint Conference on Artificial Intelligence

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Industry: Media > Film (0.30)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Preference Learning with Extreme Examples

In this paper, we consider a general problem of semi-supervised preference learning, in which we assume that we have the information of the extreme cases and some ordered constraints, our goal is to learn the unknown preferences of the other places. Taking the potential housing place selection problem as an example, we have many candidate places together with their associated information (e.g., position, environment), and we know some extreme examples (i.e., several places are perfect for building a house, and several places are the worst that cannot build a house there), and we know some partially ordered constraints (i.e., for two places, which place is better), then how can we judge the preference of one potential place whose preference is unknown beforehand? We propose a Bayesian framework based on Gaussian process to tackle this problem, from which we not only solve for the unknown preferences, but also the hyperparameters contained in our model.

algorithm, ghahramani, information, (16 more...)

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > China > Beijing > Beijing (0.04)

Industry:

Health & Medicine (0.46)
Transportation > Ground (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Semi-Supervised Classification using Sparse Gaussian Process Regression

Patel, Amrish (Indian Institute of Science) | Sundararajan, S. (Yahoo! Labs) | Shevade, Shirish (Indian Institute of Science)

Gaussian Processes (GPs) are promising Bayesian methods for classification and regression problems. They have also been used for semi-supervised learning tasks. In this paper, we propose a new algorithm for solving semi-supervised binary classification problem using sparse GP regression (GPR) models. It is closely related to semi-supervised learning based on support vector regression (SVR) and maximum margin clustering. The proposed algorithm is simple and easy to implement. It gives a sparse solution directly unlike the SVR based algorithm. Also, the hyperparameters are estimated easily without resorting to expensive cross-validation technique. Use of sparse GPR model helps in making the proposed algorithm scalable. Preliminary results on synthetic and real-world data sets demonstrate the efficacy of the new algorithm.

classifier, dataset, experiment, (16 more...)

Twenty-First International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Chen, Yutian (University of California, Irvine) | Welling, Max (University of California, Irvine)

Bayesian Extreme Components Analysis

Extreme Components Analysis (XCA) is a statistical method based on a single eigenvalue decomposition to recover the optimal combination of principal and minor components in the data. Unfortunately, minor components are notoriously sensitive to overfitting when the number of data items is small relative to the number of attributes. We present a Bayesian extension of XCA by introducing a conjugate prior for the parameters of the XCA model. This Bayesian-XCA is shown to outperform plain vanilla XCA as well as Bayesian-PCA and XCA based on a frequentist correction to the sample spectrum. Moreover, we show that minor components are only picked when they represent genuine constraints in the data, even for very small sample sizes. An extension to mixtures of Bayesian XCA models is also explored.

bayesian xca, minor component, xca, (15 more...)

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > California > Orange County > Irvine (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Abreu, Rui (Delft University of Technology) | Zoeteweij, Peter (Delft University of Technology) | Gemund, Arjan J.C. van (Delft University of Technology)

A New Bayesian Approach to Multiple Intermittent Fault Diagnosis

Logic reasoning approaches to fault diagnosis account for the fact that a component c j may fail intermittently by introducing a parameter g j that expresses the probability the component exhibits correct behavior. This component parameter g j , in conjunction with a priori fault probability, is usedin a Bayesian framework to compute the posterior fault candidate probabilities. Usually, information on g j is not known a priori. While proper estimation of g j can have a great impact on the diagnostic accuracy, at present, only approximations have been proposed. We present a novel framework, BARINEL, that computes exact estimations of g j as integral part of the posterior candidate probability computation. BARINEL’s diagnostic performance is evaluated for both synthetic and real software systems. Our results show that our approach is superior to approaches based on classical persistent fault models as well as previously proposed intermittent fault models.

arinel, diagnosis, probability, (17 more...)

Twenty-First International Joint Conference on Artificial Intelligence

Country: Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Preference Functions That Score Rankings and Maximum Likelihood Estimation

Conitzer, Vincent (Duke University) | Rognlie, Matthew (Duke University) | Xia, Lirong (Duke University)

In social choice, a preference function (PF) takes a set of votes (linear orders over a set of alternatives) as input, and produces one or more rankings (also linear orders over the alternatives) as output. Such functions have many applications, for example, aggregating the preferences of multiple agents, or merging rankings (of, say, webpages) into a single ranking. The key issue is choosing a PF to use. One natural and previously studied approach is to assume that there is an unobserved "correct" ranking, and the votes are noisy estimates of this. Then, we can use the PF that always chooses the maximum likelihood estimate (MLE) of the correct ranking. In this paper, we define simple ranking scoring functions (SRSFs) and show that the class of neutral SRSFs is exactly the class of neutral PFs that are MLEs for some noise model. We also define composite ranking scoring functions (CRSFs) and show a condition under which these coincide with SRSFs. We study key properties such as consistency and continuity, and consider some example PFs. In particular, we study Single Transferable Vote (STV), a commonly used PF, showing that it is a CRSF but not an SRSF, thereby clarifying the extent to which it is an MLE function. This also gives a new perspective on how ties should be broken under STV. We leave some open questions.

ranking, srsf, vote, (16 more...)

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > France (0.04)
Asia > Middle East > Lebanon (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Yang, Qiang (Hong Kong Hong Kong University of Science and Technology)

Activity Recognition: Linking Low-Level Sensors to High-Level Intelligence

Sensors provide computer systems with a window to the outside world. Activity recognition "sees" what is in the window to predict the locations, trajectories, actions, goals and plans of humans and objects. Building an activity recognition system requires a full range of interaction from statistical inference on lower level sensor data to symbolic AI at higher levels, where prediction results and acquired knowledge are passed up each level to form a knowledge food chain. In this article, I will give an overview of some of the current activity recognition research works and explore a life-cycle of learning and inference that allows the lowest-level radio-frequency signals to be transformed into symbolic logical representations for AI planning, which in turn controls the robots or guides human users through a sensor network, thus completing a full life-cycle of knowledge.

action model, recognition, sequence, (12 more...)

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > Oregon > Washington County > Hillsboro (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Overview (1.00)

Industry: Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.34)

Technology:

Information Technology > Communications > Networks > Sensor Networks (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)
(2 more...)

Beygelzimer, Alina, Langford, John, Lifshits, Yuri, Sorkin, Gregory, Strehl, Alex

Conditional Probability Tree Estimation Analysis and Algorithms

arXiv.org Artificial IntelligenceJun-3-2009

We consider the problem of estimating the conditional probability of a label in time $O(\log n)$, where $n$ is the number of possible labels. We analyze a natural reduction of this problem to a set of binary regression problems organized in a tree structure, proving a regret bound that scales with the depth of the tree. Motivated by this analysis, we propose the first online algorithm which provably constructs a logarithmic depth tree on the set of labels to solve this problem. We test the algorithm empirically, showing that it works succesfully on a dataset with roughly $10^6$ labels.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

0903.4217

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

arXiv.org Artificial IntelligenceMay-27-2009

Characterizing predictable classes of processes

Ryabko, Daniil

The problem is sequence prediction in the following setting. A sequence $x_1,...,x_n,...$ of discrete-valued observations is generated according to some unknown probabilistic law (measure) $\mu$. After observing each outcome, it is required to give the conditional probabilities of the next observation. The measure $\mu$ belongs to an arbitrary class $\C$ of stochastic processes. We are interested in predictors $\rho$ whose conditional probabilities converge to the "true" $\mu$-conditional probabilities if any $\mu\in\C$ is chosen to generate the data. We show that if such a predictor exists, then a predictor can also be obtained as a convex combination of a countably many elements of $\C$. In other words, it can be obtained as a Bayesian predictor whose prior is concentrated on a countable set. This result is established for two very different measures of performance of prediction, one of which is very strong, namely, total variation, and the other is very weak, namely, prediction in expected average Kullback-Leibler divergence.

artificial intelligence, machine learning, predictor, (14 more...)

arXiv.org Artificial Intelligence

0905.4341

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)