AITopics

Exact inference in large, richly connected noisy-OR networks is intractable, and most approximate inference algorithms tend to concentrate on a small number of most probable configurations of the hidden variables under the posterior. We presented an "inclusive" variational method for bipartite noisy-OR networks that favors including all probable configurations, at the cost of including some improbable configurations. The method fits a tree to the posterior distribution sequentially, i.e., one observation at a time. Results on an ensemble of QMR-DT type networks show that the method performs better than local probability propagation and a variational upper bound for ranking most probable diseases.

algorithm, inference, symptom, (12 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.09)
(3 more...)

Industry: Health & Medicine > Diagnostic Medicine (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Frey, Brendan J., Kannan, Anitha

Accumulator Networks: Suitors of Local Probability Propagation

One way to approximate inference in richly-connected graphical models is to apply the sum-product algorithm (a.k.a. The sum-product algorithm can be directly applied in Gaussian networks and in graphs for coding, but for many conditional probability functions - including the sigmoid function - direct application of the sum-product algorithm is not possible. We introduce "accumulator networks" that have low local complexity (but exponential global complexity) so the sum-product algorithm can be directly applied. In an accumulator network, the probability of a child given its parents is computed by accumulating the inputs from the parents in a Markov chain or more generally a tree. After giving expressions for inference and learning in accumulator networks, we give results on the "bars problem" and on the problem of extracting translated, overlapping faces from an image. 1 Introduction Graphical probability models with hidden variables are capable of representing complex dependencies between variables, filling in missing data and making Bayesoptimal decisions using probabilistic inferences (Hinton and Sejnowski 1986; Pearl 1988; Neal 1992).

accumulator network, inference, probability, (15 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.05)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.91)

Discovering Hidden Variables: A Structure-Based Approach

Elidan, Gal, Lotner, Noam, Friedman, Nir, Koller, Daphne

A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. As such, they induce seemingly complex dependencies among the latter. In recent years, much attention has been devoted to the development of algorithms for learning parameters, and in some cases structure, in the presence of hidden variables. In this paper, we address the related problem of detecting hidden variables that interact with the observed variables.

algorithm, bayesian network, procedure, (14 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)

Dugas, Charles, Bengio, Yoshua, Bélisle, François, Nadeau, Claude, Garcia, René

Incorporating Second-Order Functional Knowledge for Better Option Pricing

Incorporating prior knowledge of a particular task into the architecture of a learning algorithm can greatly improve generalization performance. We study here a case where we know that the function to be learned is non-decreasing in two of its arguments and convex in one of them. For this purpose we propose a class of functions similar to multi-layer neural networks but (1) that has those properties, (2) is a universal approximator of continuous functions with these and other properties. We apply this new class of functions to the task of modeling the price of call options. Experiments show improvements on regressing the price of call options using the new types of function classes that incorporate the a priori constraints.

derivative, neural network, universal approximator, (9 more...)

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > Massachusetts > Middlesex County > Medford (0.04)

Industry: Banking & Finance > Trading (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Domeniconi, Carlotta, Peng, Jing, Gunopulos, Dimitrios

An Adaptive Metric Machine for Pattern Classification

Nearest neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with finite samples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. We propose a locally adaptive nearest neighbor classification method to try to minimize bias. We use a Chi-squared distance analysis to compute a flexible metric for producing neighborhoods that are elongated along less relevant feature dimensions and constricted along most influential ones. As a result, the class conditional probabilities tend to be smoother in the modified neighborhoods, whereby better classification performance can be achieved. The efficacy of our method is validated and compared against other techniques using a variety of real world data. 1 Introduction

error rate, neighborhood, probability, (15 more...)

Country:

North America > United States > Oklahoma > Payne County > Stillwater (0.14)
North America > United States > California > Riverside County > Riverside (0.14)
North America > United States > South Carolina > Beaufort County > Hilton Head Island (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Dayan, Peter, Kakade, Sham

Explaining Away in Weight Space

Explaining away has mostly been considered in terms of inference of states in belief networks. We show how it can also arise in a Bayesian context in inference about the weights governing relationships such as those between stimuli and reinforcers in conditioning experiments such as bacA, 'Ward blocking. We show how explaining away in weight space can be accounted for using an extension of a Kalman filter model; provide a new approximate way of looking at the Kalman gain matrix as a whitener for the correlation matrix of the observation process; suggest a network implementation of this whitener using an architecture due to Goodall; and show that the resulting model exhibits backward blocking.

conditioning, covariance matrix, kalman filter, (14 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Csató, Lehel, Opper, Manfred

Sparse Representation for Gaussian Process Models

We develop an approach for a sparse representation for Gaussian Process (GP) models in order to overcome the limitations of GPs caused by large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world data sets indicate the efficiency of the approach.

approximation, gaussian process, vector, (15 more...)

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Modeling & Simulation (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Cohn, David A., Hofmann, Thomas

The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity

We describe a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is based on a probabilistic factor decomposition and allows identifying principal topics of the collection as well as authoritative documents within those topics. Furthermore, the relationships between topics is mapped out in order to build a predictive model of link content. Among the many applications of this approach are information retrieval and search, topic identification, query disambiguation, focused web crawling, web authoring, and bibliometric analysis.

joint model, probability, reference flow, (14 more...)

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Chen, Scott Saobing, Gopinath, Ramesh A.

Gaussianization

High dimensional data modeling is difficult mainly because the so-called "curse of dimensionality". We propose a technique called "Gaussianization" for high dimensional density estimation, which alleviates the curse of dimensionality by exploiting the independence structures in the data. Gaussianization is motivated from recent developments in the statistics literature: projection pursuit, independent component analysis and Gaussian mixture models with semi-tied covariances. We propose an iterative Gaussianization procedure which converges weakly: at each iteration, the data is first transformed to the least dependent coordinates and then each coordinate is marginally Gaussianized by univariate techniques. Gaussianization offers density estimation sharper than traditional kernel methods and radial basis function methods.

algorithm, gaussianization, iteration, (13 more...)

Country: North America > United States > New York > Suffolk County > East Setauket (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)

Chapelle, Olivier, Weston, Jason, Bottou, Léon, Vapnik, Vladimir

Vicinal Risk Minimization

The Vicinal Risk Minimization principle establishes a bridge between generative models and methods derived from the Structural Risk Minimization Principle such as Support Vector Machines or Statistical Regularization. We explain how VRM provides a framework which integrates a number of existing algorithms, such as Parzen windows, Support Vector Machines, Ridge Regression, Constrained Logistic Classifiers and Tangent-Prop. We then show how the approach implies new algorithms for solving problems usually associated with generative models. New algorithms are described for dealing with pattern recognition problems with very different pattern distributions and dealing with unlabeled data. Preliminary empirical results are presented.

algorithm, logistic classifier, vrm, (11 more...)

Country:

North America > United States > Wisconsin (0.05)
North America > United States > New York (0.04)
North America > United States > Georgia > Chatham County > Savannah (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)