AITopics

In supervised learning there is a typical presumption that the training and test points are taken from the same distribution. In practice this assumption is commonly violated. The situations where the training and test data are from different distributions is called covariate shift. Recent work has examined techniques for dealing with covariate shift in terms of minimisation of generalisation error. As yet the literature lacks a Bayesian generative perspective on this problem. This paper tackles this issue for regression models. Recent work on covariate shift can be understood in terms of mixture regression. Using this view, we obtain a general approach to regression under covariate shift, which reproduces previous work as a special case. The main advantages of this new formulation over previous models for covariate shift are that we no longer need to presume the test and training densities are known, the regression and density estimation are combined into a single procedure, and previous methods are reproduced as special cases of this procedure, shedding light on the implicit assumptions the methods are making.

covariate shift, regressor, test data, (15 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Taylor, Graham W., Hinton, Geoffrey E., Roweis, Sam T.

Modeling Human Motion Using Binary Latent Variables

We propose a nonlinear generative model for human motion data that uses an undirected model with binary latent variables and real-valued "visible" variables that represent joint angles. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. Such an architecture makes online inference efficient and allows us to use a simple approximate learning procedure. After training, the model finds a single set of parameters that simultaneously capture several different kinds of motion. We demonstrate the power of our approach by synthesizing various motion sequences and by performing online filling in of data lost during motion capture.

sequence, transition, visible unit, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Simonotto, Enrico, Whalley, Heather, Lawrie, Stephen, Murray, Lawrence, Mcgonigle, David, Storkey, Amos J.

Learning Structural Equation Models for fMRI

Structural equation models can be seen as an extension of Gaussian belief networks to cyclic graphs, and we show they can be understood generatively as the model for the joint distribution of long term average equilibrium activity of Gaussian dynamic belief networks. Most use of structural equation models in fMRI involves postulating a particular structure and comparing learnt parameters across different groups. In this paper it is argued that there are situations where priors about structure are not firm or exhaustive, and given sufficient data, it is worth investigating learning network structure as part of the approach to connectivity analysis. First we demonstrate structure learning on a toy problem. We then show that for particular fMRI data the simple models usually assumed are not supported. We show that is is possible to learn sensible structural equation models that can provide modelling benefits, but that are not necessarily going to be the same as a true causal model, and suggest the combination of prior models and learning or the use of temporal information from dynamic models may provide more benefits than learning structural equations alone.

connectivity, equation model, structural equation model, (13 more...)

Country:

Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Steinwart, Ingo, Hush, Don, Scovel, Clint

An Oracle Inequality for Clipped Regularized Risk Minimizers

We establish a general oracle inequality for clipped approximate minimizers of regularized empirical risks and apply this inequality to support vector machine (SVM) type algorithms. We then show that for SVMs using Gaussian RBF kernels for classification this oracle inequality leads to learning rates that are faster than the ones established in [9]. Finally, we use our oracle inequality to show that a simple parameter selection approach based on a validation set can yield the same fast learning rates without knowing the noise exponents which were required to be known a-priori in [9].

inequality, oracle inequality, theorem 2, (12 more...)

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.57)

Steinke, Florian, Blanz, Volker, Schölkopf, Bernhard

Learning Dense 3D Correspondence

Establishing correspondence between distinct objects is an important and nontrivial task: correctness of the correspondence hinges on properties which are difficult to capture in an a priori criterion. While previous work has used a priori criteria which in some cases led to very good results, the present paper explores whether it is possible to learn a combination of features that, for a given training set of aligned human heads, characterizes the notion of correct correspondence. By optimizing this criterion, we are then able to compute correspondence and morphs for novel heads.

algorithm, correspondence, feature function, (16 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Siegen (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Sohn, Kyung-ah, Xing, Eric P.

Hidden Markov Dirichlet Process: Modeling Genetic Recombination in Open Ancestral Space

We present a new statistical framework called hidden Markov Dirichlet process (HMDP) to jointly model the genetic recombinations among possibly infinite number of founders and the coalescence-with-mutation events in the resulting genealogies. The HMDP posits that a haplotype of genetic markers is generated by a sequence of recombination events that select an ancestor for each locus from an unbounded set of founders according to a 1st-order Markov transition process. Conjoining this process with a mutation model, our method accommodates both between-lineage recombination and within-lineage sequence variations, and leads to a compact and natural interpretation of the population structure and inheritance process underlying haplotype data. We have developed an efficient sampling algorithm for HMDP based on a two-level nested Pólya urn scheme. On both simulated and real SNP haplotype data, our method performs competitively or significantly better than extant methods in uncovering the recombination hotspots along chromosomal loci; and in addition it also infers the ancestral genetic patterns and offers a highly accurate map of ancestral compositions of modern populations.

ancestor, haplotype, hmdp, (14 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Utah (0.04)
(2 more...)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Sing, Tobias, Beerenwinkel, Niko

Mutagenetic tree Fisher kernel improves prediction of HIV drug resistance from viral genotype

Starting with the work of Jaakkola and Haussler, a variety of approaches have been proposed for coupling domain-specific generative models with statistical learning methods. The link is established by a kernel function which provides a similarity measure based inherently on the underlying model. In computational biology, the full promise of this framework has rarely ever been exploited, as most kernels are derived from very generic models, such as sequence profiles or hidden Markov models. Here, we introduce the MTreeMix kernel, which is based on a generative model tailored to the underlying biological mechanism.

fisher kernel, kernel, resistance, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Shelton, Christian R., Huie, Wesley, Kan, Kin F.

Chained Boosting

We describe a method to learn to make sequential stopping decisions, such as those made along a processing pipeline. We envision a scenario in which a series of decisions must be made as to whether to continue to process. Further processing costs time and resources, but may add value. Our goal is to create, based on historic data, a series of decision rules (one at each stage in the pipeline) that decide, based on information gathered up to that point, whether to continue processing the part. We demonstrate how our framework encompasses problems from manufacturing to vision processing. We derive a quadratic (in the number of decisions) bound on testing performance and provide empirical results on object detection.

algorithm, classifier, resolution, (17 more...)

Country:

North America > United States > California > Riverside County > Riverside (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)

Shan, Honghao, Zhang, Lingyun, Cottrell, Garrison W.

Recursive ICA

Independent Component Analysis (ICA) is a popular method for extracting independent features from visual data. However, as a fundamentally linear technique, there is always nonlinear residual redundancy that is not captured by ICA. Hence there have been many attempts to try to create a hierarchical version of ICA, but so far none of the approaches have a natural way to apply them more than once. Here we show that there is a relatively simple technique that transforms the absolute values of the outputs of a previous application of ICA into a normal distribution, to which ICA maybe applied again. This results in a recursive ICA algorithm that may be applied any number of times in order to extract higher order structure from previous layers.

algorithm, basis function, marginal distribution, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)

Industry: Health & Medicine (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Shalev-shwartz, Shai, Singer, Yoram

Convex Repeated Games and Fenchel Duality

We describe an algorithmic framework for an abstract game which we term a convex repeated game. We show that various online learning and boosting algorithms can be all derived as special cases of our algorithmic framework. This unified view explains the properties of existing algorithms and also enables us to derive several new interesting algorithms. Our algorithmic framework stems from a connection that we build between the notions of regret in game theory and weak duality in convex optimization.

algorithm, convex, hypothesis, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Industry: Leisure & Entertainment (0.48)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)