AITopics

The Karush-Kuhn-Tucker Theorem is used to derive conditions for determining whether or not a given working set is optimal. These conditions become the algorithm)s termination criteria) as an alternative to Osuna)s criteria (also used by Joachims without modification) which used conditions for individual points. The advantage of the new conditions is that knowledge of the hyperplane)s constant factor b) which in some cases is difficult to compute) is not required. Further investigation of the new termination conditions allows to form the strategy for selecting an optimal working set. The new algorithm is applicable to the pattern recognition SVM) and is provably equivalent to Joachims) algorithm. One can also interpret the new algorithm in the sense of the method of feasible directions. Experimental results presented in the last section demonstrate superior performance of the new method in comparison with traditional training of regression SVM. 2 General Principles of Regression SVM Decomposition The original decomposition algorithm proposed for the pattern recognition SVM in [2] has been extended to the regression SVM in [4]. For the sake of completeness I will repeat the main steps of this extension with the aim of providing terse and streamlined notation to lay the ground for working set selection.

algorithm, improved decomposition algorithm, support vector machine, (8 more...)

Country:

North America > United States > Delaware > New Castle County > Newark (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)

Carreira-Perpiñán, Miguel Á.

Reconstruction of Sequential Data with Probabilistic Models and Continuity Constraints

We consider the problem of reconstructing a temporal discrete sequence of multidimensional real vectors when part of the data is missing, under the assumption that the sequence was generated by a continuous process. A particular case of this problem is multivariate regression, which is very difficult when the underlying mapping is one-to-many. We propose an algorithm based on a joint probability model of the variables of interest, implemented using a nonlinear latent variable model. Each point in the sequence is potentially reconstructed as any of the modes of the conditional distribution of the missing variables given the present variables (computed using an exhaustive mode search in a Gaussian mixture). Mode selection is determined by a dynamic programming search that minimises a geometric measure of the reconstructed sequence, derived from continuity constraints. We illustrate the algorithm with a toy example and apply it to a real-world inverse problem, the acoustic-toarticulatory mapping. The results show that the algorithm outperforms conditional mean imputation and multilayer perceptrons. 1 Definition of the problem

conditional distribution, mapping, sequence, (14 more...)

Country:

Europe > United Kingdom > England > South Yorkshire > Sheffield (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > Wisconsin (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Bayesian Averaging is Well-Temperated

Hansen, Lars Kai

Often a learning problem has natural quantitative measure of generalization. If a loss function is defined the natural measure is the generalization error, i.e., the expected loss on a random sample independent of the training set. Generalizability is a key topic of learning theory and much progress has been reported. Analytic results for a broad class of machines can be found in the litterature [8, 12, 9, 10] describing the asymptotic generalization ability of supervised algorithms that are continuously parameterized. Asymptotic bounds on generalization for general machines have been advocated by Vapnik [11]. Generalization results valid for finite training sets can only be obtained for specific learning machines, see e.g.

generalization error, predictive distribution, teacher distribution, (14 more...)

Country:

North America > United States > New York (0.05)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Csató, Lehel, Fokoué, Ernest, Opper, Manfred, Schottky, Bernhard, Winther, Ole

Efficient Approaches to Gaussian Process Classification

The first two methods are related to mean field ideas known in Statistical Physics. The third approach is based on Bayesian online approach which was motivated by recent results in the Statistical Mechanics of Neural Networks. We present simulation results showing: 1. that the mean field Bayesian evidence may be used for hyperparameter tuning and 2. that the online approach may achieve a low training error fast. 1 Introduction Gaussian processes provide promising nonparametric Bayesian approaches to regression and classification [2, 1].

approximation, free energy, likelihood, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Europe > Sweden > Skåne County > Lund (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Hyvärinen, Aapo, Hoyer, Patrik O.

Emergence of Topography and Complex Cell Properties from Natural Images using Extensions of ICA

Independent component analysis of natural images leads to emergence of simple cell properties, Le. linear filters that resemble wavelets or Gabor functions. In this paper, we extend ICA to explain further properties of VI cells.

basis vector, complex cell, subspace, (13 more...)

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Tonkes, Bradley, Blair, Alan, Wiles, Janet

Evolving Learnable Languages

Recent theories suggest that language acquisition is assisted by the evolution of languages towards forms that are easily learnable. In this paper, we evolve combinatorial languages which can be learned by a recurrent neural network quickly and from relatively few examples. Additionally, we evolve languages for generalization in different "worlds", and for generalization from specific examples. We find that languages can be evolved to facilitate different forms of impressive generalization for a minimally biased, general purpose learner. The results provide empirical support for the theory that the language itself, as well as the language environment of a learner, plays a substantial role in learning: that there is far more to language acquisition than the language acquisition device.

decoder, encoder, learner, (15 more...)

Country:

Oceania > Australia > Queensland (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > New York (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Schoner, Holger, Stetter, Martin, Schießl, Ingo, Mayhew, John E. W., Lund, Jennifer S., McLoughlin, Niall, Obermayer, Klaus

Application of Blind Separation of Sources to Optical Recording of Brain Activity

In the analysis of data recorded by optical imaging from intrinsic signals (measurement of changes of light reflectance from cortical tissue) the removal of noise and artifacts such as blood vessel patterns is a serious problem. Often bandpass filtering is used, but the underlying assumption that a spatial frequency exists, which separates the mapping component from other components (especially the global signal), is questionable. Here we propose alternative ways of processing optical imaging data, using blind source separation techniques based on the spatial decorre1ation of the data. We first perform benchmarks on artificial data in order to select the way of processing, which is most robust with respect to sensor noise. We then apply it to recordings of optical imaging experiments from macaque primary visual cortex. We show that our BSS technique is able to extract ocular dominance and orientation preference maps from single condition stacks, for data, where standard post-processing procedures fail. Artifacts, especially blood vessel patterns, can often be completely removed from the maps. In summary, our method for blind source separation using extended spatial decorrelation is a superior technique for the analysis of optical recording data.

algorithm, mapping component, matrix, (13 more...)

Country:

Europe > Germany > Berlin (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Parra, Lucas C., Spence, Clay, Sajda, Paul, Ziehe, Andreas, Müller, Klaus-Robert

Unmixing Hyperspectral Data

In hyperspectral imagery one pixel typically consists of a mixture of the reflectance spectra of several materials, where the mixture coefficients correspond to the abundances of the constituting materials. We assume linear combinations of reflectance spectra with some additive normal sensor noise and derive a probabilistic MAP framework for analyzing hyperspectral data. As the material reflectance characteristics are not know a priori, we face the problem of unsupervised linear unmixing.

abundance, algorithm, endmember, (15 more...)

Country:

North America > United States > Nevada (0.05)
North America > United States > Washington > Whatcom County > Bellingham (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(4 more...)

Industry:

Energy (0.49)
Government > Regional Government > North America Government > United States Government (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)

Mozer, Michael C., Wolniewicz, Richard H., Grimes, David B., Johnson, Eric, Kaushansky, Howard

Churn Reduction in the Wireless Industry

Competition in the wireless telecommunications industry is rampant. To maintain profitability, wireless carriers must control chum, the loss of subscribers who switch from one carrier to another. We explore statistical techniques for chum prediction and, based on these predictions.

incentive, representation, subscriber, (14 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > United States > New York (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Finland (0.04)

Industry: Telecommunications (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Mjolsness, Eric, Mann, Tobias, Castaño, Rebecca, Wold, Barbara J.

From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation among Gene Classes from Large-Scale Expression Data

We provide preliminary evidence that eXlstmg algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continious-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determ ining transcriptional regulatory relationships such as coregulation.

algorithm, inferring transcriptional regulation, time course, (14 more...)

Country:

North America > United States > California > Los Angeles County > Pasadena (0.05)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
Europe > Denmark (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)