AITopics

We provide some insights into how task correlations in multi-task Gaussian process (GP) regression affect the generalization error and the learning curve. We analyze the asymmetric two-task case, where a secondary task is to help the learning of a primary task. Within this setting, we give bounds on the generalization error and the learning curve of the primary task. Our approach admits intuitive understandings of the multi-task GP by relating it to single-task GPs. For the case of one-dimensional input-space under optimal sampling with data only for the secondary task, the limitations of multi-task GP can be quantified explicitly.

generalization error, posterior variance, variance, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Efficient Bregman Range Search

Cayton, Lawrence

We develop an algorithm for efficient range search when the notion of dissimilarity is given by a Bregman divergence. The range search task is to return all points in a potentially large database that are within some specified distance of a query. It arises in many learning algorithms such as locally-weighted regression, kernel density estimation, neighborhood graph-based algorithms, and in tasks like outlier detection and information retrieval. In metric spaces, efficient range search-like algorithms based on spatial data structures have been deployed on a variety of statistical tasks. Here we describe the first algorithm for range search for an arbitrary Bregman divergence. This broad class of dissimilarity measures includes the relative entropy, Mahalanobis distance, Itakura-Saito divergence, and a variety of matrix divergences. Metric methods cannot be directly applied since Bregman divergences do not in general satisfy the triangle inequality. We derive geometric properties of Bregman divergences that yield an efficient algorithm for range search based on a recently proposed space decomposition for Bregman divergences.

algorithm, bregman divergence, divergence, (16 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Russia (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Russia (0.04)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Brasselet, Romain, Johansson, Roland, Arleo, Angelo

Optimal context separation of spiking haptic signals by second-order somatosensory neurons

We study an encoding/decoding mechanism accounting for the relative spike timing of the signals propagating from peripheral nerve fibers to second-order somatosensory neurons in the cuneate nucleus (CN). The CN is modeled as a population of spiking neurons receiving as inputs the spatiotemporal responses of real mechanoreceptors obtained via microneurography recordings in humans. The efficiency of the haptic discrimination process is quantified by a novel definition of entropy that takes into full account the metrical properties of the spike train space. This measure proves to be a suitable decoding scheme for generalizing the classical Shannon entropy to spike-based neural codes. It permits an assessment of neurotransmission in the presence of a large output space (i.e. hundreds of spike trains) with 1 ms temporal precision. It is shown that the CN population code performs a complete discrimination of 81 distinct stimuli already within 35 ms of the first afferent spike, whereas a partial discrimination (80% of the maximum information transmission) is possible as rapidly as 15 ms. This study suggests that the CN may not constitute a mere synaptic relay along the somatosensory pathway but, rather, it may convey optimal contextual accounts (in terms of fast and reliable information transfer) of peripheral tactile inputs to downstream structures of the central nervous system.

discrimination, information, neuron, (15 more...)

Country:

North America > United States (0.14)
Europe > Sweden > Västerbotten County > Umeå (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Bouchard-côté, Alexandre, Petrov, Slav, Klein, Dan

Randomized Pruning: Efficiently Calculating Expectations in Large Dynamic Programs

Pruning can massively accelerate the computation of feature expectations in large models. However, any single pruning mask will introduce bias. We present a novel approach which employs a randomized sequence of pruning masks. Formally, we apply auxiliary variable MCMC sampling to generate this sequence of masks, thereby gaining theoretical guarantees about convergence. Because each mask is generally able to skip large portions of an underlying dynamic program, our approach is particularly compelling for high-degree algorithms. Empirically, we demonstrate our method on bilingual parsing, showing decreasing bias as more masks are incorporated, and outperforming fixed tic-tac-toe pruning.

auxiliary variable, computation, dynamic program, (17 more...)

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.48)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Amini, Massih, Usunier, Nicolas, Goutte, Cyril

Learning from Multiple Partially Observed Views - an Application to Multilingual Text Categorization

We address the problem of learning classifiers when observations have multiple views, some of which may not be observed for all examples. We assume the existence of view generating functions which may complete the missing views in an approximate way. This situation corresponds for example to learning text classifiers from multilingual collections where documents are not available in all languages. In that case, Machine Translation (MT) systems may be used to translate each document in the missing languages. We derive a generalization error bound for classifiers learned on examples with multiple artificially created views. Our result uncovers a trade-off between the size of the training set, the number of views, and the quality of the view generating functions. As a consequence, we identify situations where it is more interesting to use multiple views for learning instead of classical single view learning. An extension of this framework is a natural way to leverage unlabeled multi-view data in semi-supervised learning. Experimental results on a subset of the Reuters RCV1/RCV2 collections support our findings by showing that additional views obtained from MT may significantly improve the classification performance in the cases identified by our trade-off.

classifier, view generating function, view-specific classifier, (14 more...)

Country:

North America > Canada (0.04)
North America > United States > New York (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Ailon, Nir, Jaiswal, Ragesh, Monteleoni, Claire

Streaming k-means approximation

We provide a clustering algorithm that approximately optimizes the k-means objective, in the one-pass streaming setting. We make no assumptions about the data, and our algorithm is very light-weight in terms of memory, and computation. This setting is applicable to unsupervised learning on massive data sets, or resource-constrained devices. The two main ingredients of our theoretical work are: a derivation of an extremely simple pseudo-approximation batch algorithm for k-means, in which the algorithm is allowed to output more than k centers (based on the recent k-means++"), and a streaming clustering algorithm in which batch clustering algorithms are performed on small inputs (fitting in memory) and combined in a hierarchical manner. Empirical evaluations on real and simulated data reveal the practical utility of our method."

algorithm, approximation algorithm, k-means objective, (16 more...)

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.05)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Pennsylvania (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Agarwal, Alekh, Wainwright, Martin J., Bartlett, Peter L., Ravikumar, Pradeep K.

Information-theoretic lower bounds on the oracle complexity of convex optimization

Despite the large amount of literature on upper bounds on complexity of convex analysis, surprisingly little is known about the fundamental hardness of these problems. The extensive use of convex optimization in machine learning and statistics makes such an understanding critical to understand fundamental computational limits of learning and estimation. In this paper, we study the complexity of stochastic convex optimization in an oracle model of computation. We improve upon known results and obtain tight minimax complexity estimates for some function classes. We also discuss implications of these results to the understanding the inherent complexity of large-scale learning and estimation problems.

complexity, optimization, oracle, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Nguyen-tuong, Duy, Peters, Jan R., Seeger, Matthias

Local Gaussian Process Regression for Real Time Online Model Learning

Learning in real-time applications, e.g., online approximation of the inverse dynamics model for model-based robot control, requires fast online regression techniques. Inspired by local learning, we propose a method to speed up standard Gaussian Process regression (GPR) with local GP models (LGP). The training data is partitioned in local regions, for each an individual GP model is trained. The prediction for a query point is performed by weighted estimation using nearby local models. Unlike other GP approximations, such as mixtures of experts, we use a distance based measure for partitioning of the data and weighted prediction. The proposed method achieves online learning and prediction in real-time. Comparisons with other nonparametric regression methods show that LGP has higher accuracy than LWPR and close to the performance of standard GPR and nu-SVR.

artificial intelligence, local model, machine learning, (16 more...)

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Instructional Material > Online (0.41)

Industry: Education (0.58)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Levy, Roger P., Reali, Florencia, Griffiths, Thomas L.

Modeling the effects of memory on human online sentence processing with particle filters

Language comprehension in humans is significantly constrained by memory, yet rapid, highly incremental, and capable of utilizing a wide range of contextual information to resolve ambiguity and form expectations about future input. In contrast, most of the leading psycholinguistic models and fielded algorithms for natural language parsing are non-incremental, have run time superlinear in input length, and/or enforce structural locality constraints on probabilistic dependencies between events. We present a new limited-memory model of sentence comprehension which involves an adaptation of the particle filter, a sequential Monte Carlo method, to the problem of incremental parsing. We show that this model can reproduce classic results in online sentence comprehension, and that it naturally provides the first rational account of an outstanding problem in psycholinguistics, in which the preferred alternative in a syntactic ambiguity seems to grow more attractive over time even in the absence of strong disambiguating information.

machine learning, natural language, particle filter, (19 more...)

Country: North America > United States > California (0.28)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Kuksa, Pavel P., Huang, Pai-hsi, Pavlovic, Vladimir

Scalable Algorithms for String Kernels with Inexact Matching

We present a new family of linear time algorithms based on sufficient statistics for string comparison with mismatches under the string kernels framework. Our algorithms improve theoretical complexity bounds of existing approaches while scaling well with respect to the sequence alphabet size, the number of allowed mismatches and the size of the dataset. In particular, on large alphabets with loose mismatch constraints our algorithms are several orders of magnitude faster than the existing algorithms for string comparison under the mismatch similarity measure. We evaluate our algorithms on synthetic data and real applications in music genre classification, protein remote homology detection and protein fold prediction. The scalability of the algorithms allows us to consider complex sequence transformations, modeled using longer string features and larger numbers of mismatches, leading to a state-of-the-art performance with significantly reduced running times.

artificial intelligence, data mining, machine learning, (15 more...)

Country: North America > United States (0.29)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.96)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)