AITopics

We describe a new algorithm, Relaxed Survey Propagation (RSP), for finding MAP configurations in Markov random fields. We compare its performance with state-of-the-art algorithms including the max-product belief propagation, its sequential tree-reweightedvariant, residual (sum-product) belief propagation, and tree-structured expectation propagation. We show that it outperforms all approaches forIsing models with mixed couplings, as well as on a web person disambiguation task formulated as a supervised clustering problem.

artificial intelligence, configuration, machine learning, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Chapados, Nicolas, Bengio, Yoshua

Augmented Functional Time Series Representation and Forecasting with Gaussian Processes

Previous Work Gaussian processes for time-series forecasting have been considered before.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

Country:

North America > United States (0.94)
North America > Canada > Quebec (0.14)

Genre: Research Report (0.46)

Industry:

Banking & Finance > Trading (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Parallelizing Support Vector Machines on Distributed Computers

Zhu, Kaihua, Wang, Hao, Bai, Hongjie, Li, Jian, Qiu, Zhihuan, Cui, Hang, Chang, Edward Y.

Support Vector Machines (SVMs) suffer from a widely recognized scalability problem in both memory use and computational time. To improve scalability, we have developed a parallel SVM algorithm (PSVM), which reduces memory use through performing a row-based, approximate matrix factorization, and which loads only essential data to each machine to perform parallel computation. Let $n$ denote the number of training instances, $p$ the reduced matrix dimension after factorization ($p$ is significantly smaller than $n$), and $m$ the number of machines. PSVM reduces the memory requirement from $\MO$($n^2$) to $\MO$($np/m$), and improves computation time to $\MO$($np^2/m$). Empirical studies on up to $500$ computers shows PSVM to be effective.

artificial intelligence, machine learning, speedup, (16 more...)

Country: Asia (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Cayton, Lawrence, Dasgupta, Sanjoy

A learning framework for nearest neighbor search

Can we leverage learning techniques to build a fast nearest-neighbor (NN) retrieval data structure? We present a general learning framework for the NN problem in which sample queries are used to learn the parameters of a data structure that minimize the retrieval time and/or the miss rate. We explore the potential of this novel framework through two popular NN data structures: KD-trees and the rectilinear structures employed by locality sensitive hashing. We derive a generalization theory for these data structure classes and present simple learning algorithms for both. Experimental results reveal that learning often improves on the already strong performance of these data structures.

information retrieval, machine learning, natural language, (17 more...)

Country: North America > United States > California (0.14)

Genre: Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.41)

Richardson, Fred, Campbell, William M.

Discriminative Keyword Selection Using Support Vector Machines

Many tasks in speech processing involve classification of long term characteristics of a speech segment such as language, speaker, dialect, or topic. A natural technique fordetermining these characteristics is to first convert the input speech into a sequence of tokens such as words, phones, etc. From these tokens, we can then look for distinctive sequences, keywords, that characterize the speech. In many applications, a set of distinctive keywords may not be known a priori. In this case, an automatic method of building up keywords from short context units such as phones is desirable. We propose a method for the construction of keywords based upon Support Vector Machines. We cast the problem of keyword selection as a feature selection problem for n-grams of phones. We propose an alternating filter-wrappermethod that builds successively longer keywords. Application of this method to language recognition and topic recognition tasks shows that the technique produces interesting and significant qualitative and quantitative results.

artificial intelligence, keyword, machine learning, (14 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Industry: Government > Regional Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Burghouts, Gertjan, Smeulders, Arnold, Geusebroek, Jan-mark

The Distribution Family of Similarity Distances

Assessing similarity between features is a key step in object recognition and scene categorization tasks. We argue that knowledge on the distribution of distances generated by similarity functions is crucial in deciding whether features are similar or not. Intuitively one would expect that similarities between features could arise from any distribution. In this paper, we will derive the contrary, and report the theoretical result that $L_p$-norms --a class of commonly applied distance metrics-- from one feature vector to other vectors are Weibull-distributed if the feature values are correlated and non-identically distributed. Besides these assumptions being realistic for images, we experimentally show them to hold for various popular feature extraction algorithms, for a diverse range of images. This fundamental insight opens new directions in the assessment of feature similarity, with projected improvements in object and scene recognition algorithms. Erratum: The authors of paper have declared that they have become convinced that the reasoning in the reference is too simple as a proof of their claims. As a consequence, they withdraw their theorems.

artificial intelligence, data mining, machine learning, (19 more...)

Country: Europe > Netherlands (0.14)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Data Science > Data Mining > Feature Extraction (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.42)

Bradley, Joseph K., Schapire, Robert E.

FilterBoost: Regression and Classification on Large Datasets

We study boosting in the filtering setting, where the booster draws examples from an oracle instead of using a fixed training set and so may train efficiently on very large datasets. Our algorithm, which is based on a logistic regression technique proposed by Collins, Schapire, & Singer, requires fewer assumptions to achieve bounds equivalent to or better than previous work. Moreover, we give the first proof that the algorithm of Collins et al. is a strong PAC learner, albeit within the filtering setting. Our proofs demonstrate the algorithm's strong theoretical properties forboth classification and conditional probability estimation, and we validate these results through extensive experiments. Empirically, our algorithm proves more robust to noise and overfitting than batch boosters in conditional probability estimation and proves competitive in classification.

artificial intelligence, filterboost, machine learning, (18 more...)

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Bottou, Léon, Bousquet, Olivier

The Tradeoffs of Large Scale Learning

This contribution develops a theoretical framework that takes into account the effect of approximate optimization on learning algorithms. The analysis shows distinct tradeoffs for the case of small-scale and large-scale learning problems. Small-scale learning problems are subject to the usual approximation--estimation tradeoff. Large-scale learning problems are subject to a qualitatively different tradeoff involving the computational complexity of the underlying optimization algorithms in non-trivial ways.

algorithm, artificial intelligence, machine learning, (17 more...)

Country:

North America > United States (0.68)
Europe > Switzerland > Zürich > Zürich (0.14)

Industry: Education > Focused Education > Special Education (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Bonilla, Edwin V., Chai, Kian M., Williams, Christopher

Multi-task Gaussian Process Prediction

In this paper we investigate multi-task learning in the context of Gaussian Processes (GP).We propose a model that learns a shared covariance function on input-dependent features and a "free-form" covariance matrix over tasks. This allows forgood flexibility when modelling inter-task dependencies while avoiding the need for large amounts of data for training. We show that under the assumption ofnoise-free observations and a block design, predictions for a given task only depend on its target values and therefore a cancellation of inter-task transfer occurs.We evaluate the benefits of our model on two practical applications: a compiler performance prediction problem and an exam score prediction task. Additionally, we make use of GP approximations and properties of our model in order to provide scalability to large data sets.

artificial intelligence, covariance function, machine learning, (14 more...)

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.46)

Industry: Education (0.69)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Feature Selection Methods for Improving Protein Structure Prediction with Rosetta

Blum, Ben, Baker, David, Jordan, Michael I., Bradley, Philip, Das, Rhiju, Kim, David E.

Rosetta is one of the leading algorithms for protein structure prediction today. It is a Monte Carlo energy minimization method requiring many random restarts to find structures with low energy. In this paper we present a resampling technique for structure prediction of small alpha/beta proteins using Rosetta. From an initial roundof Rosetta sampling, we learn properties of the energy landscape that guide a subsequent round of sampling toward lower-energy structures. Rather than attempt to fit the full energy landscape, we use feature selection methods--both L1-regularized linear regression and decision trees--to identify structural features that give rise to low energy. We then enrich these structural features in the second sampling round. Results are presented across a benchmark set of nine small alpha/beta proteinsdemonstrating that our methods seldom impair, and frequently improve, Rosetta's performance.

artificial intelligence, machine learning, rosetta, (19 more...)

Country:

North America > United States (0.46)
Asia > Middle East > Jordan (0.14)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)