AITopics | Statistical Learning

Statistical Learning

How Experience of the Body Shapes Language about Space

Steels, Luc L. (Sony Computer Science Laboratory) | Spranger, Michael (Sony Computer Science Laboratory Paris)

AAAI Conferences, Jun-23-2009

Open-ended language communication remains an enormous challenge for autonomous robots. This paper argues that the notion of a language strategy is the appropriate vehicle for addressing this challenge. A language strategy packages all the procedures that are necessary for playing a language game. We present a specific example of a language strategy for playing an Action Game in which one robot asks another robot to take on a body posture (such as stand or sit), and show how it effectively allows a population of agents to self-organise a perceptually grounded ontology and a lexicon from scratch, without any human intervention. Next, we show how a new language strategy can arise by exaptation from an existing one, concretely, how the body posture strategy can be exapted to a strategy for playing language games about the spatial position of objects (as in "the bottle stands on the table").

body posture, language strategy, robot, (14 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.66)

Industry: Leisure & Entertainment > Games (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.36)

Machine Learning in Ecosystem Informatics and Sustainability

Dietterich, Thomas G. (Oregon State University)

AAAI Conferences, Jun-23-2009

Ecosystem Informatics brings together mathematical and computational tools to address scientific and policy challenges in the ecosystem sciences. These challenges include novel sensors for collecting data, algorithms for automated data cleaning, learning methods for building statistical models from data and for fitting mechanistic models to data, and algorithms for designing optimal policies for biosphere management. This presentation discusses these challenges and then describes recent work on the first two of these: new methods for automated arthropod population counting and linear Gaussian DBNs for automated cleaning of sensor network data.

ecosystem informatics, histogram, sensor, (14 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > Oregon (0.05)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)

The Feature Importance Ranking Measure

Zien, Alexander, Kraemer, Nicole, Sonnenburg, Soeren, Raetsch, Gunnar

arXiv.org Machine Learning, Jun-23-2009

The most accurate predictions are typically obtained by learning machines with complex feature spaces (e.g., as induced by kernels). Unfortunately, such decision rules are hardly accessible to humans and cannot easily be used to gain insights about the application domain. Therefore, one often resorts to linear models in combination with variable selection, thereby sacrificing some predictive power for presumptive interpretability. Here, we introduce the Feature Importance Ranking Measure (FIRM), which, by retrospective analysis of arbitrary learning machines, makes it possible to achieve both excellent predictive performance and superior interpretation. In contrast to standard raw feature weighting, FIRM takes the underlying correlation structure of the features into account. Thereby, it is able to discover the most relevant features, even if their appearance in the training data is entirely prevented by noise. The desirable properties of FIRM are investigated analytically and illustrated in simulations.

artificial intelligence, feature importance, machine learning, (15 more...)

arXiv.org Machine Learning

0906.4258

Genre: Research Report (0.90)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

KNIFE: Kernel Iterative Feature Extraction

Allen, Genevera I.

arXiv.org Machine Learning, Jun-23-2009

Selecting important features in non-linear or kernel spaces is a difficult challenge in both classification and regression problems. When many of the features are irrelevant, kernel methods such as the support vector machine and kernel ridge regression can sometimes perform poorly. We propose weighting the features within a kernel with a sparse set of weights that are estimated in conjunction with the original classification or regression problem. The iterative algorithm, KNIFE, alternates between finding the coefficients of the original problem and finding the feature weights through kernel linearization. In addition, a slight modification of KNIFE yields an efficient algorithm for finding feature regularization paths, or the paths of each feature's weight. Simulation results demonstrate the utility of KNIFE for both kernel regression and support vector machines with a variety of kernels. Feature path realizations also reveal important non-linear correlations among features that prove useful in determining a subset of significant variables. Results on vowel recognition data, Parkinson's disease data, and microarray data are also given.
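The idea of weighting features inside a kernel can be illustrated with a small sketch. The abstract does not give the exact form of the weighted kernel, so the per-feature weighted RBF kernel below is an assumption, and the function name `weighted_rbf_kernel` and the `gamma` parameter are hypothetical. The sketch only shows how a zero weight removes a feature from the kernel; it does not implement KNIFE's alternating estimation of coefficients and weights.

```python
import math


def weighted_rbf_kernel(x, z, weights, gamma=1.0):
    """RBF kernel with a non-negative weight on each feature's squared difference.

    Assumed form (not taken from the paper):
        k(x, z) = exp(-gamma * sum_j weights[j] * (x[j] - z[j])**2)
    """
    if not (len(x) == len(z) == len(weights)):
        raise ValueError("x, z and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("feature weights must be non-negative")
    sq_dist = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z))
    return math.exp(-gamma * sq_dist)


# Two points that agree on feature 0 and differ strongly on feature 1.
x = [1.0, 5.0]
z = [1.0, -3.0]

k_all = weighted_rbf_kernel(x, z, weights=[1.0, 1.0])     # both features count
k_sparse = weighted_rbf_kernel(x, z, weights=[1.0, 0.0])  # feature 1 switched off
```

With a sparse weight vector, the two points look identical to the kernel once the second feature is given weight zero, which is the effect a sparse set of feature weights is meant to have on irrelevant features.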

artificial intelligence, kernel, machine learning, (17 more...)

arXiv.org Machine Learning

0906.4391

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.87)
Health & Medicine > Therapeutic Area (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Transductive Rademacher Complexity and its Applications

El-Yaniv, R., Pechyony, D.

Journal of Artificial Intelligence Research, Jun-22-2009

We develop a technique for deriving data-dependent error bounds for transductive learning algorithms based on transductive Rademacher complexity. Our technique is based on a novel general error bound for transduction in terms of transductive Rademacher complexity, together with a novel bounding technique for Rademacher averages for particular algorithms, in terms of their "unlabeled-labeled" representation. This technique is relevant to many advanced graph-based transductive algorithms, and we demonstrate its effectiveness by deriving error bounds for three well-known algorithms. Finally, we present a new PAC-Bayesian bound for mixtures of transductive algorithms based on our Rademacher bounds.

algorithm, inequality, rademacher complexity, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2587

AI Access Foundation

10608

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Forest Garrote

Meinshausen, Nicolai

arXiv.org Machine Learning, Jun-19-2009

Variable selection for high-dimensional linear models has received a lot of attention lately, mostly in the context of l1-regularization. Part of the attraction is the variable selection effect: parsimonious models are obtained, which are very suitable for interpretation. In terms of predictive power, however, these regularized linear models are often slightly inferior to machine learning procedures like tree ensembles. Tree ensembles, on the other hand, usually lack a formal way of variable selection and are difficult to visualize. A Garrote-style convex penalty for tree ensembles, in particular Random Forests, is proposed. The penalty selects functional groups of nodes in the trees. These could be as simple as monotone functions of individual predictor variables. This yields a parsimonious function fit, which lends itself easily to visualization and interpretation. The predictive power is maintained at least at the same level as the original tree ensemble. A key feature of the method is that, once a tree ensemble is fitted, no further tuning parameter needs to be selected. The empirical performance is demonstrated on a wide array of datasets.

artificial intelligence, forest garrote, machine learning, (19 more...)

arXiv.org Machine Learning

0906.3590

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

P-values for high-dimensional regression

Meinshausen, Nicolai, Meier, Lukas, Bühlmann, Peter

arXiv.org Machine Learning, Jun-12-2009

Assigning significance in high-dimensional regression is challenging. Most computationally efficient selection algorithms cannot guard against inclusion of noise variables. Asymptotically valid p-values are not available. An exception is a recent proposal by Wasserman and Roeder (2008) which splits the data into two parts. The number of variables is then reduced to a manageable size using the first split, while classical variable selection techniques can be applied to the remaining variables, using the data from the second split. This yields asymptotic error control under minimal conditions. It involves, however, a one-time random split of the data. Results are sensitive to this arbitrary choice: it amounts to a 'p-value lottery' and makes it difficult to reproduce results. Here, we show that inference across multiple random splits can be aggregated, while keeping asymptotic control over the inclusion of noise variables. We show that the resulting p-values can be used for control of both family-wise error (FWER) and false discovery rate (FDR). In addition, the proposed aggregation is shown to improve power while reducing the number of falsely selected variables substantially.
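The aggregation step described above can be sketched in a few lines. The abstract does not state the aggregation rule, so this sketch assumes one simple choice: for each variable, take twice the median of its per-split p-values, capped at 1. The function name `aggregate_pvalues`, the convention that a variable not selected in a split gets p-value 1.0, and the example numbers are all illustrative assumptions, and the screening and per-split testing steps are left out.

```python
from statistics import median


def aggregate_pvalues(pvalues_per_split):
    """Combine per-split p-values into one p-value per variable.

    pvalues_per_split[b][j] is the p-value of variable j in random split b.
    Assumed convention: a variable that was screened out in split b has
    p-value 1.0 there. Assumed rule: min(1, 2 * median over splits).
    """
    if not pvalues_per_split:
        raise ValueError("need at least one split")
    n_vars = len(pvalues_per_split[0])
    if any(len(row) != n_vars for row in pvalues_per_split):
        raise ValueError("every split must report one p-value per variable")

    aggregated = []
    for j in range(n_vars):
        column = [row[j] for row in pvalues_per_split]
        aggregated.append(min(1.0, 2.0 * median(column)))
    return aggregated


# Made-up p-values for two variables across three random splits.
splits = [
    [0.01, 0.80],
    [0.03, 1.00],
    [0.02, 0.40],
]
combined = aggregate_pvalues(splits)
```

Because the result depends on the median over many splits rather than on a single split, one unlucky split no longer decides whether a variable is reported as significant.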

multi-split method, procedure, single-split method, (13 more...)

arXiv.org Machine Learning

0811.2177

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Experimental Study (0.95)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Large-Margin kNN Classification Using a Deep Encoder Network

Min, Martin Renqiang, Stanley, David A., Yuan, Zineng, Bonner, Anthony, Zhang, Zhaolei

arXiv.org Artificial Intelligence, Jun-9-2009

kNN is one of the most popular classification methods, but it often fails to work well with an inappropriate choice of distance metric or in the presence of numerous class-irrelevant features. Linear feature transformation methods have been widely applied to extract class-relevant information to improve kNN classification, but such linear methods are very limited in many applications. Kernels have been used to learn powerful non-linear feature transformations, but these methods fail to scale to large datasets. In this paper, we present a scalable non-linear feature mapping method based on a deep neural network pretrained with restricted Boltzmann machines for improving kNN classification in a large-margin framework, which we call DNet-kNN. DNet-kNN can be used both for classification and for supervised dimensionality reduction. The experimental results on two benchmark handwritten digit datasets show that DNet-kNN has much better performance than large-margin kNN using a linear mapping and kNN based on a deep autoencoder pretrained with restricted Boltzmann machines.
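The effect of class-irrelevant features on kNN, and the role of a feature mapping, can be shown with a toy sketch. This is not DNet-kNN: the hand-written projection passed as `feature_map` is a hypothetical stand-in for the learned deep encoder, and the function name `knn_predict` and the data are made up for illustration.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3, feature_map=lambda v: v):
    """Majority vote among the k training points nearest to the query.

    Distances are squared Euclidean distances computed after applying
    feature_map to both the training points and the query.
    """
    mapped_query = feature_map(query)

    def sq_dist(point):
        mapped = feature_map(point)
        return sum((a - b) ** 2 for a, b in zip(mapped, mapped_query))

    order = sorted(range(len(train_points)), key=lambda i: sq_dist(train_points[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Feature 0 separates the classes; feature 1 is large-scale noise.
points = [(0.0, 10.0), (0.1, -10.0), (0.2, 0.0), (1.0, 9.0), (1.1, -9.0), (0.9, 1.0)]
labels = ["a", "a", "a", "b", "b", "b"]
query = (0.1, 9.0)

raw_prediction = knn_predict(points, labels, query, k=3)
mapped_prediction = knn_predict(points, labels, query, k=3,
                                feature_map=lambda v: (v[0],))
```

In the raw space the noisy second feature dominates the distances and the query is voted into class "b"; after the mapping discards that feature, its three nearest neighbours all belong to class "a".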

artificial intelligence, classification, machine learning, (14 more...)

arXiv.org Artificial Intelligence

0906.1814

Country: North America > Canada > Ontario > Toronto (0.30)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Exponential Family Graph Matching and Ranking

Petterson, James, Caetano, Tiberio, McAuley, Julian, Yu, Jin

arXiv.org Artificial Intelligence, Jun-4-2009

We present a method for learning max-weight matching predictors in bipartite graphs. The method consists of performing maximum a posteriori estimation in exponential families with sufficient statistics that encode permutations and data features. Although inference is in general hard, we show that for one very relevant application (web page ranking) exact inference is efficient. For general model instances, an appropriate sampler is readily available. Contrary to existing max-margin matching models, our approach is statistically consistent and, in addition, experiments with increasing sample sizes indicate superior improvement over such models. We apply the method to graph matching in computer vision as well as to a standard benchmark dataset for learning web page ranking, in which we obtain state-of-the-art results, in particular improving on max-margin variants. The drawback of this method with respect to max-margin alternatives is its runtime for large graphs, which is comparatively high.
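A max-weight matching predictor returns the permutation that maximises the total edge weight of a bipartite graph. The sketch below shows only that prediction step, by brute force over all permutations, which is feasible only for very small graphs. It assumes the edge weights have already been computed (the learning of those weights, which is the subject of the paper, is not shown), and the function name `max_weight_matching` and the weight matrix are made up for illustration.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Return (best_permutation, best_score) for a square weight matrix.

    weights[i][j] is the weight of matching left node i to right node j.
    best_permutation[i] is the right node assigned to left node i.
    Brute force over all n! permutations: illustration only.
    """
    n = len(weights)
    if any(len(row) != n for row in weights):
        raise ValueError("weights must be a square matrix")

    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        score = sum(weights[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score


weights = [
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
]
matching, score = max_weight_matching(weights)
```

For this matrix the best assignment pairs left nodes 0, 1, 2 with right nodes 0, 2, 1, for a total weight of 11. For graphs of realistic size, a polynomial-time assignment solver would replace the enumeration.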

artificial intelligence, inductive learning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

0904.2623

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Learning Nonlinear Dynamic Models

Langford, John, Salakhutdinov, Ruslan, Zhang, Tong

arXiv.org Artificial Intelligence, Jun-3-2009

We present a novel approach for learning nonlinear dynamic models, which leads to a new set of tools capable of solving problems that are otherwise difficult. We provide theory showing this new approach is consistent for models with long range structure, and apply the approach to motion capture and high-dimensional video data, yielding results superior to standard alternatives.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

0905.3369

Country:

North America > United States (0.46)
North America > Canada > Ontario (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.33)
