AITopics

0902.1970

Genre: Research Report > Experimental Study (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

McWilliams, Brian, Montana, Giovanni

Sparse partial least squares for on-line variable selection in multivariate data streams

arXiv.org Machine LearningFeb-8-2009

In this paper we propose a computationally efficient algorithm for on-line variable selection in multivariate regression problems involving high dimensional data streams. The algorithm recursively extracts all the latent factors of a partial least squares solution and selects the most important variables for each factor. This is achieved by means of only one sparse singular value decomposition which can be efficiently updated on-line and in an adaptive fashion. Simulation results based on artificial data streams demonstrate that the algorithm is able to select important variables in dynamic settings where the correlation structure among the observed streams is governed by a few hidden components and the importance of each variable changes over time. We also report on an application of our algorithm to a multivariate version of the "enhanced index tracking" problem using financial data streams. The application consists of performing on-line asset allocation with the objective of overperforming two benchmark indices simultaneously.

algorithm, artificial intelligence, machine learning, (16 more...)

0902.1323

Country: Europe (0.28)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Maeno, Yoshiharu, Ohsawa, Yukio

Reflective visualization and verbalization of unconscious preference

arXiv.org Artificial IntelligenceFeb-2-2009

A new method is presented, that can help a person become aware of his or her unconscious preferences, and convey them to others in the form of verbal explanation. The method combines the concepts of reflection, visualization, and verbalization. The method was tested in an experiment where the unconscious preferences of the subjects for various artworks were investigated. In the experiment, two lessons were learned. The first is that it helps the subjects become aware of their unconscious preferences to verbalize weak preferences as compared with strong preferences through discussion over preference diagrams. The second is that it is effective to introduce an adjustable factor into visualization to adapt to the differences in the subjects and to foster their mutual understanding.

artificial intelligence, data mining, machine learning, (19 more...)

doi: 10.1504/IJAIP.2010.030531

0803.4074

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)

Genre: Research Report (0.40)

Industry: Education (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Data Science > Data Mining (0.68)

arXiv.org Machine LearningJan-29-2009

Separating populations with wide data: A spectral analysis

Blum, Avrim, Coja-Oghlan, Amin, Frieze, Alan, Zhou, Shuheng

In this paper, we consider the problem of partitioning a small data sample drawn from a mixture of $k$ product distributions. We are interested in the case that individual features are of low average quality $\gamma$, and we want to use as few of them as possible to correctly partition the sample. We analyze a spectral technique that is able to approximately optimize the total data size--the product of number of data points $n$ and the number of features $K$--needed to correctly perform this partitioning as a function of $1/\gamma$ for $K>n$. Our goal is motivated by an application in clustering individuals according to their population of origin using markers, when the divergence between any two of the populations is small.

artificial intelligence, machine learning, separating population, (17 more...)

doi: 10.1214/08-EJS289

0706.3434

Country:

Europe (0.45)
North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Fournier-Viger, P., Nkambou, R., Nguifo, E. Mephu

A Knowledge Discovery Framework for Learning Task Models from User Interactions in Intelligent Tutoring Systems

arXiv.org Artificial IntelligenceJan-29-2009

Domain experts should provide relevant domain knowledge to an Intelligent Tutoring System (ITS) so that it can guide a learner during problemsolving learning activities. However, for many ill-defined domains, the domain knowledge is hard to define explicitly. In previous works, we showed how sequential pattern mining can be used to extract a partial problem space from logged user interactions, and how it can support tutoring services during problem-solving exercises. This article describes an extension of this approach to extract a problem space that is richer and more adapted for supporting tutoring services. We combined sequential pattern mining with (1) dimensional pattern mining (2) time intervals, (3) the automatic clustering of valued actions and (4) closed sequences mining. Some tutoring services have been implemented and an experiment has been conducted in a tutoring system.

artificial intelligence, machine learning, pattern recognition, (17 more...)

doi: 10.1007/978-3-540-88636-5

0901.4761

Country: North America > Canada (0.28)

Genre: Research Report (0.64)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Chatpatanasiri, Ratthachat, Korsrilabutr, Teesid, Tangchanachaianan, Pasakorn, Kijsirikul, Boonserm

On Kernelization of Supervised Mahalanobis Distance Learners

arXiv.org Artificial IntelligenceJan-29-2009

This paper focuses on the problem of kernelizing an existing supervised Mahalanobis distance learner. The following features are included in the paper. Firstly, three popular learners, namely, "neighborhood component analysis", "large margin nearest neighbors" and "discriminant neighborhood embedding", which do not have kernel versions are kernelized in order to improve their classification performances. Secondly, an alternative kernelization framework called "KPCA trick" is presented. Implementing a learner in the new framework gains several advantages over the standard framework, e.g. no mathematical formulas and no reprogramming are required for a kernel implementation, the framework avoids troublesome problems such as singularity, etc. Thirdly, while the truths of representer theorems are just assumptions in previous papers related to ours, here, representer theorems are formally proven. The proofs validate both the kernel trick and the KPCA trick in the context of Mahalanobis distance learning. Fourthly, unlike previous works which always apply brute force methods to select a kernel, we investigate two approaches which can be efficiently adopted to construct an appropriate kernel for a given dataset. Finally, numerical results on various real-world datasets are presented.

artificial intelligence, machine learning, theorem, (18 more...)

0804.1441

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)

Journal of Artificial Intelligence ResearchJan-23-2009

Interactive Policy Learning through Confidence-Based Autonomy

Chernova, S., Veloso, M.

The CBA algorithm consists of two components which take advantage of the complimentary abilities of humans and computer agents. The first component, Confident Execution, enables the agent to identify states in which demonstration is required, to request a demonstration from the human teacher and to learn a policy based on the acquired data. The algorithm selects demonstrations based on a measure of action selection confidence, and our results show that using Confident Execution the agent requires fewer demonstrations to learn the policy than when demonstrations are selected by a human teacher. The second algorithmic component, Corrective Demonstration, enables the teacher to correct any mistakes made by the agent through additional demonstrations in order to improve the policy and future task performance. CBA and its individual components are compared and evaluated in a complex simulated driving domain.

agent, algorithm, demonstration, (12 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2584

AI Access Foundation

10584

Journal of Artificial Intelligence Research

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.54)
Overview (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

arXiv.org Artificial IntelligenceJan-22-2009

Learning Low-Density Separators

Ben-David, Shai, Lu, Tyler, Pal, David, Sotakova, Miroslava

We define a novel, basic, unsupervised learning problem - learning the lowest density homogeneous hyperplane separator of an unknown probability distribution. This task is relevant to several problems in machine learning, such as semi-supervised learning and clustering stability. We investigate the question of existence of a universally consistent algorithm for this problem. We propose two natural learning paradigms and prove that, on input unlabeled random samples generated by any member of a rich family of distributions, they are guaranteed to converge to the optimal separator for that distribution. We complement this result by showing that no learning algorithm for our task can achieve uniform learning rates (that are independent of the data generating distribution).

algorithm, artificial intelligence, machine learning, (18 more...)

0805.2891

Country:

Europe (0.28)
North America (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Machine LearningJan-21-2009

Model-Consistent Sparse Estimation through the Bootstrap

Bach, Francis

We consider the least-square linear regression problem with regularization by the $\ell^1$-norm, a problem usually referred to as the Lasso. In this paper, we first present a detailed asymptotic analysis of model consistency of the Lasso in low-dimensional settings. For various decays of the regularization parameter, we compute asymptotic equivalents of the probability of correct model selection. For a specific rate decay, we show that the Lasso selects all the variables that should enter the model with probability tending to one exponentially fast, while it selects all other variables with strictly positive probability. We show that this property implies that if we run the Lasso for several bootstrapped replications of a given sample, then intersecting the supports of the Lasso bootstrap estimates leads to consistent model selection. This novel variable selection procedure, referred to as the Bolasso, is extended to high-dimensional settings by a provably consistent two-step procedure.

artificial intelligence, machine learning, probability, (16 more...)

0901.3202

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Chen, Guangliang, Lerman, Gilad

Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling

arXiv.org Machine LearningJan-14-2009

The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem, however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem, and provides careful analysis to justify it. The TSCC algorithm is practically a combination of Govindu's multi-way spectral clustering framework (CVPR 2005) and Ng et al.'s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments well the different underlying clusters. The goodness of clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001).

artificial intelligence, equation, machine learning, (18 more...)

doi: 10.1007/s10208-009-9043-7

0810.3724

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)