AITopics

Most existing metric learning methods exploit pairwise constraints over the labeled data and frequently suffer from insufficient training examples. To learn a robust distance metric from few labeled examples, prior knowledge from unlabeled examples as well as the metrics previously derived from auxiliary data sets can be useful. In this paper, we propose to leverage such auxiliary knowledge to assist distance metric learning, which is formulated following the regularized loss minimization principle. Two algorithms are derived on the basis of manifold regularization and log-determinant divergence regularization, respectively; both can simultaneously exploit label information (i.e., the pairwise constraints over labeled data), unlabeled examples, and the metrics derived from auxiliary data sets. The proposed methods directly manipulate the auxiliary metrics and require no raw examples from the auxiliary data sets, which makes them efficient and flexible. We conduct extensive evaluations to compare our approaches with a number of competing approaches on a face recognition task. The experimental results show that our approaches can derive reliable distance metrics from limited training examples and thus are superior in terms of accuracy and labeling effort.
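For orientation, the two standard quantities this abstract relies on can be written as follows. This is a textbook-style sketch, not the paper's own formulation; the symbols $M$ (the learned positive definite metric), $M_0$ (an auxiliary metric), and $d$ (the feature dimension) are notation assumed here.

```latex
d_M(x, y) = \sqrt{(x - y)^\top M (x - y)}, \qquad
D_{\mathrm{ld}}(M, M_0) = \operatorname{tr}\!\left(M M_0^{-1}\right) - \log\det\!\left(M M_0^{-1}\right) - d
```

The divergence is zero exactly when $M = M_0$, which is why it can serve as a regularizer that keeps a learned metric close to an auxiliary one.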

distance metric, learning, metric learning

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > Michigan (0.04)
Asia > China > Anhui Province > Hefei (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

Yin, Jie (CSIRO ICT Centre) | Hu, Derek Hao (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)

Spatio-Temporal Event Detection Using Dynamic Conditional Random Fields

Event detection is a critical task in sensor networks for a variety of real-world applications. Many real-world events exhibit complex spatio-temporal patterns, manifesting themselves through observations that are close in time and space. These spatio-temporal events cannot be handled well by many of the previous approaches. In this paper, we propose a new Spatio-Temporal Event Detection (STED) algorithm for sensor networks based on a dynamic conditional random field (DCRF) model. Our STED method handles the uncertainty of sensor data explicitly and permits neighborhood interactions in both observations and event labels. Experiments on both real data and synthetic data demonstrate that our STED method can provide accurate event detection in near real time even for large-scale sensor networks.

algorithm, detection, event detection

Twenty-First International Joint Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.04)
Oceania > Australia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Materials > Metals & Mining (0.46)

Technology:

Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Multi-Relational Learning with Gaussian Processes

Xu, Zhao (Fraunhofer IAIS) | Kersting, Kristian (Fraunhofer IAIS) | Tresp, Volker (Siemens Corporate Technology)

Due to their flexible nonparametric nature, Gaussian process models are very effective at solving hard machine learning problems. While existing Gaussian process models focus on modeling one single relation, we present a generalized GP model, named multi-relational Gaussian process model, that is able to deal with an arbitrary number of relations in a domain of interest. The proposed model is analyzed in the context of bipartite, directed, and undirected univariate relations. Experimental results on real-world datasets show that exploiting the correlations among different entity types and relations can indeed improve prediction performance.

gaussian process, latent variable, relation

Twenty-First International Joint Conference on Artificial Intelligence

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Industry: Media > Film (0.30)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Discriminative Semi-Supervised Feature Selection via Manifold Regularization

Xu, Zenglin (The Chinese University of Hong Kong) | Jin, Rong (Michigan State University) | Lyu, Michael R. (The Chinese University of Hong Kong) | King, Irwin (The Chinese University of Hong Kong)

We consider the problem of semi-supervised feature selection, where we are given a small amount of labeled examples and a large amount of unlabeled examples. Since a small number of labeled samples are usually insufficient for identifying the relevant features, the critical problem arising from semi-supervised feature selection is how to take advantage of the information underneath the unlabeled data. To address this problem, we propose a novel discriminative semi-supervised feature selection method based on the idea of manifold regularization. The proposed method selects features through maximizing the classification margin.

Feature selection can be conducted in a supervised or unsupervised manner, in terms of whether the label information is utilized to guide the selection of relevant features. Generally, supervised feature selection methods require a large amount of labeled training data. They could, however, fail to identify the relevant features that are discriminative to different classes when the number of labeled samples is small. On the other hand, while unsupervised feature selection methods could work well with unlabeled training data, they ignore the label information and therefore are often unable to identify the discriminative features. Given the high cost of manually labeling data, and given that abundant unlabeled data are often easily accessible, it is desirable to develop feature selection methods that are capable of exploiting both labeled and unlabeled data.

feature selection, selection, selection method

Twenty-First International Joint Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Michigan > Ingham County > Lansing (0.04)

Industry: Government > Regional Government > North America Government > United States Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Vahdatpour, Alireza (University of California, Los Angeles) | Amini, Navid (University of California, Los Angeles) | Sarrafzadeh, Majid (University of California, Los Angeles)

Toward Unsupervised Activity Discovery Using Multi Dimensional Motif Detection in Time Series

This paper addresses the problem of activity and event discovery in multi dimensional time series data by proposing a novel method for locating multi dimensional motifs in time series. While recent work has addressed finding single dimensional and multi dimensional motifs in time series, we address motifs in the general case, where the elements of multi dimensional motifs have temporal, length, and frequency variations. The proposed method is validated on synthetic data, and an empirical evaluation has been carried out on several wearable systems used by real subjects.

algorithm, motif, time series

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Belgium (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Data Science > Data Mining (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Succinct Approximate Counting of Skewed Data

Talbot, David (Google Inc.)

Practical data analysis relies on the ability to count observations of objects succinctly and efficiently. Unfortunately, the space usage of an exact estimator grows with the size of the a priori set from which objects are drawn, while the time required to maintain such an estimator grows with the size of the data set. We present static and on-line approximation schemes that avoid these limitations when approximate frequency estimates are acceptable. Our Log-Frequency Sketch extends the approximate counting algorithm of Morris [Morris1978] to estimate frequencies with bounded relative error via a single pass over a data set. It uses constant space per object when the frequencies follow a power law and can be maintained in constant time per observation. We give an (epsilon, delta)-approximation scheme which we verify empirically on a large natural language data set where, for instance, 95 percent of frequencies are estimated with relative error less than 0.25 using fewer than 11 bits per object in the static case and 15 bits per object on-line.
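The Morris-style approximate counting that the abstract builds on can be sketched as follows. This is a minimal illustration of the classic single-counter idea only, not the Log-Frequency Sketch itself; the class name `MorrisCounter`, the default base of 2, and the injectable `rng` are assumptions made for this example.

```python
import random


class MorrisCounter:
    """Approximate counter that stores only a small exponent.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, base=2.0, rng=None):
        self.base = base
        self.exponent = 0
        self.rng = rng or random.Random()

    def increment(self):
        # Bump the exponent with probability base**(-exponent), so the
        # exponent grows roughly like the logarithm of the true count.
        if self.rng.random() < self.base ** (-self.exponent):
            self.exponent += 1

    def estimate(self):
        # Unbiased estimate of the number of increments seen so far.
        return (self.base ** self.exponent - 1.0) / (self.base - 1.0)
```

A single counter has high variance; a base closer to 1 trades extra bits of exponent for a tighter estimate.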

frequency, hash function, relative error

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Czechia > Prague (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Latent Variable Perceptron Algorithm for Structured Classification

Sun, Xu (University of Tokyo) | Matsuzaki, Takuya (University of Tokyo) | Okanohara, Daisuke (University of Tokyo) | Tsujii, Jun'ichi (University of Tokyo)

We propose a perceptron-style algorithm for fast discriminative training of structured latent variable models. This method extends the perceptron algorithm to learning with latent dependencies, as an alternative to existing probabilistic latent variable models. It relies on Viterbi decoding over latent variables, combined with simple additive updates. Its training cost is significantly lower than that of probabilistic latent variable models, while it gives comparable or even superior classification accuracy on our tasks. Experiments on natural language processing problems demonstrate that its results are among the good results reported on the corresponding data sets.

latent perceptron, latent variable, perceptron

Twenty-First International Joint Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

On the Equivalence Between Canonical Correlation Analysis and Orthonormalized Partial Least Squares

Sun, Liang (Arizona State University) | Ji, Shuiwang (Arizona State University) | Yu, Shipeng (Siemens Medical Solutions USA, Inc.) | Ye, Jieping (Arizona State University)

Canonical correlation analysis (CCA) and partial least squares (PLS) are well-known techniques for feature extraction from two sets of multi-dimensional variables. The fundamental difference between CCA and PLS is that CCA maximizes the correlation while PLS maximizes the covariance. Although both CCA and PLS have been applied successfully in various applications, the intrinsic relationship between them remains unclear. In this paper, we attempt to address this issue by showing the equivalence relationship between CCA and orthonormalized partial least squares (OPLS), a variant of PLS. We further extend the equivalence relationship to the case when regularization is employed for both sets of variables. In addition, we show that the CCA projection for one set of variables is independent of the regularization on the other set of variables. We have performed experimental studies using both synthetic and real data sets and our results confirm the established equivalence relationship. The presented analysis provides novel insights into the connection between these two existing algorithms as well as the effect of the regularization.
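A textbook way to state the contrast for the first pair of projection directions is sketched below; the symbols $C_{xx}$, $C_{yy}$ (within-set covariance matrices), $C_{xy}$ (cross-covariance matrix), and $w_x$, $w_y$ (projection vectors) are notation assumed here rather than taken from the paper.

```latex
\text{CCA:}\quad \max_{w_x, w_y} \frac{w_x^\top C_{xy} w_y}{\sqrt{w_x^\top C_{xx} w_x}\,\sqrt{w_y^\top C_{yy} w_y}}
\qquad
\text{PLS:}\quad \max_{\|w_x\| = \|w_y\| = 1} w_x^\top C_{xy} w_y
```

The only difference is the normalization: CCA divides by the projected standard deviations, while PLS constrains only the lengths of the projection vectors.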

cca and opls, equivalence relationship, regularization

Twenty-First International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Arizona (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Sprague, Nathan (Kalamazoo College)

Predictive Projections

... policies in very high dimensional state spaces. We propose a linear dimensionality reduction algorithm that discovers predictive projections: projections in which accurate predictions of future states can be made using simple nearest neighbor style learning. The goal of this work is to extend the reach of existing reinforcement learning algorithms to domains where they would otherwise be inapplicable.

These existing algorithms discover projections of the training data under which nearby points are likely to have the same class label or similar regression targets. The algorithm described in this paper makes use of the same machinery but attempts to find low-dimensional projections under which current state vectors accurately predict future states in the projected space. The intuition is that projections which capture the state dynamics in this way are likely to contain information that will be useful for control.

algorithm, projection, projection algorithm

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > Michigan > Kalamazoo County > Kalamazoo (0.04)
North America > Puerto Rico (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Rai, Piyush (School of Computing, University of Utah) | Daume, Hal (School of Computing, University of Utah) | Venkatasubramanian, Suresh (School of Computing, University of Utah)

Streamed Learning: One-Pass SVMs

We present a streaming model for large-scale classification (in the context of ℓ2-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The ℓ2-SVM is known to have an equivalent formulation in terms of the minimum enclosing ball (MEB) problem, and an efficient algorithm based on the idea of core sets exists (CVM) [Tsang et al., 2005]. CVM learns a (1 + ε)-approximate MEB for a set of points and yields an approximate solution to the corresponding SVM instance. However, CVM works in batch mode, requiring multiple passes over the data. This paper presents a single-pass SVM which is based on the minimum enclosing ball of streaming data. We show that the MEB updates for the streaming case can be easily adapted to learn the SVM weight vector in a way similar to using online stochastic gradient updates. Our algorithm performs polylogarithmic computation at each example, and requires very small and constant storage. Experimental results show that, even in such restrictive settings, we can learn efficiently in just one pass and get accuracies comparable to other state-of-the-art SVM solvers (batch and online). We also give an analysis of the algorithm, and discuss some open issues and possible extensions.
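The single-pass idea can be illustrated with a generic streaming minimum enclosing ball update in plain Euclidean space. This is a sketch of a well-known one-pass MEB heuristic, not the paper's SVM algorithm (which the abstract describes only at a high level); the function name `stream_meb` and the tuple-based point representation are assumptions made for this example.

```python
import math


def stream_meb(points):
    """One-pass approximate minimum enclosing ball (illustrative sketch).

    Keeps only a center and a radius. When a point falls outside the
    current ball, the ball grows just enough to cover both the old ball
    and the new point.
    """
    it = iter(points)
    try:
        center = list(next(it))
    except StopIteration:
        raise ValueError("need at least one point")
    radius = 0.0
    for p in it:
        d = math.dist(center, p)
        if d > radius:
            # Move the center toward p by half of the overshoot and
            # enlarge the radius by the same amount.
            shift = (d - radius) / (2.0 * d)
            center = [c + shift * (x - c) for c, x in zip(center, p)]
            radius = (radius + d) / 2.0
    return center, radius
```

Each point is examined once and then discarded, so storage stays constant in the number of points; the price is that the resulting ball is only an approximation of the optimal one.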

algorithm, lookahead, streamsvm

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Utah (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.87)