Structural Similarity and Distance in Learning

Wang, Joseph, Saligrama, Venkatesh, Castañón, David A.

Oct-26-2011–arXiv.org Machine Learning

The notion of distance, or more generally, similarity between observations, is at the root of most learning algorithms such as manifold learning, unsupervised clustering, semi-supervised learning, and anomaly detection. Most methods, at a basic level, are based on some function of Euclidean distance, such as the radial basis functions prevalent in supervised classification. Graphbased learning methods employ Euclidean distances to describe local neighborhoods for observations. K-nearest neighbors and ǫ-neighborhoods based on Euclidean distances are used in manifold learning (Isomap [1] and LLE [2]), spectral clustering [3], anomaly detection algorithms [4], [5] and in label propagation algorithms for semi-supervised learning [6]. In many cases, notably sparsely sampled sets of data, Euclidean neighborhoods are not sufficient to represent underlying structure. Consider data drawn from two independent structures. Ideally, a graph would capture the structure of the data with minimal connection between observations on separate structures.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Machine Learning

Oct-26-2011

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.15)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning (1.00)
  - Performance Analysis > Accuracy (0.99)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found