Most manifold learning methods use a single similarity matrix to induce a low-dimensional manifold embedded in the data space. In practice, however, several sensors are often used at once, and each sensor yields a different similarity matrix derived from the same objects. In such cases manifold integration is desirable: it combines these similarity matrices into a compromise matrix that faithfully reflects the information from all sensors. Only a few methods exist for manifold integration, including a method based on reproducing kernel Krein space (RKKS) and DISTATIS; the former is restricted to two manifolds, and the latter takes a linear combination of normalized similarity matrices as the compromise matrix. In this paper we present a new manifold integration method, Markov random walk on multiple manifolds (RAMS), which integrates transition probabilities defined on each manifold to compute a compromise matrix. Numerical experiments confirm that RAMS finds more informative manifolds with a desirable projection property.
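To make the idea of transition probabilities on each manifold concrete, the following is a minimal sketch and not the RAMS algorithm itself. The abstract does not state how the per-manifold transition probabilities are integrated, so the combination rule used here (a walk that takes one step on each manifold in turn, i.e. a product of transition matrices) is an assumption made purely for illustration. The function names and the toy matrices `S1` and `S2` are likewise hypothetical.

```python
def row_normalize(similarity):
    """Turn a non-negative similarity matrix into a row-stochastic transition matrix."""
    transition = []
    for row in similarity:
        total = sum(row)
        if total <= 0:
            raise ValueError("each row needs a positive total similarity")
        transition.append([value / total for value in row])
    return transition


def matmul(a, b):
    """Plain matrix product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def chained_walk(similarities):
    """ASSUMED combination rule: take one random-walk step on each manifold in turn."""
    transitions = [row_normalize(s) for s in similarities]
    combined = transitions[0]
    for step in transitions[1:]:
        combined = matmul(combined, step)
    return combined


# Two made-up similarity matrices over the same three objects,
# standing in for two different sensors.
S1 = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]]
S2 = [[1.0, 0.2, 0.7], [0.2, 1.0, 0.2], [0.7, 0.2, 1.0]]
P = chained_walk([S1, S2])
```

Because a product of row-stochastic matrices is again row-stochastic, each row of `P` can still be read as a probability distribution over the objects, whichever combination rule the paper actually uses.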
In machine learning, dimensionality reduction is an important task. It mitigates the undesirable properties of high-dimensional spaces and thereby facilitates classification, compression, and visualization of high-dimensional data. During the last decade, researchers have proposed many new (non-linear) techniques for dimensionality reduction. Most of these techniques are based on the intuition that data lie on or near a complex low-dimensional manifold embedded in the high-dimensional space, and they aim to identify and extract that manifold. Isomap is one of the widely used low-dimensional embedding methods; it combines geodesic distances on a weighted graph with classical scaling (metric multidimensional scaling). Isomap chooses nearest neighbours based on distance alone, which causes bridges and topological instability. In this paper, we propose a new algorithm for choosing the nearest neighbours that reduces the number of short-circuit errors and hence improves topological stability. The algorithm rests on the observation that, at any point on the manifold, the point and its nearest neighbours form a vector subspace, and the orthogonal to that subspace is orthogonal to every vector that spans it. The proposed algorithm uses the point itself and its two nearest neighbours to find the bases of the subspace and the orthogonal to that subspace, which belongs to the orthogonal complementary subspace. It then adds new points to the two nearest neighbours based on the distance and the angle between each new point and the orthogonal to the subspace. The superior performance of the new algorithm in choosing the nearest neighbours is confirmed through experimental work with several datasets.
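The neighbour-selection idea described above can be sketched for the special case of points in three dimensions, where the orthogonal to the subspace spanned by a point and its two nearest neighbours is simply a cross product. This is an illustrative sketch under stated assumptions, not the authors' implementation: the restriction to 3-D, the function `select_neighbours`, and the thresholds `max_dist` and `max_off_plane_deg` are all hypothetical choices.

```python
import math


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def select_neighbours(points, index, max_dist, max_off_plane_deg):
    """Pick neighbours of points[index] using distance AND angle to the local normal."""
    x = points[index]
    others = sorted((i for i in range(len(points)) if i != index),
                    key=lambda i: norm(sub(points[i], x)))
    seeds = others[:2]  # the two nearest neighbours span the local subspace
    normal = cross(sub(points[seeds[0]], x), sub(points[seeds[1]], x))
    if norm(normal) == 0:
        raise ValueError("the two nearest neighbours are collinear with the point")
    chosen = list(seeds)
    for i in others[2:]:
        v = sub(points[i], x)
        d = norm(v)
        if d > max_dist:
            break  # candidates are sorted by distance, so the rest are farther
        cos_angle = max(-1.0, min(1.0, dot(v, normal) / (d * norm(normal))))
        # An angle of 90 degrees to the normal means the candidate lies in the plane.
        off_plane = abs(90.0 - math.degrees(math.acos(cos_angle)))
        if off_plane <= max_off_plane_deg:
            chosen.append(i)
    return chosen


# Four points on the plane z = 0 plus one point hovering above it, which a
# purely distance-based rule would accept before the in-plane point (1, 1, 0).
pts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0.2, 0.2, 1.2)]
neighbours = select_neighbours(pts, 0, max_dist=1.5, max_off_plane_deg=20.0)
```

In this toy example the off-plane point is closer to the query point than `(1, 1, 0)` but makes a small angle with the normal, so the angle test rejects it. This is the kind of short-circuit edge the abstract says the method is designed to avoid.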
Most sketch recognition systems are accurate in recognizing either text or shape (graphic) ink strokes, but not both. Distinguishing between shape and text strokes is therefore a critical task in recognizing hand-drawn digital ink diagrams, which commonly contain many text labels and annotations. We have found the ‘entropy rate’ to be an accurate classification criterion: it is significantly higher for text strokes than for shape strokes and can serve as a distinguishing factor between the two. Using entropy values, our system achieved a correct classification rate of 92.06% on test data from the diagrammatic domain on which the threshold was trained. It also performed favorably on data for which no training examples at all were supplied.
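A threshold on an entropy measure, as described above, might be sketched as follows. The abstract does not say how the entropy rate is computed, so this sketch substitutes a simple stand-in: the Shannon entropy of quantized turning angles along a stroke. The bin count, the threshold of 1.0 bit, and all function names are hypothetical, and the example strokes are invented.

```python
import math
from collections import Counter


def turning_symbols(points, bins=8):
    """Quantize the turning angle at each interior point into one of `bins` symbols."""
    symbols = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        turn = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi  # wrapped to [-pi, pi)
        symbols.append(int((turn + math.pi) / (2 * math.pi) * bins) % bins)
    return symbols


def turn_entropy(points, bins=8):
    """Shannon entropy (in bits) of the turning-angle symbols; a stand-in measure."""
    symbols = turning_symbols(points, bins)
    n = len(symbols)
    if n == 0:
        return 0.0
    counts = Counter(symbols)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def classify(points, threshold=1.0):
    """ASSUMED rule: strokes whose entropy exceeds the threshold are labelled text."""
    return "text" if turn_entropy(points) > threshold else "shape"


# A straight line (shape-like) and a jagged scribble (text-like), both made up.
line = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
scribble = [(0, 0), (1, 3), (2, 0), (4, 1), (3, -1), (6, -1), (5, 2), (7, 2)]
labels = (classify(line), classify(scribble))
```

In practice the threshold would be learned from labelled strokes, as the abstract indicates, rather than fixed by hand as it is here.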