We thank the reviewers for their constructive feedback and address their comments below. In this paper, we focus on models with low memory budgets. Empirically, we also observe that edge probabilities converge to 0 or 1. Yes, in our model edge indicators are independent random variables. Furthermore, PRODIGE is a general method that works for a variety of tasks. If accepted, we will include a more detailed comparison of the two methods with an explanation.
Cluster and then Embed: A Modular Approach for Visualization
Coda, Elizabeth, Arias-Castro, Ery, Mishne, Gal
Dimensionality reduction methods such as t-SNE and UMAP are popular methods for visualizing data with a potential (latent) clustered structure. They are known to group data points at the same time as they embed them, resulting in visualizations with well-separated clusters that preserve local information well. However, t-SNE and UMAP also tend to distort the global geometry of the underlying data. We propose a more transparent, modular approach consisting of first clustering the data, then embedding each cluster, and finally aligning the clusters to obtain a global embedding. We demonstrate this approach on several synthetic and real-world datasets and show that it is competitive with existing methods, while being much more transparent.
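The three-stage pipeline described above (cluster, embed each cluster, align) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: KMeans and PCA stand in for whichever clustering and embedding methods are actually used, and the alignment step here is a simple translation that matches each cluster's centroid in a shared global PCA space.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Toy data: two Gaussian blobs in 10 dimensions.
X = np.vstack([rng.normal(0, 1, (100, 10)),
               rng.normal(5, 1, (100, 10))])

# Step 1: cluster the data.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Step 2: embed each cluster separately (PCA stands in for any embedder).
embeddings = np.zeros((X.shape[0], 2))
for k in np.unique(labels):
    mask = labels == k
    embeddings[mask] = PCA(n_components=2).fit_transform(X[mask])

# Step 3: align the per-cluster embeddings, here by translating each
# cluster so its centroid matches that cluster's centroid in a shared
# global PCA projection.
global_2d = PCA(n_components=2).fit_transform(X)
for k in np.unique(labels):
    mask = labels == k
    embeddings[mask] += global_2d[mask].mean(0) - embeddings[mask].mean(0)
```

The modularity is the point: any clustering method, embedder, or alignment rule can be swapped in at the corresponding step.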
Matrix factorisation and the interpretation of geodesic distance
Given a graph or similarity matrix, we consider the problem of recovering a notion of true distance between the nodes, and so their true positions. We show that this can be accomplished in two steps: matrix factorisation, followed by nonlinear dimension reduction. This combination is effective because the point cloud obtained in the first step lives close to a manifold in which latent distance is encoded as geodesic distance.
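The two-step recipe (matrix factorisation, then nonlinear dimension reduction) might be sketched roughly as below. The specific choices here are illustrative assumptions, not the paper's exact construction: an eigendecomposition serves as the factorisation, Isomap as the nonlinear step, and the latent positions live on a toy circle.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)
# Latent positions on a circle; similarity decays with latent distance.
theta = rng.uniform(0, 2 * np.pi, 200)
Z = np.c_[np.cos(theta), np.sin(theta)]
D_latent = np.linalg.norm(Z[:, None] - Z[None, :], axis=-1)
A = np.exp(-D_latent)  # observed similarity matrix

# Step 1: matrix factorisation via a rank-d spectral decomposition.
d = 10
vals, vecs = np.linalg.eigh(A)
top = np.argsort(np.abs(vals))[::-1][:d]
X_spec = vecs[:, top] * np.sqrt(np.abs(vals[top]))  # point cloud near a manifold

# Step 2: nonlinear dimension reduction; geodesic distance along the
# manifold from step 1 is the proxy for latent distance.
X_hat = Isomap(n_neighbors=10, n_components=2).fit_transform(X_spec)
```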
We thank the reviewers for their positive feedback on our paper. Rev 2 and 5 asked about the difference between LL-LVM and GP-LVM: as noted in lines 65-68 of the introduction, both are Gaussian models, but GP-LVM is parameterised by a stationary covariance function rather than by a local, neighbour-defined precision matrix that seeks to explicitly preserve local neighbourhood properties. Rev 1: We will add the complexity analysis. An expensive step in the E-step is the inversion of the posterior precision matrices of both x and C. We will study possible ways to decrease the complexity in future work.
Reviews: Manifold denoising by Nonlinear Robust Principal Component Analysis
After rebuttal: I would like to thank the authors for addressing the comments. Their responses clarified some of my questions and gave me a better understanding of the paper, so I am glad to increase my score. I am also glad that the authors have decided to add the derivations in Sect. There are a few other things that I hope the authors will address in the final version of the paper: 1) the limitations of the method, 2) a short introduction to RPCA, and 3) the related work mentioned in the initial review. The discussion on kNN vs. \epsNN stability from the authors' response is very helpful and would be worth adding to the paper; otherwise it is still not clear why the method would not use all the points in \epsNN (within radius r1).
Reviews: Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs
Except that the presence of each edge is probabilistic rather than deterministic, the core idea is quite similar to Isomap. The novelty should be better positioned by comparing against Isomap. Moreover, the independence assumption is questionable: for example, edges between words that frequently co-occur in the same contexts are not independent of each other, and edges between pixels in small coherent regions are not independent. Do we eventually need to know such dependency structures a priori to correctly represent arbitrary geometry in the data?
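To make the model under discussion concrete: each edge is an independent Bernoulli variable, and distances are shortest paths over the sampled graph. The Monte Carlo sketch below uses random stand-in weights and edge probabilities (not a trained model) to estimate the expected distance between two nodes.

```python
import numpy as np

def floyd_warshall(graph):
    """All-pairs shortest paths on a dense weight matrix (np.inf = no edge)."""
    d = graph.copy()
    for k in range(d.shape[0]):
        d = np.minimum(d, d[:, [k]] + d[[k], :])
    return d

rng = np.random.default_rng(0)
n = 6
# Stand-in learned parameters: symmetric edge weights and edge probabilities.
weights = rng.uniform(0.5, 2.0, size=(n, n)); weights = (weights + weights.T) / 2
probs = rng.uniform(size=(n, n)); probs = (probs + probs.T) / 2

# Expected distance between nodes 0 and 1 under independent edge sampling.
samples = []
for _ in range(200):
    present = np.triu(rng.uniform(size=(n, n)) < probs, 1)
    present = present | present.T
    graph = np.where(present, weights, np.inf)
    np.fill_diagonal(graph, 0.0)
    samples.append(floyd_warshall(graph)[0, 1])
finite = [s for s in samples if np.isfinite(s)]
expected_dist = float(np.mean(finite))
```

Introducing correlated edge indicators (the reviewer's concern) would change the sampling step, but not the shortest-path computation.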
A dimensionality reduction technique based on the Gromov-Wasserstein distance
Eufrazio, Rafael P., Montesuma, Eduardo Fernandes, Cavalcante, Charles C.
Analyzing relationships between objects is a pivotal problem in data science. In this context, dimensionality reduction (DR) techniques are employed to generate smaller, more manageable data representations. This paper proposes a new DR method based on optimal transportation theory and the Gromov-Wasserstein distance. We offer a new probabilistic view of the classical Multidimensional Scaling (MDS) algorithm and of Isomap (Isometric Mapping, or Isometric Feature Mapping), the nonlinear dimensionality reduction algorithm that extends classical MDS, in which we use the Gromov-Wasserstein distance between the probability measure of the high-dimensional data and that of its low-dimensional representation. Through gradient descent, our method embeds high-dimensional data into a lower-dimensional space, providing a robust and efficient solution for analyzing complex high-dimensional datasets.
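As a rough illustration of the embed-by-gradient-descent idea, the sketch below minimizes a stress-like mismatch between the high-dimensional and low-dimensional pairwise distance matrices. This is a deliberately simplified surrogate (uniform weights, fixed identity coupling), not the paper's actual Gromov-Wasserstein objective.

```python
import numpy as np

def pairwise(Y):
    """Euclidean distance matrix of the rows of Y."""
    return np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)

def embed_by_gradient_descent(D, dim=2, steps=500, lr=0.05, seed=0):
    """Plain gradient descent on (1/n^2) * sum_ij (d_ij - D_ij)^2."""
    n = D.shape[0]
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=0.1, size=(n, dim))
    for _ in range(steps):
        diff = Y[:, None] - Y[None, :]        # (n, n, dim), diff[i,j] = Y_i - Y_j
        d = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(d, 1.0)               # avoid division by zero
        coeff = (d - D) / d
        np.fill_diagonal(coeff, 0.0)
        grad = 4.0 * (coeff[:, :, None] * diff).sum(axis=1) / (n * n)
        Y -= lr * grad
    return Y

rng = np.random.default_rng(0)
X_high = rng.normal(size=(50, 20))  # high-dimensional data
D = pairwise(X_high)                 # target distances
Y = embed_by_gradient_descent(D)     # 2-D embedding
```

The Gromov-Wasserstein version additionally optimizes a coupling between the two measures rather than fixing the point-to-point correspondence.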
Investigating Privacy Leakage in Dimensionality Reduction Methods via Reconstruction Attack
Lumbut, Chayadon, Ponnoprat, Donlapark
Machine Learning (ML) models have become essential tools for solving complex real-world problems across various domains, including image processing, natural language processing, and business analytics. However, learning from high-dimensional data can be difficult due to the curse of dimensionality and increased computational requirements. To address these issues, dimensionality reduction methods are employed to reduce training costs and improve training efficiency. Popular dimensionality reduction methods include principal component analysis, t-SNE [vdMH08], and UMAP [MHM18]. These methods aim to reduce data dimensions while preserving global and local properties of the original data, ensuring that relationships between data points in higher dimensions are still reflected in lower-dimensional representations. The information retained from the original data is crucial for effective data analysis and visualization.
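To illustrate the kind of leakage at stake in the simplest case: if an attacker learns a PCA model's components and mean, released low-dimensional codes map straight back to near-exact copies of the original records. This toy sketch uses synthetic low-rank data and scikit-learn's `inverse_transform`; it is a baseline illustration, not the paper's attack.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Stand-in "private" data: 200 records in 50 dimensions with rank-5 structure.
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

# The data holder releases only the 5-dimensional representation.
pca = PCA(n_components=5).fit(X)
X_low = pca.transform(X)

# An attacker with the PCA parameters inverts the linear map and
# recovers the original records almost exactly.
X_rec = pca.inverse_transform(X_low)
err = np.linalg.norm(X - X_rec) / np.linalg.norm(X)  # relative error
```

Nonlinear methods such as t-SNE and UMAP have no closed-form inverse, which is precisely why reconstruction attacks against them require a learned attack model.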