Consensus dimension reduction via multi-view learning
Dimension reduction methods are a fundamental class of techniques in data analysis that aim to find a lower-dimensional representation of higher-dimensional data while preserving as much of the original information as possible. These methods are used extensively in practice, including in exploratory data analysis to visualize data, arguably one of the first and most vital steps in any data analysis (Ray et al., 2021). Notably, in genomics, dimension reduction methods are ubiquitously applied to visualize high-dimensional single-cell RNA sequencing data in two dimensions (Becht et al., 2019). Beyond visualization, dimension reduction methods are also frequently employed to mitigate the curse of dimensionality (Bellman, 1957), engineer new features to improve downstream tasks like prediction (e.g., Massy, 1965), and enable scientific discovery in unsupervised learning settings (Chang et al., 2025). For example, many researchers have used dimension reduction in conjunction with clustering to discover new cell types and cell states (Wu et al., 2021), new cancer subtypes (Northcott et al., 2017), and other substantively meaningful structure in a variety of domains (Bergen et al., 2019; Traven et al., 2017). Given the widespread use of and need for dimension reduction, numerous techniques have been developed. Popular techniques include, but are not limited to, principal component analysis (PCA) (Pearson, 1901; Hotelling, 1933), multidimensional scaling (MDS) (Torgerson, 1952; Kruskal, 1964a), Isomap (Tenenbaum et al., 2000), locally linear embedding (LLE) (Roweis and Saul, 2000), t-distributed stochastic neighbor embedding (t-SNE) (van der
Low-dimensional embeddings of high-dimensional data
de Bodt, Cyril, Diaz-Papkovich, Alex, Bleher, Michael, Bunte, Kerstin, Coupette, Corinna, Damrich, Sebastian, Sanmartin, Enrique Fita, Hamprecht, Fred A., Horvát, Emőke-Ágnes, Kohli, Dhruv, Krishnaswamy, Smita, Lee, John A., Lelieveldt, Boudewijn P. F., McInnes, Leland, Nabney, Ian T., Noichl, Maximilian, Poličar, Pavlin G., Rieck, Bastian, Wolf, Guy, Mishne, Gal, Kobak, Dmitry
Large collections of high-dimensional data have become nearly ubiquitous across many academic fields and application domains, ranging from biology to the humanities. Since working directly with high-dimensional data poses challenges, the demand for algorithms that create low-dimensional representations, or embeddings, for data visualization, exploration, and analysis is now greater than ever. In recent years, numerous embedding algorithms have been developed, and their usage has become widespread in research and industry. This surge of interest has resulted in a large and fragmented research field that faces technical challenges alongside fundamental debates, and it has left practitioners without clear guidance on how to effectively employ existing methods. Aiming to increase coherence and facilitate future work, in this review we provide a detailed and critical overview of recent developments, derive a list of best practices for creating and using low-dimensional embeddings, evaluate popular approaches on a variety of datasets, and discuss the remaining challenges and open problems in the field.