AITopics | dimension reduction technique

dimension reduction technique

Dimensionally Reduced Open-World Clustering: DROWCULA

Ozbey, Erencem, Diochnos, Dimitrios I.

arXiv.org Artificial Intelligence, Sep-10-2025

Working with annotated data is the cornerstone of supervised learning. Nevertheless, providing labels to instances is a task that requires significant human effort. Several critical real-world applications make things more complicated because, no matter how many labels have been identified in a task of interest, examples corresponding to novel classes may appear in the future. Not surprisingly, prior work in this so-called 'open-world' context has focused largely on semi-supervised approaches. Focusing on image classification, and somewhat paradoxically, we propose a fully unsupervised approach to the problem of determining the novel categories in a particular dataset. Our approach relies on estimating the number of clusters using Vision Transformers, which utilize attention mechanisms to generate vector embeddings. Furthermore, we incorporate manifold learning techniques to refine these embeddings by exploiting the intrinsic geometry of the data, thereby enhancing the overall image clustering performance. Overall, we establish new state-of-the-art results on single-modal clustering and Novel Class Discovery on CIFAR-10, CIFAR-100, ImageNet-100, and Tiny ImageNet, both when the number of clusters is known ahead of time and when it is not.

algorithm, artificial intelligence, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2509.07184

Country: North America > United States > Oklahoma (0.28)

Genre: Research Report (0.82)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

EmbedOR: Provable Cluster-Preserving Visualizations with Curvature-Based Stochastic Neighbor Embeddings

Saidi, Tristan Luca, Hickok, Abigail, Rieck, Bastian, Blumberg, Andrew J.

arXiv.org Artificial Intelligence, Sep-5-2025

Stochastic Neighbor Embedding (SNE) algorithms like UMAP and tSNE often produce visualizations that do not preserve the geometry of noisy and high dimensional data. In particular, they can spuriously separate connected components of the underlying data submanifold and can fail to find clusters in well-clusterable data. To address these limitations, we propose EmbedOR, an SNE algorithm that incorporates discrete graph curvature. Our algorithm stochastically embeds the data using a curvature-enhanced distance metric that emphasizes underlying cluster structure. Critically, we prove that the EmbedOR distance metric extends consistency results for tSNE to a much broader class of datasets. We also describe extensive experiments on synthetic and real data that demonstrate the visualization and geometry-preservation capabilities of EmbedOR. We find that, unlike other SNE algorithms, including UMAP, EmbedOR is much less likely to fragment continuous, high-density regions of the data. Finally, we demonstrate that the EmbedOR distance metric can be used as a tool to annotate existing visualizations to identify fragmentation and provide deeper insight into the underlying geometry of the data.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.03703

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.67)

Forward-Cooperation-Backward (FCB) learning in a Multi-Encoding Uni-Decoding neural network architecture

Dutta, Prasun, Ghosh, Koustab, De, Rajat K.

arXiv.org Artificial Intelligence, Feb-27-2025

The most popular technique to train a neural network is backpropagation. Recently, the Forward-Forward technique has also been introduced for certain learning tasks. However, in real life, human learning does not follow either of these techniques exclusively. The way a human learns is essentially a combination of forward learning, backward propagation and cooperation. Humans start learning a new concept by themselves and try to refine their understanding hierarchically, during which they might come across several doubts. The most common approach to resolving doubts is a discussion with peers, which can be called cooperation. Cooperation, discussion, and knowledge sharing among peers form one of the most important steps of learning that humans follow. However, there might still be a few doubts even after the discussion. Then the difference between the understanding of the concept and the original literature is identified and minimized over several revisions. Inspired by this, the paper introduces Forward-Cooperation-Backward (FCB) learning in a deep neural network framework mimicking the human nature of learning a new concept. A novel deep neural network architecture, called the Multi Encoding Uni Decoding neural network model, has been designed which learns using the notion of FCB. A special lateral synaptic connection has also been introduced to realize cooperation. The models have been evaluated in terms of their performance in dimension reduction on four popular datasets. The ability to preserve the granular properties of data in low-rank embedding has been tested to assess the quality of dimension reduction. For downstream analyses, classification has also been performed. An experimental study on convergence analysis has been performed to establish the efficacy of the FCB learning strategy.

algorithm, dataset, meud-ff-coop, (14 more...)

arXiv.org Artificial Intelligence

2502.20113

Country:

North America > United States (0.14)
Asia > India > West Bengal > Kolkata (0.04)
Africa > Mali (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Scalable Methods for Nonnegative Matrix Factorizations of Near-separable Tall-and-skinny Matrices

Austin R. Benson, Jason D. Lee, Bartek Rajwa, David F. Gleich

Neural Information Processing Systems, Feb-9-2025, 21:54:19 GMT

Numerous algorithms are used for nonnegative matrix factorization under the assumption that the matrix is nearly separable. In this paper, we show how to make these algorithms scalable for data matrices that have many more rows than columns, so-called "tall-and-skinny matrices." One key component to these improved methods is an orthogonal matrix transformation that preserves the separability of the NMF problem. Our final methods need to read the data matrix only once and are suitable for streaming, multi-core, and MapReduce architectures. We demonstrate the efficacy of these algorithms on terabyte-sized matrices from scientific computing and bioinformatics.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Input Guided Multiple Deconstruction Single Reconstruction neural network models for Matrix Factorization

Dutta, Prasun, De, Rajat K.

arXiv.org Artificial Intelligence, May-22-2024

Referring back to the original text in the course of hierarchical learning is a common human trait that ensures the right direction of learning. The models developed in this paper, based on the concept of Non-negative Matrix Factorization (NMF), are inspired by this idea. They aim to deal with high-dimensional data by discovering its low rank approximation through a unique pair of factor matrices. The first model, named Input Guided Multiple Deconstruction Single Reconstruction neural network for Non-negative Matrix Factorization (IG-MDSR-NMF), ensures the non-negativity constraints of both factors. The second, Input Guided Multiple Deconstruction Single Reconstruction neural network for Relaxed Non-negative Matrix Factorization (IG-MDSR-RNMF), introduces a novel idea of factorization with only the basis matrix adhering to the non-negativity criteria. This relaxed version helps the model to learn a more enriched low dimensional embedding of the original data matrix. The competency of both models in preserving the local structure of data in their low rank embeddings has been verified. The superiority of the low dimensional embedding over the original data, which justifies the need for dimension reduction, has been established. The primacy of both models has also been validated by comparing their performance separately with that of nine other established dimension reduction algorithms on five popular datasets. Moreover, the computational complexity of the models and a convergence analysis have also been presented, testifying to the supremacy of the models.
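
For reference, the standard NMF problem that these models build on can be written as below. This is the textbook formulation, not the objective of the paper's networks; the symbols $X$, $W$, $H$, $m$, $n$, and $r$ are notation assumed here for illustration.

$$\min_{W \ge 0,\; H \ge 0} \; \lVert X - WH \rVert_F^2, \qquad X \in \mathbb{R}_{\ge 0}^{m \times n},\quad W \in \mathbb{R}^{m \times r},\quad H \in \mathbb{R}^{r \times n},\quad r \ll \min(m, n).$$

Under this notation, and assuming $W$ plays the role of the basis matrix, the relaxed variant described in the abstract would keep $W \ge 0$ while dropping the sign constraint on $H$.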

dataset, dimension reduction technique, ig-mdsr-rnmf, (11 more...)

arXiv.org Artificial Intelligence

2405.13449

Country:

Asia > India > West Bengal > Kolkata (0.04)
North America > United States > Minnesota (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Education (0.67)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Scalable Methods for Nonnegative Matrix Factorizations of Near separable Tall and skinny Matrices

Neural Information Processing Systems, Mar-13-2024, 13:08:13 GMT

algorithm, factorization, matrix, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Non-linear dimension reduction in factor-augmented vector autoregressions

Klieber, Karin

arXiv.org Machine Learning, Sep-9-2023

The COVID-19 pandemic is among the most severe health, economic and social crises in recent decades and poses the greatest challenge to the world economy since World War II. The virus has spread around the globe and paralyzed entire economic sectors and activities. For economic modeling, the COVID-19 pandemic entails dealing with huge, unprecedented outliers in datasets, which adversely affect the reliability of established, mostly linear, economic models. To the detriment of those commonly used models, economic indicators and variables are prone to unanticipated movements and do not respond in the way they are supposed to. Large shifts in the level of certain variables and strong deviations from their usual paths clearly aggravate the challenge of handling large outliers within existing econometric models.

artificial intelligence, latent factor, machine learning, (16 more...)

arXiv.org Machine Learning

2309.04821

Country:

Europe > United Kingdom > England (0.14)
North America > United States > Texas (0.14)
North America > United States > Oklahoma (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Banking & Finance > Trading (1.00)
Banking & Finance > Economy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
(3 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

RMFGP: Rotated Multi-fidelity Gaussian process with Dimension Reduction for High-dimensional Uncertainty Quantification

Zhang, Jiahao, Zhang, Shiqi, Lin, Guang

arXiv.org Machine Learning, Apr-10-2022

Multi-fidelity modelling arises in many situations in computational science and engineering. It enables accurate inference even when only a small set of accurate data is available. Those data often come from a high-fidelity model, which is computationally expensive. By combining the realizations of the high-fidelity model with one or more low-fidelity models, the multi-fidelity method can make accurate predictions of quantities of interest. This paper proposes a new dimension reduction framework based on rotated multi-fidelity Gaussian process regression and a Bayesian active learning scheme for when the available precise observations are insufficient. By drawing samples from the trained rotated multi-fidelity model, the so-called supervised dimension reduction problems can be solved following the idea of the sliced average variance estimation (SAVE) method combined with a Gaussian process regression dimension reduction technique. The general framework we develop can effectively solve high-dimensional problems when the data are insufficient for applying traditional dimension reduction methods. Moreover, a more accurate surrogate Gaussian process model of the original problem can be obtained based on our trained model. The effectiveness of the proposed rotated multi-fidelity Gaussian process (RMFGP) model is demonstrated in four numerical examples. The results show that our method has better performance in all cases, and uncertainty propagation analysis is performed for the last two cases, which involve stochastic partial differential equations.

artificial intelligence, machine learning, rmfgp, (15 more...)

arXiv.org Machine Learning

2204.04819

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Applying PCA to Stocks

#artificialintelligence, Jul-17-2021, 02:15:36 GMT

This blog post is a summary of a data science project I worked on a few months ago. This was another attempt at trying to understand hidden trends in the stock market. Hopefully the results will show you how "non-linear", complex, and unpredictable the market can be. Introduction: A stock's time series can be thought of as some realization of an underlying trend with added stochasticity or "noise". So surely, for an appropriate window of time, one can group bunches of stocks that are moving with an underlying trend.
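
The "underlying trend plus noise" picture can be illustrated with a minimal PCA sketch. This is not the blog author's code: the data are synthetic returns generated from one shared factor, the helper names `covariance_matrix` and `top_component` are hypothetical, and power iteration stands in for a full eigendecomposition to keep the example dependency-free.

```python
import math
import random


def covariance_matrix(series):
    """Sample covariance of equal-length return series (one list per stock)."""
    n = len(series[0])
    means = [sum(s) / n for s in series]
    centered = [[x - m for x in s] for s, m in zip(series, means)]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
        for ci in centered
    ]


def top_component(cov, iters=500, seed=0):
    """Leading eigenvector and eigenvalue of a symmetric PSD matrix (power iteration)."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in cov]  # strictly positive starting vector
    for _ in range(iters):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v (v has unit length after the loop).
    eigenvalue = sum(
        vi * sum(c * x for c, x in zip(row, v)) for vi, row in zip(v, cov)
    )
    return v, eigenvalue


# Synthetic daily returns (an assumption): one shared "market" trend plus
# independent noise for each of five hypothetical stocks.
rng = random.Random(1)
market = [rng.gauss(0, 0.01) for _ in range(250)]
stocks = [[m + rng.gauss(0, 0.003) for m in market] for _ in range(5)]

cov = covariance_matrix(stocks)
loadings, variance = top_component(cov)
explained = variance / sum(cov[i][i] for i in range(len(cov)))
print("first-component loadings:", [round(x, 3) for x in loadings])
print("share of variance explained:", round(explained, 3))
```

When the stocks really do share a trend, the first component loads on all of them with the same sign and captures most of the variance. Real market data is much messier, which is the point the post goes on to make.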

eigenstock, pca, variance, (13 more...)

#artificialintelligence

Industry: Banking & Finance > Trading (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.75)

CS 229 - Unsupervised Learning Cheatsheet

#artificialintelligence, Mar-15-2021, 06:25:10 GMT

Motivation: The goal of unsupervised learning is to find hidden patterns in unlabeled data $\{x^{(1)},...,x^{(m)}\}$. Jensen's inequality: Let $f$ be a convex function and $X$ a random variable. Latent variables: Latent variables are hidden/unobserved variables that make estimation problems difficult, and are often denoted $z$. We denote by $c^{(i)}$ the cluster of data point $i$ and by $\mu_j$ the center of cluster $j$. Algorithm: After randomly initializing the cluster centroids $\mu_1,\mu_2,...,\mu_k\in\mathbb{R}^n$, the $k$-means algorithm repeats the following step until convergence: Algorithm: It is a clustering algorithm with an agglomerative hierarchical approach that builds nested clusters in a successive manner. In an unsupervised learning setting, it is often hard to assess the performance of a model since we don't have the ground truth labels as was the case in the supervised learning setting.
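
The excerpt omits the $k$-means step itself, so the sketch below uses the standard Lloyd iteration (assign each point to its nearest centroid, then recompute each centroid as the mean of its points), which may differ in detail from the cheatsheet's wording. The function name `kmeans`, its parameters, and the toy data are assumptions made for illustration.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Standard k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    # Initialize the centroids mu_1..mu_k as k distinct data points (one common choice).
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iters):
        # Assignment step: c^(i) is the index of the centroid nearest to x^(i).
        new_assignments = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so the algorithm has converged
        assignments = new_assignments
        # Update step: mu_j becomes the mean of the points assigned to cluster j.
        for j in range(k):
            members = [p for p, c in zip(points, assignments) if c == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments


# Two well-separated toy groups in the plane.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(data, k=2)
print(centers, labels)
```

Which cluster receives index 0 depends on the random initialization, so only the grouping, not the label values, should be relied upon.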

algorithm, matrix, unsupervised learning cheatsheet, (8 more...)

#artificialintelligence

Country: North America > United States > California > Santa Clara County > Palo Alto (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.54)
