AITopics | data dimensionality

data dimensionality

Generative artificial intelligence is now a widely used tool in molecular science. Despite the popularity of probabilistic generative models, numerical experiments benchmarking their performance on molecular data are lacking. In this work, we introduce and explain several classes of generative models, broadly sorted into two categories: flow-based models and diffusion models. We select three representative models: Neural Spline Flows, Conditional Flow Matching, and Denoising Diffusion Probabilistic Models, and examine their accuracy, computational cost, and generation speed across datasets with tunable dimensionality, complexity, and modal asymmetry. Our findings are varied, with no one framework being the best for all purposes. In a nutshell, (i) Neural Spline Flows do best at capturing mode asymmetry present in low-dimensional data, (ii) Conditional Flow Matching outperforms other models for high-dimensional data with low complexity, and (iii) Denoising Diffusion Probabilistic Models appear best for low-dimensional data with high complexity. Our datasets include a Gaussian mixture model and the dihedral torsion angle distribution of the Aib9 peptide, generated via a molecular dynamics simulation. We hope our taxonomy of probabilistic generative frameworks and numerical results may guide model selection for a wide range of molecular tasks.
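As a rough illustration of what a Gaussian mixture dataset with tunable dimensionality and mode asymmetry could look like, the sketch below samples from a two-mode mixture. The function name, the parameters (`weight_a`, `separation`), and the choice of exactly two unit-variance modes are assumptions made for illustration; the abstract does not specify how the benchmark datasets were constructed.

```python
import numpy as np

def sample_two_mode_gmm(n_samples, dim, weight_a=0.8, separation=4.0, seed=0):
    """Sample from a hypothetical two-mode Gaussian mixture.

    dim        -- data dimensionality (tunable)
    weight_a   -- probability mass of mode A; 0.5 is symmetric,
                  values away from 0.5 make the modes asymmetric
    separation -- distance between the two mode centers
    """
    rng = np.random.default_rng(seed)
    # Mode centers sit at -separation/2 and +separation/2 on the first axis.
    centers = np.zeros((2, dim))
    centers[0, 0] = -separation / 2
    centers[1, 0] = separation / 2
    # Label 0 (mode A) is drawn with probability weight_a, label 1 otherwise.
    labels = (rng.random(n_samples) >= weight_a).astype(int)
    # Each sample is its mode center plus isotropic unit-variance noise.
    samples = centers[labels] + rng.normal(size=(n_samples, dim))
    return samples, labels

samples, labels = sample_two_mode_gmm(10_000, dim=8, weight_a=0.8)
frac_a = float(np.mean(labels == 0))  # empirical weight of mode A
```

Comparing `frac_a` in generated samples against the target `weight_a` is one simple way to check whether a generative model preserves mode asymmetry.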

dataset, dimensionality, free energy difference

arXiv.org Artificial Intelligence

2411.09388

Country:

North America > United States > Maryland > Prince George's County > College Park (0.15)
North America > United States > Maryland > Montgomery County (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.84)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Approximate UMAP allows for high-rate online visualization of high-dimensional data streams

Wassenaar, Peter, Guetschel, Pierre, Tangermann, Michael

arXiv.org Artificial Intelligence, Apr-5-2024

In the BCI field, introspection and interpretation of brain signals are desired to provide feedback or to guide rapid paradigm prototyping but are challenging due to the high noise level and dimensionality of the signals. Deep neural networks are often introspected by transforming their learned feature representations into 2- or 3-dimensional subspace visualizations using projection algorithms like Uniform Manifold Approximation and Projection (UMAP). Unfortunately, these methods are computationally expensive, making the projection of data streams in real-time a non-trivial task. In this study, we introduce a novel variant of UMAP, called approximate UMAP (aUMAP). It aims at generating rapid projections for real-time introspection. To study its suitability for real-time projecting, we benchmark the method against standard UMAP and its neural network counterpart parametric UMAP. Our results show that approximate UMAP delivers projections that replicate the projection space of standard UMAP while reducing projection time by an order of magnitude and maintaining the same training time.
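The abstract does not describe how the approximation works. One generic way to speed up out-of-sample projection, shown here purely as an assumption and not as the authors' confirmed method, is to place each new point at the inverse-distance-weighted average of the embeddings of its nearest training neighbors. The function `approx_project` and the stand-in embedding `Y_train` are hypothetical; in practice `Y_train` would come from a fitted UMAP model.

```python
import numpy as np

def approx_project(x_new, X_train, Y_train, k=5, eps=1e-12):
    """Approximate the low-dimensional position of x_new (hypothetical helper).

    X_train -- training points in the original feature space
    Y_train -- their precomputed low-dimensional embeddings
    k       -- number of nearest training neighbors to average over
    """
    # Euclidean distance from the new point to every training point.
    d = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k smallest distances (unordered among themselves).
    idx = np.argpartition(d, k)[:k]
    # Closer neighbors get larger weights; eps avoids division by zero.
    w = 1.0 / (d[idx] + eps)
    w /= w.sum()
    # Weighted average of the neighbors' embeddings.
    return w @ Y_train[idx]

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 20))
# Placeholder embedding: the first two features stand in for a real UMAP fit.
Y_train = X_train[:, :2]

y0 = approx_project(X_train[0], X_train, Y_train)           # a training point
y_new = approx_project(rng.normal(size=20), X_train, Y_train)  # an unseen point
```

Because the result is a convex combination of existing embeddings, a new point always lands inside the region already covered by the training projection, which is consistent with replicating an existing projection space rather than extending it.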

aumap, projection, pumap

arXiv.org Artificial Intelligence

2404.04001

Country:

North America > United States > Wisconsin (0.05)
Europe > Netherlands > Gelderland > Nijmegen (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.74)

Dimensionality Reduction for Machine Learning

#artificialintelligence, Sep-19-2021, 22:30:05 GMT

What is high-dimensional data? How does it affect your machine learning models? Have you ever wondered why your model isn't meeting your expectations, even after you have tuned the hyperparameters to the ends of the earth with no improvement? Understanding your data and your model may be key. Underneath such an immense and complicated hood, you may be concerned that there are few to no ways of gaining more insight into your data, as well as your model.

dataset, feature reduction, reduction

#artificialintelligence

Country: North America > United States > Iowa > Story County > Ames (0.05)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.43)

Random Projection in Deep Neural Networks

Wójcik, Piotr Iwo

arXiv.org Machine Learning, Dec-22-2018

This work investigates the ways in which deep learning methods can benefit from random projection (RP), a classic linear dimensionality reduction method. We focus on two areas where, as we have found, employing RP techniques can improve deep models: training neural networks on high-dimensional data and initialization of network parameters. Training deep neural networks (DNNs) on sparse, high-dimensional data with no exploitable structure implies a network architecture with an input layer that has a huge number of weights, which often makes training infeasible. We show that this problem can be solved by prepending the network with an input layer whose weights are initialized with an RP matrix. We propose several modifications to the network architecture and training regime that make it possible to efficiently train DNNs with a learnable RP layer on data with as many as tens of millions of input features and training examples. In comparison to the state-of-the-art methods, neural networks with an RP layer achieve competitive performance or improve the results on several extremely high-dimensional real-world datasets. The second area where the application of RP techniques can be beneficial for training deep models is weight initialization. Setting the initial weights in DNNs to elements of various RP matrices enabled us to train residual deep networks to higher levels of performance.
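A minimal sketch of the basic random projection idea follows, assuming a dense Gaussian RP matrix; the abstract mentions "various RP matrices" without naming them, so this choice, the function name `gaussian_rp_matrix`, and the sizes used are illustrative assumptions. The matrix `R` plays the role of the initial weights of a linear input layer that maps high-dimensional inputs to a much smaller representation.

```python
import numpy as np

def gaussian_rp_matrix(n_features, n_components, seed=0):
    """Dense Gaussian random projection matrix (one possible RP scheme).

    Entries are drawn from N(0, 1/n_components), so squared Euclidean
    norms are preserved in expectation after projection.
    """
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(n_components)
    return rng.normal(0.0, scale, size=(n_features, n_components))

# Toy high-dimensional inputs: 50 examples with 5,000 features each.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5_000))

# R could serve as the initial weight matrix of a linear first layer.
R = gaussian_rp_matrix(n_features=5_000, n_components=512)
Z = X @ R  # projected inputs, shape (50, 512)

# Ratio of projected to original norms; values near 1 mean the
# geometry of the data is roughly preserved.
norm_ratio = np.linalg.norm(Z, axis=1) / np.linalg.norm(X, axis=1)
```

In a learnable RP layer of the kind the abstract describes, `R` would only be the starting point and would then be updated by training; the sketch covers the initialization step alone.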

error backpropagation, feature selection method, initializing deep network

arXiv.org Machine Learning

1812.09489

Country: