AITopics | pca component

Collaborating Authors

pca component

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Model Merging on Loss Landscape: A Geometry Perspective

Lu, Juanwu, Bhaskar, Anand, Axelrod, Brian, Tolstaya, Ekaterina, Emrich, Tristan

arXiv.org Machine LearningMay-27-2026

Model merging offers a promising avenue for knowledge integration and parallel development without retraining. Yet, existing methods either ignore the geometry of the loss landscape or rely on intractable full-space Hessian approximations. We propose EpiMer, a framework that casts model merging as solving the Fréchet mean on a Riemannian manifold and restricts the computation to a low-rank subspace spanned by the task vectors. With the expected Hessian as the metric, we reveal a connection between local curvature and epistemic uncertainty of the parameters. Our theoretical analysis decomposes the merging error bound into the subspace Fréchet variance and the residual energy, and provides a closed-form characterization of when curvature-aware merging provably outperforms flat-geometry methods. In addition, our framework unifies both curvature-aware methods and recent spectral methods as special cases of the subspace Fréchet mean with different geometric metrics. Merging fine-tuned CLIP-ViT models on eight image classification tasks, Epistemic Merging strictly outperforms the baselines on all three CLIP-ViT backbones at matched rank, improving the across-task average accuracy and worst-task accuracy on every backbone.

artificial intelligence, epimer, machine learning, (19 more...)

arXiv.org Machine Learning

2605.26693

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

PCA of high dimensional random walks with comparison to neural network training

Neural Information Processing SystemsMar-16-2026, 22:28:47 GMT

One technique to visualize the training of neural networks is to perform PCA on the parameters over the course of training and to project to the subspace spanned by the first few PCA components. In this paper we compare this technique to the PCA of a high dimensional random walk. We compute the eigenvalues and eigenvectors of the covariance of the trajectory and prove that in the long trajectory and high dimensional limit most of the variance is in the first few PCA components, and that the projection of the trajectory onto any subspace spanned by PCA components is a Lissajous curve. We generalize these results to a random walk with momentum and to an Ornstein-Uhlenbeck processes (i.e., a random walk in a quadratic potential) and show that in high dimensions the walk is not mean reverting, but will instead be trapped at a fixed distance from the minimum. We finally analyze PCA projected training trajectories for: a linear model trained on CIFAR-10; a fully connected model trained on MNIST; and ResNet-50-v2 trained on Imagenet. In all cases, both the distribution of PCA eigenvalues and the projected trajectories resemble those of a random walk with drift.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.43)

Add feedback

PCA of high dimensional random walks with comparison to neural network training

Joseph Antognini, Jascha Sohl-Dickstein

Neural Information Processing SystemsFeb-13-2026, 08:40:36 GMT

Neural Information Processing Systems http://nips.cc/

pca component, random walk, trajectory, (13 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

PCA of high dimensional random walks with comparison to neural network training

Joseph Antognini, Jascha Sohl-Dickstein

Neural Information Processing SystemsNov-20-2025, 23:32:43 GMT

Deep neural networks (NNs) are extremely high dimensional objects.

artificial intelligence, machine learning, random walk, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

PCA of high dimensional random walks with comparison to neural network training

Neural Information Processing SystemsNov-20-2025, 22:27:57 GMT

high dimensional random walk, name change, random walk, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.43)

Add feedback

Self-supervised Synthetic Pretraining for Inference of Stellar Mass Embedded in Dense Gas

Hirashima, Keiya, Nozaki, Shingo, Harada, Naoto

arXiv.org Artificial IntelligenceOct-29-2025

Stellar mass is a fundamental quantity that determines the properties and evolution of stars. However, estimating stellar masses in star-forming regions is challenging because young stars are obscured by dense gas and the regions are highly inhomogeneous, making spherical dynamical estimates unreliable. Supervised machine learning could link such complex structures to stellar mass, but it requires large, high-quality labeled datasets from high-resolution magneto-hydrodynamical (MHD) simulations, which are computationally expensive. We address this by pretraining a vision transformer on one million synthetic fractal images using the self-supervised framework DINOv2, and then applying the frozen model to limited high-resolution MHD simulations. Our results demonstrate that synthetic pretraining improves frozen-feature regression stellar mass predictions, with the pretrained model performing slightly better than a supervised model trained on the same limited simulations. Principal component analysis of the extracted features further reveals semantically meaningful structures, suggesting that the model enables unsupervised segmentation of star-forming regions without the need for labeled data or fine-tuning.

artificial intelligence, machine learning, simulation, (15 more...)

arXiv.org Artificial Intelligence

2510.24159

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.14)

Genre: Research Report > New Finding (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

VLSI Hypergraph Partitioning with Deep Learning

Khan, Muhammad Hadir, Onal, Bugra, Dogan, Eren, Guthaus, Matthew R.

arXiv.org Artificial IntelligenceSep-2-2024

Partitioning is a known problem in computer science and is critical in chip design workflows, as advancements in this area can significantly influence design quality and efficiency. Deep Learning (DL) techniques, particularly those involving Graph Neural Networks (GNNs), have demonstrated strong performance in various node, edge, and graph prediction tasks using both inductive and transductive learning methods. A notable area of recent interest within GNNs are pooling layers and their application to graph partitioning. While these methods have yielded promising results across social, computational, and other random graphs, their effectiveness has not yet been explored in the context of VLSI hypergraph netlists. In this study, we introduce a new set of synthetic partitioning benchmarks that emulate real-world netlist characteristics and possess a known upper bound for solution cut quality. We distinguish these benchmarks with the prior work and evaluate existing state-of-the-art partitioning algorithms alongside GNN-based approaches, highlighting their respective advantages and disadvantages.

graph, node, partition, (16 more...)

arXiv.org Artificial Intelligence

2409.01387

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
Asia (0.14)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Semiconductors & Electronics (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)

Add feedback

Classifying point clouds at the facade-level using geometric features and deep learning networks

Tan, Yue, Wysocki, Olaf, Hoegner, Ludwig, Stilla, Uwe

arXiv.org Artificial IntelligenceFeb-9-2024

3D building models with facade details are playing an important role in many applications now. Classifying point clouds at facade-level is key to create such digital replicas of the real world. However, few studies have focused on such detailed classification with deep neural networks. We propose a method fusing geometric features with deep learning networks for point cloud classification at facade-level. Our experiments conclude that such early-fused features improve deep learning methods' performance. This method can be applied for compensating deep learning networks' ability in capturing local geometric information and promoting the advancement of semantic segmentation.

classification, geometric feature, point cloud, (15 more...)

arXiv.org Artificial Intelligence

2402.06506

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.06)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Neural Network Characterization and Entropy Regulated Data Balancing through Principal Component Analysis

Yevick, David, Hutchison, Karolina

arXiv.org Artificial IntelligenceDec-3-2023

This paper examines the relationship between the behavior of a neural network and the distribution formed from the projections of the data records into the space spanned by the low-order principal components of the training data. For example, in a benchmark calculation involving rotated and unrotated MNIST digits, classes (digits) that are mapped far from the origin in a low-dimensional principal component space and that overlap minimally with other digits converge rapidly and exhibit high degrees of accuracy in neural network calculations that employ the associated components of each data record as inputs. Further, if the space spanned by these low-order principal components is divided into bins and the input data records that are mapped into a given bin averaged, the resulting pattern can be distinguished by its geometric features which interpolate between those of adjacent bins in an analogous manner to variational autoencoders. Based on this observation, a simply realized data balancing procedure can be realized by evaluating the entropy associated with each histogram bin and subsequently repeating the original image data associated with the bin by a number of times that is determined from this entropy.

accuracy, digit, neural network, (15 more...)

arXiv.org Artificial Intelligence

2312.01392

Country:

North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

The Double Helix inside the NLP Transformer

Lu, Jason H. J., Guo, Qingzhen

arXiv.org Artificial IntelligenceJun-23-2023

We introduce a framework for analyzing various types of information in an NLP Transformer. In this approach, we distinguish four layers of information: positional, syntactic, semantic, and contextual. We also argue that the common practice of adding positional information to semantic embedding is sub-optimal and propose instead a Linear-and-Add approach. Our analysis reveals an autogenetic separation of positional information through the deep layers. We show that the distilled positional components of the embedding vectors follow the path of a helix, both on the encoder side and on the decoder side. We additionally show that on the encoder side, the conceptual dimensions generate Part-of-Speech (PoS) clusters. On the decoder side, we show that a di-gram approach helps to reveal the PoS clusters of the next token. Our approach paves a way to elucidate the processing of information through the deep layers of an NLP Transformer.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2306.13817

Country:

North America > United States > Missouri > St. Louis County > St. Louis (0.14)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Add feedback