AITopics | positive eigenvalue

Collaborating Authors

positive eigenvalue

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Dynamical Properties of Tokens in Self-Attention and Effects of Positional Encoding

Neural Information Processing SystemsJun-23-2026, 01:19:35 GMT

This paper investigates the dynamical properties of tokens in pre-trained Transformer models and explores their application to improving Transformers. To this end, we analyze the dynamical system governing the continuous-time limit of the pre-trained model and characterize the asymptotic behavior of its solutions. Specifically, we characterize when tokens move closer to or farther from one another over time, depending on the model parameters. We provide sufficient conditions, based on these parameters, to identify scenarios where tokens either converge to zero or diverge to infinity. Unlike prior works, our conditions are broader in scope and more applicable to real-world models. Furthermore, we investigate how different forms of positional encoding - specifically absolute and rotary - affect these dynamical regimes. Empirical evidence reveals that the convergence scenario adversely impacts model performance. Motivated by these insights, we propose simple refinements to Transformer architectures that mitigate convergence behavior in models with absolute or rotary positional encoding. These findings support theoretical foundations and design principles for improving Transformer models.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia (1.00)
North America > United States > Minnesota (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.87)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Dynamical Properties of Tokens in Self-Attention and Effects of Positional Encoding

Pham, Duy-Tung, Nguyen, An The, Tran, Viet-Hoang, Chung, Nhan-Phu, Tong, Xin T., Nguyen, Tan M., Vo, Thieu N.

arXiv.org Artificial IntelligenceDec-4-2025

This paper investigates the dynamical properties of tokens in pre-trained Transformer models and explores their application to improving Transformers. To this end, we analyze the dynamical system governing the continuous-time limit of the pre-trained model and characterize the asymptotic behavior of its solutions. Specifically, we characterize when tokens move closer to or farther from one another over time, depending on the model parameters. We provide sufficient conditions, based on these parameters, to identify scenarios where tokens either converge to zero or diverge to infinity. Unlike prior works, our conditions are broader in scope and more applicable to real-world models. Furthermore, we investigate how different forms of positional encoding -- specifically absolute and rotary -- affect these dynamical regimes. Empirical evidence reveals that the convergence scenario adversely impacts model performance. Motivated by these insights, we propose simple refinements to Transformer architectures that mitigate convergence behavior in models with absolute or rotary positional encoding. These findings support theoretical foundations and design principles for improving Transformer models.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.03058

Country:

Asia (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Neuc-MDS: Non-Euclidean Multidimensional Scaling Through Bilinear Forms

Neural Information Processing SystemsNov-20-2025, 05:14:28 GMT

We introduce Non-Euc lidean-MDS (Neuc-MDS), an extension of classical Multidimensional Scaling (MDS) that accommodates non-Euclidean and non-metric inputs. The main idea is to generalize the standard inner product to symmetric bilinear forms to utilize the negative eigenvalues of dissimilarity Gram matrices. Neuc-MDS efficiently optimizes the choice of (both positive and negative) eigenvalues of the dissimilarity Gram matrix to reduce STRESS, the sum of squared pairwise error. We provide an in-depth error analysis and proofs of the optimality in minimizing lower bounds of STRESS. We demonstrate Neuc-MDS's ability to address limitations of classical MDS raised by prior research, and test it on various synthetic and real-world datasets in comparison with both linear and non-linear dimension reduction methods.

artificial intelligence, eigenvalue, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)

Add feedback

dc36c213f20300d1381520b0ce0c7788-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 18:38:55 GMT

dataset, eigenvalue, matrix, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)

Add feedback

Memorisation and forgetting in a learning Hopfield neural network: bifurcation mechanisms, attractors and basins

Essex, Adam E., Janson, Natalia B., Norris, Rachel A., Balanov, Alexander G.

arXiv.org Artificial IntelligenceAug-15-2025

Despite explosive expansion of artificial intelligence based on artificial neural networks (ANNs), these are employed as "black boxes'', as it is unclear how, during learning, they form memories or develop unwanted features, including spurious memories and catastrophic forgetting. Much research is available on isolated aspects of learning ANNs, but due to their high dimensionality and non-linearity, their comprehensive analysis remains a challenge. In ANNs, knowledge is thought to reside in connection weights or in attractor basins, but these two paradigms are not linked explicitly. Here we comprehensively analyse mechanisms of memory formation in an 81-neuron Hopfield network undergoing Hebbian learning by revealing bifurcations leading to formation and destruction of attractors and their basin boundaries. We show that, by affecting evolution of connection weights, the applied stimuli induce a pitchfork and then a cascade of saddle-node bifurcations creating new attractors with their basins that can code true or spurious memories, and an abrupt disappearance of old memories (catastrophic forgetting). With successful learning, new categories are represented by the basins of newly born point attractors, and their boundaries by the stable manifolds of new saddles. With this, memorisation and forgetting represent two manifestations of the same mechanism. Our strategy to analyse high-dimensional learning ANNs is universal and applicable to recurrent ANNs of any form. The demonstrated mechanisms of memory formation and of catastrophic forgetting shed light on the operation of a wider class of recurrent ANNs and could aid the development of approaches to mitigate their flaws.

artificial intelligence, bifurcation, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2508.10765

Country: Europe > United Kingdom > England (0.46)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology (0.46)
Transportation (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Neuc-MDS: Non-Euclidean Multidimensional Scaling Through Bilinear Forms

Deng, Chengyuan, Gao, Jie, Lu, Kevin, Luo, Feng, Sun, Hongbin, Xin, Cheng

arXiv.org Machine LearningDec-28-2024

We introduce Non-Euclidean-MDS (Neuc-MDS), an extension of classical Multidimensional Scaling (MDS) that accommodates non-Euclidean and non-metric inputs. The main idea is to generalize the standard inner product to symmetric bilinear forms to utilize the negative eigenvalues of dissimilarity Gram matrices. Neuc-MDS efficiently optimizes the choice of (both positive and negative) eigenvalues of the dissimilarity Gram matrix to reduce STRESS, the sum of squared pairwise error. We provide an in-depth error analysis and proofs of the optimality in minimizing lower bounds of STRESS. We demonstrate Neuc-MDS's ability to address limitations of classical MDS raised by prior research, and test it on various synthetic and real-world datasets in comparison with both linear and non-linear dimension reduction methods.

artificial intelligence, eigenvalue, machine learning, (18 more...)

arXiv.org Machine Learning

2411.10889

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Understanding Memorization in Generative Models via Sharpness in Probability Landscapes

Jeon, Dongjae, Kim, Dueun, No, Albert

arXiv.org Artificial IntelligenceDec-5-2024

In this paper, we introduce a geometric framework to analyze memorization in diffusion models using the eigenvalues of the Hessian of the log probability density. We propose that memorization arises from isolated points in the learned probability distribution, characterized by sharpness in the probability landscape, as indicated by large negative eigenvalues of the Hessian. Through experiments on various datasets, we demonstrate that these eigenvalues effectively detect and quantify memorization. Our approach provides a clear understanding of memorization in diffusion models and lays the groundwork for developing strategies to ensure secure and reliable generative models

diffusion model, eigenvalue, memorization, (13 more...)

arXiv.org Artificial Intelligence

2412.0414

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Add feedback

Reviews: How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?

Neural Information Processing SystemsOct-8-2024, 10:27:15 GMT

Summary It has recently been proved that simple non-convex algorithms can recover a low-rank matrix from linear measurements, provided that the measurement operator satisfies a Restricted Isometry Property, with a good enough constant. Indeed, in this case, the natural objective function associated with the problem has no second-order critical point other than the solution. In this article, the authors ask how good the constant must be so that this property holds. They propose a convex program that finds measurement operators satisfying RIP, but for which the objective function has "bad" second-order critical points. They analyze this program in the case where the matrix to be recovered has rank 1, and deduce from this analysis that, for any x,z, there exists a RIP operator such that x is a bad second-order critical point when the true unknown is zz*; they upper bound the RIP constant.

critical point, equation, second-order critical point, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Add feedback

An Interpretable Evaluation of Entropy-based Novelty of Generative Models

Zhang, Jingwei, Li, Cheuk Ting, Farnia, Farzan

arXiv.org Machine LearningFeb-27-2024

The massive developments of generative model frameworks and architectures require principled methods for the evaluation of a model's novelty compared to a reference dataset or baseline generative models. While the recent literature has extensively studied the evaluation of the quality, diversity, and generalizability of generative models, the assessment of a model's novelty compared to a baseline model has not been adequately studied in the machine learning community. In this work, we focus on the novelty assessment under multi-modal generative models and attempt to answer the following question: Given the samples of a generative model $\mathcal{G}$ and a reference dataset $\mathcal{S}$, how can we discover and count the modes expressed by $\mathcal{G}$ more frequently than in $\mathcal{S}$. We introduce a spectral approach to the described task and propose the Kernel-based Entropic Novelty (KEN) score to quantify the mode-based novelty of distribution $P_\mathcal{G}$ with respect to distribution $P_\mathcal{S}$. We analytically interpret the behavior of the KEN score under mixture distributions with sub-Gaussian components. Next, we develop a method based on Cholesky decomposition to compute the KEN score from observed samples. We support the KEN-based quantification of novelty by presenting several numerical results on synthetic and real image distributions. Our numerical results indicate the success of the proposed approach in detecting the novel modes and the comparison of state-of-the-art generative models.

eigenvalue, generative model, novel mode, (13 more...)

arXiv.org Machine Learning

2402.17287

Country:

Asia > China > Hong Kong (0.04)
North America > United States (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Seven Sins of Numerical Linear Algebra

#artificialintelligenceNov-14-2022, 02:05:53 GMT

Symmetric positive definite matrices (symmetric matrices with positive eigenvalues) are ubiquitous, not least because they arise in the solution of many minimization problems. However, a matrix that is supposed to be positive definite may fail to be so for a variety of reasons. Missing or inconsistent data in forming a covariance matrix or a correlation matrix can cause a loss of definiteness, and rounding errors can cause a tiny positive eigenvalue to go negative. The best way to check definiteness is to compute a Cholesky factorization, which is often needed anyway. The MATLAB function chol returns an error message if the factorization fails, and a second output argument can be requested, which is set to the number of the stage on which the factorization failed, or to zero if the factorization succeeded.

matrix, numerical linear algebra, positive eigenvalue, (6 more...)

#artificialintelligence

Technology:

Information Technology > Mathematics of Computing (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

Add feedback