Tuning Universality in Deep Neural Networks
Deep neural networks (DNNs) exhibit crackling-like avalanches whose origin lacks a mechanistic explanation. Here, I derive a stochastic theory of deep information propagation (DIP) by incorporating Central Limit Theorem (CLT)-level fluctuations. Four effective couplings $(r, h, D_1, D_2)$ characterize the dynamics, yielding a Landau description of the static exponents and a Directed Percolation (DP) structure of activity cascades. Tuning the couplings selects between avalanche dynamics generated by a Brownian Motion (BM) in a logarithmic trap and an absorbed free BM, each corresponding to a distinct universality class. Numerical simulations confirm the theory and demonstrate that activation function design controls the collective dynamics in random DNNs.
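The deterministic backbone of DIP can be illustrated with a short numerical sketch (assumptions: a wide, fully connected random network with i.i.d. Gaussian weights; the function and parameter names below are illustrative, not taken from the paper). Iterating the mean-field variance map shows how the activation function and weight scale tune the network between an ordered and a chaotic fixed point, the knob on which the fluctuation theory builds:

```python
import numpy as np

def variance_map(q, phi, sigma_w=1.0, sigma_b=0.0, n_samples=100_000, seed=0):
    """One step of the mean-field recursion for the pre-activation variance q
    in a wide random network:
        q_{l+1} = sigma_w^2 * E[phi(sqrt(q_l) * z)^2] + sigma_b^2,  z ~ N(0, 1).
    The Gaussian expectation is estimated by Monte Carlo."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)
    return sigma_w**2 * np.mean(phi(np.sqrt(q) * z) ** 2) + sigma_b**2

def depth_limit(q0, phi, depth=100, **kwargs):
    """Iterate the map to approximate the deep-network fixed point q*."""
    q = q0
    for _ in range(depth):
        q = variance_map(q, phi, **kwargs)
    return q

# tanh at sigma_w = 1 flows toward the ordered fixed point q* = 0;
# at sigma_w = 2 it settles at a chaotic fixed point q* > 0
q_ordered = depth_limit(1.0, np.tanh, depth=100, sigma_w=1.0)
q_chaotic = depth_limit(1.0, np.tanh, depth=100, sigma_w=2.0)
```

Changing `phi` (e.g., swapping `np.tanh` for a piecewise-linear activation) moves the fixed-point structure, which is the sense in which activation design controls the collective dynamics.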
Clarifying the Ti-V Phase Diagram Using First-Principles Calculations and Bayesian Learning
Miryashkin, Timofei, Klimanova, Olga, Shapeev, Alexander
Conflicting experiments disagree on whether the titanium-vanadium (Ti-V) binary alloy exhibits a body-centred cubic (BCC) miscibility gap or remains completely soluble. A leading hypothesis attributes the miscibility gap to oxygen contamination during alloy preparation. To resolve this disagreement, we use an ab initio + machine-learning workflow that couples an actively-trained Moment Tensor Potential with Bayesian inference of the free-energy surface. This workflow enables construction of the Ti-V phase diagram across the full composition range with systematically reduced statistical and finite-size errors. The resulting diagram reproduces all experimental features, demonstrating the robustness of our approach, and clearly favors the variant with a BCC miscibility gap terminating at T = 980 K and c = 0.67. Because our simulations model a perfectly oxygen-free Ti-V system, the observed gap cannot originate from impurity effects, in contrast to recent CALPHAD reassessments.
Beyond Disorder: Unveiling Cooperativeness in Multidirectional Associative Memories
Alessandrelli, Andrea, Barra, Adriano, Ladiana, Andrea, Lepre, Andrea, Ricci-Tersenghi, Federico
By leveraging tools from the statistical mechanics of complex systems, in these short notes we extend the architecture of a neural network for hetero-associative memory (called three-directional associative memories, TAM) to explore supervised and unsupervised learning protocols. In particular, by providing entropic-heterogeneous datasets to its various layers, we predict and quantify a new emergent phenomenon -- that we term {\em layer's cooperativeness} -- where the interplay of dataset entropies across the network's layers enhances their retrieval capabilities beyond those they would have without reciprocal influence. Naively, we would expect layers trained with less informative datasets to develop smaller retrieval regions than layers that experienced more information: this does not happen; instead, all the retrieval regions settle to the same amplitude, allowing for optimal retrieval performance globally. This cooperative dynamics marks a significant advancement in understanding emergent computational capabilities within disordered systems.
Reviews: Data-Dependence of Plateau Phenomenon in Learning with Neural Network --- Statistical Mechanical Analysis
- It would make more sense to show results for data with a low-dimensional structure, in which the first one or two eigenvalues are non-zero and the rest are either zero or epsilon-small. Do the conclusions for the two-eigenvalue case still hold in this example? It is hard for me to see what I should learn from Figures 5 and 6.
- The dependence of the learning dynamics on the spectral properties of the input data is not new and was previously studied by Saxe et al. (arXiv, 2013) for simple linear networks. It would be appropriate if these results were mentioned or discussed in the text.
- It has been previously shown that the initial conditions have a big impact on the trainability and learning dynamics of these networks. In this case, they would be defined as the initial conditions on the order parameters Q, R, and D.
- The analysis here seems tractable only for networks with a small number of hidden units.
Generalization vs. Specialization under Concept Shift
Nguyen, Alex, Schwab, David J., Ngampruetikorn, Vudtiwat
Machine learning models are often brittle under distribution shift, i.e., when data distributions at test time differ from those during training. Understanding this failure mode is central to identifying and mitigating safety risks of mass adoption of machine learning. Here we analyze ridge regression under concept shift -- a form of distribution shift in which the input-label relationship changes at test time. We derive an exact expression for prediction risk in the high-dimensional limit. Our results reveal nontrivial effects of concept shift on generalization performance, depending on the properties of robust and nonrobust features of the input. We show that test performance can exhibit a nonmonotonic data dependence, even when double descent is absent. Finally, our experiments on MNIST and FashionMNIST suggest that this intriguing behavior is also present in classification problems.
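As a finite-size illustration of the setting (not the paper's exact high-dimensional formula), the following sketch trains ridge regression on one teacher and evaluates it on a shifted teacher whose input-label relationship differs at test time; the sample sizes and the form of the shift are illustrative assumptions:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimator (X'X + lam I)^{-1} X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
n, d, lam = 200, 50, 1.0

# train on labels generated by teacher w_train
X = rng.standard_normal((n, d))
w_train = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_train + 0.1 * rng.standard_normal(n)
w_hat = ridge_fit(X, y, lam)

# concept shift: at test time, half of the teacher coordinates flip sign
w_shift = w_train.copy()
w_shift[: d // 2] *= -1

X_test = rng.standard_normal((1000, d))
risk_iid = np.mean((X_test @ w_hat - X_test @ w_train) ** 2)
risk_shift = np.mean((X_test @ w_hat - X_test @ w_shift) ** 2)
```

The shifted-teacher risk is dominated by the mismatch term between the fitted weights and the new input-label rule, which no amount of training data can remove.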
Selection pressure/Noise driven cooperative behaviour in the thermodynamic limit of repeated games
Consider the scenario where an infinite number of players (i.e., the \textit{thermodynamic} limit) find themselves in a Prisoner's Dilemma-type situation, in a \textit{repeated} setting. Is it reasonable to anticipate that, in these circumstances, cooperation will emerge? This paper addresses this question by examining the emergence of cooperative behaviour, in the presence of \textit{noise} (or, under \textit{selection pressure}), in repeated Prisoner's Dilemma games, involving strategies such as \textit{Tit-for-Tat}, \textit{Always Defect}, \textit{GRIM}, \textit{Win-Stay, Lose-Shift}, and others. To analyze these games, we employ a numerical Agent-Based Model (ABM) and compare it with the analytical Nash Equilibrium Mapping (NEM) technique, both based on the \textit{1D}-Ising chain. We use \textit{game magnetization} as an indicator of cooperative behaviour. A significant finding is that for some repeated games, a discontinuity in the game magnetization indicates a \textit{first}-order \textit{selection pressure/noise}-driven phase transition. The phase transition is particular to strategies where players do not severely punish a single defection. We also observe that in these particular cases, the phase transition critically depends on the number of \textit{rounds} the game is played in the thermodynamic limit. For all five games, we find that both ABM and NEM, in conjunction with game magnetization, provide crucial inputs on how cooperative behaviour can emerge in an infinite-player repeated Prisoner's Dilemma game.
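A minimal sketch of the ABM ingredients, assuming a 1D chain, a one-shot Prisoner's Dilemma payoff matrix with T > R > P > S, and a Fermi/logit update rule (the paper's actual model uses repeated-game strategies, so this only illustrates how game magnetization tracks the noise level):

```python
import numpy as np

# One-shot Prisoner's Dilemma payoffs (T > R > P > S); values are illustrative
def pd_payoff(s_self, s_other, R=3.0, S=0.0, T=5.0, P=1.0):
    """Payoff to s_self (+1 = cooperate, -1 = defect) against s_other."""
    if s_self == 1:
        return R if s_other == 1 else S
    return T if s_other == 1 else P

def run_abm(N=500, beta=5.0, steps=20_000, seed=0):
    """Agent-based model on a 1D chain with a Fermi (logit) update rule.
    beta plays the role of inverse noise / selection pressure.
    Returns the game magnetization m = (n_C - n_D) / N."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=N)
    for _ in range(steps):
        i = int(rng.integers(N))
        j = (i + rng.choice([-1, 1])) % N   # random nearest neighbor on the chain
        gain = pd_payoff(-s[i], s[j]) - pd_payoff(s[i], s[j])
        if rng.random() < 1.0 / (1.0 + np.exp(-beta * gain)):
            s[i] = -s[i]                     # switch strategy
    return s.mean()
```

At low noise (large beta) the one-shot game relaxes to all-defect, m near -1, while at high noise (beta near 0) strategies randomize and m stays near 0; richer strategies like Tit-for-Tat change this picture, which is the regime the paper studies.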
From Empirical Observations to Universality: Dynamics of Deep Learning with Inputs Built on Gaussian mixture
This study broadens the scope of theoretical frameworks in deep learning by examining the dynamics of neural networks whose inputs exhibit the structural characteristics of a Gaussian Mixture (GM). We analyzed how the dynamics of neural networks under GM-structured inputs diverge from the predictions of conventional theories based on simple Gaussian structures. A key finding of our work is the observed convergence of neural network dynamics towards conventional theory even with standardized GM inputs, highlighting an unexpected universality. We found that standardization, especially in conjunction with certain nonlinear functions, plays a critical role in this phenomenon. Consequently, despite the complex and varied nature of GM distributions, we demonstrate that neural networks exhibit asymptotic behaviors in line with predictions under simple Gaussian frameworks.
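A short numerical illustration of why standardization matters (the mixture parameters below are assumptions, not the paper's setup): standardizing GM inputs matches their first two moments to a standard Gaussian, while higher moments can still differ, so convergence to the Gaussian theory is a statement about the dynamics rather than about the input distribution itself:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 100

# Balanced two-component Gaussian mixture: centers at +1 and -1 in every coordinate
labels = rng.integers(2, size=n)
X = np.where(labels[:, None] == 0, 1.0, -1.0) + rng.standard_normal((n, d))

# Standardize every input coordinate to zero mean and unit variance
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# After standardization the first two moments match a standard Gaussian,
# but the fourth moment of each coordinate is about 2.5 here instead of 3,
# so the standardized data is still visibly non-Gaussian
m4 = np.mean(Xs[:, 0] ** 4)
```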
Game susceptibility, Correlation and Payoff capacity as a measure of Cooperative behavior in the thermodynamic limit of some Social dilemmas
Analytically, finding the origins of cooperative behavior in infinite-player games is an exciting topic of current interest. Previously, cooperative behavior has been studied by considering game magnetization and individual player's average payoff as indicators. This paper shows that game susceptibility, correlation, and payoff capacity can aid in understanding cooperative behavior in social dilemmas in the thermodynamic limit. In this paper, we compare three analytical methods, i.e., Nash equilibrium mapping (NEM), Darwinian selection (DS), and Aggregate selection (AS), with a numerical agent-based method (ABM) via the game susceptibility, correlation, and payoff capacity as indicators of cooperative behavior. AS and DS fail compared to NEM and ABM by giving incorrect results for the indicators in question. The results obtained via NEM and ABM are in good agreement for all three indicators in question, for both Hawk-Dove and the Public goods games. After comparing the results obtained for all five indicators, we see that individual players' average payoff and payoff capacity are the best indicators to study cooperative behavior among players in the thermodynamic limit. This paper finds that NEM and ABM, along with the selected indicators, offer valuable insights into cooperative behavior in infinite-player games, contributing to understanding social dilemmas in the thermodynamic limit.
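A minimal sketch of the NEM side of such a comparison, assuming the standard transfer-matrix result for the 1D Ising chain (the mapping from game payoffs to the couplings J and h is omitted here, and all parameter values are illustrative):

```python
import numpy as np

def game_magnetization(beta, J, h):
    """Magnetization of the 1D Ising chain (transfer-matrix result),
    which NEM identifies with the game magnetization.
    Hamiltonian: H = -J sum_i s_i s_{i+1} - h sum_i s_i; beta = inverse noise."""
    sh = np.sinh(beta * h)
    return sh / np.sqrt(sh**2 + np.exp(-4.0 * beta * J))

def game_susceptibility(beta, J, h, eps=1e-6):
    """Game susceptibility dm/dh via a central finite difference."""
    return (game_magnetization(beta, J, h + eps)
            - game_magnetization(beta, J, h - eps)) / (2.0 * eps)
```

At zero field the susceptibility reduces to beta * exp(2 * beta * J), so its growth with beta probes how strongly small payoff biases tip the population toward one strategy.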
Unsupervised and Supervised learning by Dense Associative Memory under replica symmetry breaking
Albanese, Linda, Alessandrelli, Andrea, Annibale, Alessia, Barra, Adriano
Statistical mechanics of spin glasses is one of the main strands toward a comprehension of information processing by neural networks and learning machines. Within this approach, at the fairly standard replica symmetric level of description, Hebbian attractor networks with multi-node interactions (often called Dense Associative Memories) have recently been shown to outperform their classical pairwise counterparts in a number of tasks, from their robustness against adversarial attacks and their capability to work with prohibitively weak signals to their supra-linear storage capacities. Focusing on mathematical techniques more than computational aspects, in this paper we relax the replica symmetric assumption and derive the one-step broken-replica-symmetry picture of supervised and unsupervised learning protocols for these Dense Associative Memories: a phase diagram in the space of the control parameters is achieved, independently, both via Parisi's hierarchy within the replica trick and via Guerra's telescope within the broken-replica interpolation. Further, an explicit analytical investigation is provided to deepen both the big-data and ground-state limits of these networks, as well as a proof that replica symmetry breaking does not alter the thresholds for learning and slightly increases the maximal storage capacity. Finally, the De Almeida-Thouless line, depicting the onset of instability of a replica symmetric description, is also analytically derived, highlighting how, once this boundary is crossed, the broken-replica description should be preferred.
Dense Hebbian neural networks: a replica symmetric picture of unsupervised learning
Agliari, Elena, Albanese, Linda, Alemanno, Francesco, Alessandrelli, Andrea, Barra, Adriano, Giannotti, Fosca, Lotito, Daniele, Pedreschi, Dino
We consider dense associative neural networks trained with no supervision and investigate their computational capabilities analytically, via a statistical-mechanics approach, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing their performance as a function of control parameters such as the quality and quantity of the training dataset and the network storage, valid in the limit of large network size and structureless datasets. Moreover, we establish a bridge between macroscopic observables standardly used in statistical mechanics and loss functions typically used in machine learning. As technical remarks, on the analytic side, we implement large deviations and stability analysis within Guerra's interpolation to tackle the non-Gaussian distributions involved in the post-synaptic potentials, while, on the computational side, we insert the Plefka approximation in the Monte Carlo scheme to speed up the evaluation of the synaptic tensors, overall obtaining a novel and broad approach to investigate neural networks in general.
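A minimal Monte Carlo sketch in the spirit of the numerical part (plain Glauber dynamics for a p-body Hebbian energy, without the Plefka speed-up used in the paper; the pattern sizes, p = 4, and temperature below are illustrative assumptions):

```python
import numpy as np

def local_field(s, patterns, i, p=4):
    """Local field on spin i for a p-body (dense) Hebbian energy
    H = -(N/p) * sum_mu m_mu^p, with Mattis overlaps m_mu = xi_mu . s / N,
    which gives h_i = sum_mu xi_mu_i * m_mu^(p-1)."""
    overlaps = patterns @ s / len(s)
    return patterns[:, i] @ overlaps ** (p - 1)

def glauber_retrieve(patterns, s_init, beta=10.0, sweeps=20, seed=0):
    """Heat-bath (Glauber) dynamics at inverse temperature beta."""
    rng = np.random.default_rng(seed)
    s = s_init.copy()
    N = len(s)
    for _ in range(sweeps * N):
        i = int(rng.integers(N))
        h = local_field(s, patterns, i)
        s[i] = 1 if rng.random() < 1.0 / (1.0 + np.exp(-2.0 * beta * h)) else -1
    return s

rng = np.random.default_rng(1)
N, K = 200, 5
patterns = rng.choice([-1, 1], size=(K, N))

# start from pattern 0 with roughly 20% of spins flipped
s0 = patterns[0].copy()
s0[rng.random(N) < 0.2] *= -1
m_init = patterns[0] @ s0 / N

s_final = glauber_retrieve(patterns, s0, beta=10.0, sweeps=20)
m_final = patterns[0] @ s_final / N
```

The Mattis overlap `m_final` climbing well above its corrupted starting value is the retrieval transition that the phase diagram tracks as storage and dataset quality vary.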