renormalization
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Sleep-Based Homeostatic Regularization for Stabilizing Spike-Timing-Dependent Plasticity in Recurrent Spiking Neural Networks
Massey, Andreas, Hubin, Aliaksandr, Nichele, Stefano, Sæbø, Solve
Spike-timing-dependent plasticity (STDP) provides a biologically plausible learning mechanism for spiking neural networks (SNNs); however, Hebbian weight updates in architectures with recurrent connections suffer from pathological weight dynamics: unbounded growth, catastrophic forgetting, and loss of representational diversity. We propose a neuromorphic regularization scheme inspired by the synaptic homeostasis hypothesis: periodic offline phases during which external inputs are suppressed, synaptic weights undergo stochastic decay toward a homeostatic baseline, and spontaneous activity enables memory consolidation. We demonstrate that this sleep-wake cycle prevents weight saturation while preserving learned structure. Empirically, we find that low to intermediate sleep durations (10-20\% of training) improve stability on MNIST-like benchmarks in our STDP-SNN model, without any data-specific hyperparameter tuning. In contrast, the same sleep intervention yields no measurable benefit for the surrogate-gradient spiking neural network (SG-SNN). Taken together, these results suggest that periodic, sleep-based renormalization may represent a fundamental mechanism for stabilizing local Hebbian learning in neuromorphic systems, while also indicating that special care is required when integrating such protocols with existing gradient-based optimization methods.
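A minimal sketch of the sleep-phase update described in the abstract, assuming a simple linear stochastic decay toward the homeostatic baseline with Gaussian noise standing in for spontaneous activity (the paper's exact update rule, baseline, and schedule are not given here):

```python
import numpy as np

def sleep_phase(weights, baseline=0.5, decay_rate=0.1, noise_std=0.01,
                steps=100, rng=None):
    """One offline 'sleep' phase: external input is suppressed and each
    synaptic weight stochastically decays toward a homeostatic baseline.
    Hypothetical illustration, not the authors' exact rule."""
    rng = np.random.default_rng() if rng is None else rng
    w = weights.copy()
    for _ in range(steps):
        # Pull each weight toward the baseline; the noise term stands in
        # for weight fluctuations driven by spontaneous activity.
        w += -decay_rate * (w - baseline) + rng.normal(0.0, noise_std, w.shape)
    return w

# Weights saturated near 1.0 by unbounded Hebbian (STDP) growth...
rng = np.random.default_rng(0)
w_awake = np.clip(rng.normal(0.9, 0.05, (64, 64)), 0.0, 1.0)
w_rested = sleep_phase(w_awake, rng=rng)
print(w_awake.mean(), "->", w_rested.mean())  # ...relax toward the baseline
```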
Correcting Mean Bias in Text Embeddings: A Refined Renormalization with Training-Free Improvements on MMTEB
Ren, Xingyu, Sun, Youran, Liang, Haoyu
We find that current text embedding models produce outputs with a consistent bias, i.e., each embedding vector $e$ can be decomposed as $\tilde{e} + \mu$, where $\mu$ is almost identical across all sentences. We propose a plug-and-play, training-free, and lightweight solution called Renormalization. Through extensive experiments, we show that renormalization consistently and statistically significantly improves the performance of existing models on the Massive Multilingual Text Embedding Benchmark (MMTEB). In particular, across 38 models, renormalization improves performance by 9.7 $\sigma$ on retrieval tasks, 3.1 $\sigma$ on classification tasks, and 0.8 $\sigma$ on other types of tasks. Renormalization has two variants: directly subtracting $\mu$ from $e$, or subtracting the projection of $e$ onto $\mu$. We theoretically predict that the latter performs better, and our experiments confirm this prediction.
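Both variants are simple enough to state in a few lines. A training-free sketch, assuming the shared bias $\mu$ is estimated as the empirical mean of the embedding matrix (the abstract does not say how $\mu$ is obtained):

```python
import numpy as np

def renormalize(E, mode="project"):
    """Remove the shared bias component from embeddings E (one row per
    sentence), following the two variants in the abstract. Estimating
    mu as the empirical mean is an assumption."""
    mu = E.mean(axis=0)                      # estimate of the shared bias mu
    if mode == "subtract":
        return E - mu                        # variant 1: e - mu
    if mode == "project":
        u = mu / np.linalg.norm(mu)          # unit vector along mu
        return E - np.outer(E @ u, u)        # variant 2: e - proj_mu(e)
    raise ValueError(mode)

E = np.random.default_rng(0).normal(size=(1000, 384)) + 0.3   # biased embeddings
print(np.linalg.norm(E.mean(axis=0)),                         # large bias norm
      np.linalg.norm(renormalize(E).mean(axis=0)))            # ~0 afterward
```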
Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking
Chaichana, Yuatyong, Trachu, Thanapat, Limkonchotiwat, Peerat, Preechakul, Konpat, Khandhawit, Tirasan, Chuangsuwanich, Ekapol
In the era of large-scale training, model merging has evolved into a tool for creating multitasking models efficiently. It enables the knowledge of multiple models to be fused without the heavy computation required by traditional multitask learning. Existing merging methods often assume that entries at identical positions in weight matrices serve the same function, enabling straightforward entry-wise comparison and merging. However, this assumption overlooks the complexity of finetuned neural networks, where neurons may develop distinct feature compositions, making direct entry-wise merging problematic. We present Decom-Renorm-Merge (DRM), a simple yet effective approach that leverages Singular Value Decomposition to decompose and coordinate weight matrices into an aligned joint space, where entry-wise merging becomes possible. We showcase the effectiveness of DRM across settings ranging from smaller encoder-based models such as ViT and DeBERTa, through encoder-decoder models such as T5, to larger decoder-based models such as Llama3.1-8B. Our experimental results show that DRM outperforms several state-of-the-art merging techniques across full finetuning and low-rank adaptation settings. Moreover, our analysis reveals renormalization as the crucial component for creating a robust and even joint space for merging, significantly contributing to the method's performance.
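The abstract does not spell out DRM's exact construction, so the following is only a schematic illustration of the decompose / renormalize / merge pattern: build a joint space from the SVD of stacked task-specific weight updates, rescale coordinates there, and merge entry-wise. Every concrete choice below (stacking, rank, RMS rescaling, averaging) is an assumption:

```python
import numpy as np

def drm_style_merge(deltas, rank=64):
    """Schematic decompose-renormalize-merge over task weight updates.
    Illustrative only; not the authors' algorithm."""
    # Decompose: a joint basis from the SVD of column-stacked task deltas.
    U, _, _ = np.linalg.svd(np.concatenate(deltas, axis=1), full_matrices=False)
    U = U[:, :rank]
    # Coordinates of each task's update in the joint space.
    coords = [U.T @ d for d in deltas]
    # Renormalize: unit-RMS rows so no task dominates a joint direction.
    normed = [c / (np.sqrt((c ** 2).mean(axis=1, keepdims=True)) + 1e-8)
              for c in coords]
    # Merge entry-wise in the aligned space, then map back to weight space.
    return U @ np.mean(normed, axis=0)

rng = np.random.default_rng(0)
d1, d2 = rng.normal(size=(256, 256)), rng.normal(size=(256, 256))
print(drm_style_merge([d1, d2], rank=32).shape)  # (256, 256)
```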
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (6 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Spectral functions in Minkowski quantum electrodynamics from neural reconstruction: Benchmarking against dispersive Dyson--Schwinger integral equations
A Minkowskian physics-informed neural network approach (M-PINN) is formulated to solve the Dyson--Schwinger integral equations (DSE) of quantum electrodynamics (QED) directly in Minkowski spacetime. Our novel strategy merges two complementary approaches: (i) a dispersive solver based on Lehmann representations and subtracted dispersion relations, and (ii) an M-PINN that learns the fermion mass function $B(p^2)$ under the same truncation and renormalization configuration (quenched, rainbow, Landau gauge), with a loss that combines the DSE residual with multi-scale regularization and monotonicity/smoothing penalties in the spacelike branch, following our previous work in Euclidean space. The benchmarks show quantitative agreement from the infrared (IR) to the ultraviolet (UV) scales in both on-shell and momentum-subtraction schemes. In this controlled setting, our M-PINN reproduces the dispersive solution whilst remaining computationally compact and differentiable, paving the way for extensions with realistic vertices, unquenching effects, and uncertainty-aware variants.
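As a rough illustration of the loss structure named in the abstract (DSE residual plus penalties in the spacelike branch), here is a toy PINN sketch. The MLP architecture, the residual stand-in, the penalty weight, and the sign convention for monotonicity are all assumptions; the actual DSE kernel and renormalization conditions are not reproduced:

```python
import torch
import torch.nn as nn

class MassFunction(nn.Module):
    """Small MLP stand-in for the fermion mass function B(p^2)."""
    def __init__(self, width=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, width), nn.Tanh(),
                                 nn.Linear(width, width), nn.Tanh(),
                                 nn.Linear(width, 1))

    def forward(self, p2):
        return self.net(p2)

def pinn_loss(model, p2, dse_residual_fn, lam_mono=1.0):
    """Composite loss: squared DSE residual plus a monotonicity penalty on
    the spacelike branch; dse_residual_fn stands for the truncated DSE."""
    loss_dse = dse_residual_fn(p2, model(p2)).pow(2).mean()
    spacelike = p2[p2 < 0].reshape(-1, 1).requires_grad_(True)
    dB = torch.autograd.grad(model(spacelike).sum(), spacelike,
                             create_graph=True)[0]
    loss_mono = torch.relu(dB).pow(2).mean()  # sign convention is an assumption
    return loss_dse + lam_mono * loss_mono

p2 = torch.linspace(-4.0, 4.0, 200).reshape(-1, 1)
toy_residual = lambda p2, B: B - 1.0 / (1.0 + p2.abs())  # toy stand-in, not QED
print(pinn_loss(MassFunction(), p2, toy_residual).item())
```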
- North America > United States (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
Viability of perturbative expansion for quantum field theories on neurons
Accelerated progress in machine learning (ML) over the past decade has had significant impact across many research domains, including physics, and has motivated substantial interdisciplinary work. At the intersection of physics and machine learning, two prominent practical questions have emerged: (1) Can techniques from statistical mechanics and the path integral formulation of quantum field theory (QFT) help us build a theoretical understanding of how neural networks learn? (2) Can neural networks be used to facilitate computations in quantum field theory? These two questions are deeply interrelated and motivate the questions we explore in this work. The second question itself splits naturally into two subcategories: (a) applied machine learning for physics problems, and (b) the theoretical interplay between machine learning and QFT techniques. The application of ML to physics problems has already seen considerable progress.
- North America > United States > South Dakota > Clay County > Vermillion (0.14)
- North America > United States > Iowa > Story County > Ames (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Diffusion-Guided Renormalization of Neural Systems via Tensor Networks
Far from equilibrium, neural systems self-organize across multiple scales. Exploiting multiscale self-organization in neuroscience and artificial intelligence requires a computational framework for modeling the effective non-equilibrium dynamics of stochastic neural trajectories. Non-equilibrium thermodynamics and representational geometry offer theoretical foundations, but we need scalable data-driven techniques for modeling collective properties of high-dimensional neural networks from partial, subsampled observations. Renormalization is a coarse-graining technique central to studying emergent scaling properties of many-body and nonlinear dynamical systems. Although renormalization is widely applied in physics and machine learning, coarse-graining complex dynamical networks remains an open problem that affects many computational sciences. Recent diffusion-based renormalization, inspired by quantum statistical mechanics, coarse-grains networks near entropy transitions marked by maximal changes in specific heat or information transmission. Here I explore diffusion-based renormalization of neural systems by generating symmetry-breaking representations across scales and offering scalable algorithms using tensor networks. Diffusion-guided renormalization bridges microscale and mesoscale dynamics of dissipative neural systems. For microscales, I developed a scalable graph inference algorithm for discovering community structure from subsampled neural activity. Using community-based node orderings, diffusion-guided renormalization generates renormalization group flow through metagraphs and joint probability functions. Toward mesoscales, diffusion-guided renormalization targets learning the effective non-equilibrium dynamics of dissipative neural trajectories occupying lower-dimensional subspaces, enabling coarse-to-fine control in systems neuroscience and artificial intelligence.
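To make the coarse-graining step concrete, here is a generic toy: diffuse a heat kernel over a graph Laplacian, group nodes with similar diffusion profiles, and aggregate edge weights into a metagraph. This is a textbook-style sketch of diffusion-based coarse-graining, not the specific algorithms (graph inference, tensor-network flows) developed in this work:

```python
import numpy as np

def diffusion_coarse_grain(A, tau=1.0, n_groups=4, seed=0):
    """Coarse-grain a weighted adjacency matrix A via heat-kernel diffusion.
    Illustrative stand-in for diffusion-guided renormalization."""
    L = np.diag(A.sum(axis=1)) - A                       # graph Laplacian
    evals, evecs = np.linalg.eigh(L)
    K = evecs @ np.diag(np.exp(-tau * evals)) @ evecs.T  # heat kernel exp(-tau L)
    # Group nodes by nearest diffusion profile (k-means would also work).
    rng = np.random.default_rng(seed)
    centers = K[rng.choice(len(A), n_groups, replace=False)]
    labels = np.argmin(((K[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
    P = np.eye(n_groups)[labels]                         # one-hot memberships
    return P.T @ A @ P                                   # metagraph adjacency

rng = np.random.default_rng(1)
A = rng.random((32, 32)); A = (A + A.T) / 2
print(diffusion_coarse_grain(A).shape)  # (4, 4) coarse-grained network
```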
- North America > United States (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
SPICED: A Synaptic Homeostasis-Inspired Framework for Unsupervised Continual EEG Decoding
Zhou, Yangxuan, Zhao, Sha, Wang, Jiquan, Jiang, Haiteng, Li, Shijian, Li, Tao, Pan, Gang
The human brain achieves a dynamic stability-plasticity balance through synaptic homeostasis. Inspired by this biological principle, we propose SPICED: a neuromorphic framework that integrates the synaptic homeostasis mechanism for unsupervised continual EEG decoding, particularly addressing practical scenarios where new individuals with inter-individual variability emerge continually. SPICED comprises a novel synaptic network that enables dynamic expansion during continual adaptation through three bio-inspired neural mechanisms: (1) critical memory reactivation; (2) synaptic consolidation; and (3) synaptic renormalization. The interplay of these mechanisms within synaptic homeostasis dynamically strengthens task-discriminative memory traces and weakens detrimental ones. By integrating these mechanisms with a continual learning system, SPICED preferentially replays task-discriminative memory traces that exhibit strong associations with newly emerging individuals, thereby achieving robust adaptation. Meanwhile, SPICED effectively mitigates catastrophic forgetting by suppressing the replay prioritization of detrimental memories during long-term continual learning. Validated on three EEG datasets, SPICED demonstrates its effectiveness.
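A toy rendering of the three mechanisms, with memory traces held as feature vectors and scalar strengths; the data structures, similarity measure, and scaling rules are illustrative assumptions, not SPICED's implementation:

```python
import numpy as np

class SynapticMemory:
    """Toy sketch of reactivation, consolidation, and renormalization."""
    def __init__(self):
        self.traces, self.strengths = [], []          # vectors, scalar weights

    def add(self, feat):
        self.traces.append(feat); self.strengths.append(1.0)

    def adapt(self, new_feat, k=3, boost=0.5, decay=0.9):
        T, s = np.stack(self.traces), np.asarray(self.strengths)
        # (1) Critical memory reactivation: replay the traces most
        # associated (here: cosine-similar) with the new individual.
        sim = T @ new_feat / (np.linalg.norm(T, axis=1) * np.linalg.norm(new_feat))
        replayed = np.argsort(sim * s)[-k:]
        # (2) Synaptic consolidation: strengthen the replayed traces.
        s[replayed] += boost
        # (3) Synaptic renormalization: scale all strengths down globally,
        # suppressing rarely reactivated (detrimental) memories over time.
        self.strengths = list(s * decay)
        return replayed

rng = np.random.default_rng(0)
mem = SynapticMemory()
for _ in range(10):
    mem.add(rng.normal(size=8))
print(mem.adapt(rng.normal(size=8)))  # indices of replayed memory traces
```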
- North America > United States (0.14)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Orthogonal Gradient Descent Improves Neural Calibration
We provide evidence that orthogonalizing gradients during training improves model calibration without sacrificing accuracy. On CIFAR-10 with 10\% labeled data, $\perp$Grad matches SGD in accuracy but yields consistently improved calibration metrics such as lower test loss, reduced softmax overconfidence, and higher predictive entropy. These benefits persist under input corruption (CIFAR-10C) and extended training, where $\perp$Grad models degrade more gracefully than SGD-trained counterparts. $\perp$Grad is optimizer-agnostic, incurs minimal overhead, and works well with post-hoc calibration techniques like temperature scaling. Theoretically, we prove convergence of a simplified version of $\perp$Grad under mild assumptions and characterize its stationary points in positive homogeneous networks: $\perp$Grad converges to solutions where further loss reduction requires confidence scaling rather than decision boundary improvement. Code for this paper can be found at: https://github.com/evanshedges2/orthograd_improves_calibration.
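One common way to realize gradient orthogonalization is to remove, for each parameter tensor, the gradient component parallel to the parameter itself. Whether this matches the paper's exact definition of ⊥Grad is an assumption; see the linked repository for the authors' implementation:

```python
import torch

class PerpGrad(torch.optim.SGD):
    """SGD that orthogonalizes each parameter's gradient against the
    parameter before stepping. Sketch of one common construction; the
    paper's exact definition may differ."""
    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                w, g = p.view(-1), p.grad.view(-1)
                # g <- g - (g.w / w.w) w : remove the component of g parallel
                # to w, so the step changes direction rather than scale.
                g.sub_(torch.dot(g, w) / (torch.dot(w, w) + 1e-12) * w)
        return super().step(closure)

model = torch.nn.Linear(10, 2)
opt = PerpGrad(model.parameters(), lr=0.1)
loss = model(torch.randn(4, 10)).pow(2).mean()
loss.backward(); opt.step(); opt.zero_grad()
```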
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)