renormalization group
Hierarchical Maximum Entropy via the Renormalization Group
Hierarchical structures, which include multiple levels, are prevalent in statistical and machine-learning models as well as physical systems. Extending the foundational result that the maximum entropy distribution under mean constraints is given by the exponential Gibbs-Boltzmann form, we introduce the framework of "hierarchical maximum entropy" to address these multilevel models. We demonstrate that Pareto optimal distributions, which maximize entropies across all levels of hierarchical transformations, can be obtained via renormalization-group procedures from theoretical physics. This is achieved by formulating multilevel extensions of the Gibbs variational principle and the Donsker-Varadhan variational representation of entropy. Moreover, we explore settings with hierarchical invariances that significantly simplify the renormalization-group procedures, enhancing computational efficiency: quadratic modular loss functions, logarithmic loss functions, and nearest-neighbor loss functions. This is accomplished through the introduction of the concept of parameter flows, which serves as an analog to renormalization flows in renormalization group theory. This work connects ideas from probability theory, information theory, and statistical mechanics.
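For orientation, the single-level result that the abstract extends can be sketched as follows. The symbols $f_i$ (constrained observables), $\mu_i$ (their prescribed means), and $\lambda_0, \lambda_i$ (Lagrange multipliers) are notation assumed here for illustration, not taken from the paper. Maximizing the Shannon entropy subject to normalization and mean constraints,

$$
\max_{p}\; S[p] = -\sum_x p(x)\ln p(x)
\quad\text{subject to}\quad
\sum_x p(x) = 1,\qquad \sum_x p(x)\, f_i(x) = \mu_i ,
$$

leads to the Lagrangian

$$
\mathcal{L}[p] = -\sum_x p(x)\ln p(x) - \lambda_0\Big(\sum_x p(x) - 1\Big) - \sum_i \lambda_i\Big(\sum_x p(x)\, f_i(x) - \mu_i\Big),
$$

whose stationarity condition $\partial\mathcal{L}/\partial p(x) = -\ln p(x) - 1 - \lambda_0 - \sum_i \lambda_i f_i(x) = 0$ gives the Gibbs-Boltzmann form

$$
p(x) = \frac{1}{Z}\exp\Big(-\sum_i \lambda_i f_i(x)\Big),
\qquad
Z = \sum_x \exp\Big(-\sum_i \lambda_i f_i(x)\Big),
$$

with the $\lambda_i$ fixed by the mean constraints.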
Dreaming up scale invariance via inverse renormalization group
Rançon, Adam, Rançon, Ulysse, Ivek, Tomislav, Balog, Ivan
We explore how minimal neural networks can invert the renormalization group (RG) coarse-graining procedure in the two-dimensional Ising model, effectively "dreaming up" microscopic configurations from coarse-grained states. This task, formally impossible at the level of configurations, can be approached probabilistically, allowing machine learning models to reconstruct scale-invariant distributions without relying on microscopic input. We demonstrate that even neural networks with as few as three trainable parameters can learn to generate critical configurations, reproducing the scaling behavior of observables such as magnetic susceptibility, heat capacity, and Binder ratios. A real-space renormalization group analysis of the generated configurations confirms that the models capture not only scale invariance but also reproduce nontrivial eigenvalues of the RG transformation. Surprisingly, we find that increasing network complexity by introducing multiple layers offers no significant benefit. These findings suggest that simple local rules, akin to those generating fractal structures, are sufficient to encode the universality of critical phenomena, opening the door to efficient generative models of statistical ensembles in physics.
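To make the coarse-graining direction concrete, here is a minimal sketch of a majority-rule block-spin transformation and a deliberately naive stochastic "inverse". This is not the authors' network: the function names, the 2x2 block size, and the `flip_prob` parameter are assumptions chosen for illustration, and the naive inverse only shows why the task has to be posed probabilistically (many fine configurations map to the same coarse one).

```python
import random


def coarse_grain(spins, b=2, rng=random):
    """Majority-rule block-spin map on an L x L lattice of +/-1 spins.

    Each b x b block is replaced by the sign of its total spin;
    ties are broken at random.
    """
    size = len(spins)
    assert size % b == 0, "lattice size must be divisible by the block size"
    coarse = []
    for i in range(0, size, b):
        row = []
        for j in range(0, size, b):
            total = sum(spins[i + di][j + dj]
                        for di in range(b) for dj in range(b))
            if total == 0:
                row.append(rng.choice((-1, 1)))
            else:
                row.append(1 if total > 0 else -1)
        coarse.append(row)
    return coarse


def naive_inverse(coarse, b=2, flip_prob=0.1, rng=random):
    """Hypothetical local upsampling rule (illustration only).

    Every coarse spin is copied into a b x b block, then each fine spin
    is flipped independently with probability flip_prob.
    """
    size = len(coarse) * b
    fine = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            s = coarse[i // b][j // b]
            if rng.random() < flip_prob:
                s = -s
            fine[i][j] = s
    return fine
```

A learned inverse would replace the independent flips in `naive_inverse` with conditional probabilities that depend on neighboring coarse spins; the sketch leaves that part out.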
Multilevel Generative Samplers for Investigating Critical Phenomena
Singha, Ankur, Cellini, Elia, Nicoli, Kim A., Jansen, Karl, Kühn, Stefan, Nakajima, Shinichi
Investigating critical phenomena or phase transitions is of high interest in physics and chemistry, for which Monte Carlo (MC) simulations, a crucial tool for numerically analyzing macroscopic properties of given systems, are often hindered by an emerging divergence of correlation length -- known as scale invariance at criticality (SIC) in the renormalization group theory. SIC causes the system to behave the same at any length scale, from which many existing sampling methods suffer: long-range correlations cause critical slowing down in Markov chain Monte Carlo (MCMC), and require intractably large receptive fields for generative samplers. In this paper, we propose a Renormalization-informed Generative Critical Sampler (RiGCS) -- a novel sampler specialized for near-critical systems, where SIC is leveraged as an advantage rather than a nuisance. Specifically, RiGCS builds on MultiLevel Monte Carlo (MLMC) with Heat Bath (HB) algorithms, which perform ancestral sampling from low-resolution to high-resolution lattice configurations with site-wise-independent conditional HB sampling. Although MLMC-HB is highly efficient under exact SIC, it suffers from a low acceptance rate under slight SIC violation. Notably, SIC violation always occurs in finite-size systems, and may induce long-range and higher-order interactions in the renormalized distributions, which are not considered by independent HB samplers. RiGCS enhances MLMC-HB by replacing a part of the conditional HB sampler with generative models that capture those residual interactions and improve the sampling efficiency. Our experiments show that the effective sample size of RiGCS is a few orders of magnitude higher than state-of-the-art generative model baselines in sampling configurations for 128x128 two-dimensional Ising systems.
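The site-wise heat-bath step mentioned above can be sketched for the standard nearest-neighbor Ising model with energy $-J\sum_{\langle ij\rangle} s_i s_j$ and no external field. This is a generic single-site heat-bath update on a periodic square lattice, not the RiGCS or MLMC-HB implementation; the function names and the periodic boundary choice are assumptions.

```python
import math
import random


def heat_bath_prob_up(neighbor_sum, beta, J=1.0):
    """Conditional probability that a spin is +1 given the sum of its neighbors."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * J * neighbor_sum))


def heat_bath_update(spins, i, j, beta, J=1.0, rng=random):
    """Resample spin (i, j) from its exact conditional distribution.

    spins is an L x L list of lists of +/-1 with periodic boundaries.
    The lattice is modified in place and the new spin value is returned.
    """
    size = len(spins)
    h = (spins[(i - 1) % size][j] + spins[(i + 1) % size][j]
         + spins[i][(j - 1) % size] + spins[i][(j + 1) % size])
    spins[i][j] = 1 if rng.random() < heat_bath_prob_up(h, beta, J) else -1
    return spins[i][j]
```

Because each update depends only on the four nearest neighbors, a sampler built from such steps cannot by itself represent the long-range or higher-order residual interactions that the abstract attributes to finite-size violations of scale invariance.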
GUD: Generation with Unified Diffusion
Gerdes, Mathis, Welling, Max, Cheng, Miranda C. N.
Diffusion generative models transform noise into data by inverting a process that progressively adds noise to data samples. Inspired by concepts from the renormalization group in physics, which analyzes systems across different scales, we revisit diffusion models by exploring three key design aspects: 1) the choice of representation in which the diffusion process operates (e.g. pixel-, PCA-, Fourier-, or wavelet-basis), 2) the prior distribution that data is transformed into during diffusion (e.g. Gaussian with covariance $\Sigma$), and 3) the scheduling of noise levels applied separately to different parts of the data, captured by a component-wise noise schedule. Incorporating the flexibility in these choices, we develop a unified framework for diffusion generative models with greatly enhanced design freedom. In particular, we introduce soft-conditioning models that smoothly interpolate between standard diffusion models and autoregressive models (in any basis), conceptually bridging these two approaches. Our framework opens up a wide design space which may lead to more efficient training and data generation, and paves the way to novel architectures integrating different generative approaches and generation tasks.
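The idea of a component-wise noise schedule can be illustrated with a small sketch. It assumes a variance-preserving forward process with a unit-variance Gaussian prior, $x_t^{(i)} = \alpha_i(t)\,x_0^{(i)} + \sqrt{1-\alpha_i(t)^2}\,\epsilon_i$, and a hypothetical cosine ramp for $\alpha_i$; the names `alpha`, `noise_components`, `starts`, and `width` are illustrative and not the paper's parameterization.

```python
import math
import random


def alpha(t, start, width):
    """Signal coefficient of one component (hypothetical cosine ramp).

    Equals 1 until time `start`, then decays to 0 at `start + width`.
    """
    u = min(1.0, max(0.0, (t - start) / width))
    return math.cos(0.5 * math.pi * u)


def noise_components(x0, t, starts, width, rng=random):
    """Apply the forward process at time t with a per-component schedule.

    x0 holds the data coefficients in the chosen basis (pixel, PCA, Fourier, ...);
    starts[i] is the time at which component i begins to be noised.
    """
    xt = []
    for x, start in zip(x0, starts):
        a = alpha(t, start, width)
        sigma = math.sqrt(max(0.0, 1.0 - a * a))
        xt.append(a * x + sigma * rng.gauss(0.0, 1.0))
    return xt
```

In this sketch, giving every component the same `start` noises all components together, as in a standard diffusion model, while staggering the `starts` so that the ramps barely overlap noises components one after another, which resembles an autoregressive ordering. Intermediate overlaps are one way to picture the smooth interpolation the abstract describes.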
Similarity Learning with neural networks
Sanfins, Gabriel, Ramos, Fabio, Naiff, Danilo
Understanding and predicting the behavior of complex physical systems is a cornerstone of scientific and engineering endeavors. In fluid mechanics, for instance, accurately simulating real operational conditions is essential for the design and optimization of pipelines, aerospace components, and various industrial processes. However, full-scale simulations of such systems are often prohibitively expensive and time-consuming due to the intricate dynamics and vast parameter spaces involved. This poses a significant challenge for researchers and engineers who seek to explore and optimize these systems efficiently. One promising approach to mitigate these challenges is the identification of scaling similarities and symmetry groups within physical systems. By uncovering the correct scaling relations, we can develop smaller, more manageable models that accurately capture the essential behavior of real-world scenarios. These scaled models not only reduce computational costs but also accelerate the design and testing processes by allowing for efficient exploration of the parameter space. Moreover, understanding these scaling laws deepens our insight into the fundamental principles governing these systems, enabling us to generalize findings from simplified models to full-scale applications with greater confidence. In recent years, the application of machine learning in fluid mechanics has been on the rise, offering innovative tools to address complex problems that are difficult to solve analytically.
MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data
Wang, Jian, Lan, Xin, Tian, Yuxin, Lv, Jiancheng
Generative adversarial networks (GANs) have made impressive advances in image generation, but they often require large-scale training data to avoid degradation caused by discriminator overfitting. To tackle this issue, we investigate the challenge of training GANs with limited data, and propose a novel regularization method based on the idea of the renormalization group (RG) in physics. We observe that in the limited data setting, the gradient pattern that the generator obtains from the discriminator becomes more aggregated over time. In the RG context, this aggregated pattern exhibits a high discrepancy from its coarse-grained versions, which implies a high-capacity and sensitive system, prone to overfitting and collapse. To address this problem, we introduce a multi-scale structural self-dissimilarity (MS$^3$D) regularization, which constrains the gradient field to have a consistent pattern across different scales, thereby fostering a more redundant and robust system. We show that our method can effectively enhance the performance and stability of GANs under limited data scenarios, and even allow them to generate high-quality images with very few data.
Generating configurations of increasing lattice size with machine learning and the inverse renormalization group
We review recent developments of machine learning algorithms pertinent to the inverse renormalization group, which was originally established as a generative numerical method by Ron-Swendsen-Brandt via the implementation of compatible Monte Carlo simulations [1]. Inverse renormalization group methods enable the iterative generation of configurations for increasing lattice size without the critical slowing down effect. We discuss the construction of inverse renormalization group transformations with the use of convolutional neural networks and present applications in models of statistical mechanics, lattice field theory, and disordered systems. We highlight the case of the three-dimensional Edwards-Anderson spin glass, where the inverse renormalization group can be employed to construct configurations for lattice volumes that have not yet been accessed by dedicated supercomputers.
Supervised Learning with Tensor Networks
Tensor networks are approximations of high-order tensors which are efficient to work with and have been very successful for physics and mathematics applications. We demonstrate how algorithms for optimizing tensor networks can be adapted to supervised learning tasks by using matrix product states (tensor trains) to parameterize non-linear kernel learning models. For the MNIST data set we obtain less than 1% test set classification error. We discuss an interpretation of the additional structure imparted by the tensor network to the learned model.
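As a rough illustration of how a matrix product state can parameterize a classifier, the sketch below maps each input value to a two-component local feature vector and contracts the chain left to right. The feature map $\phi(x) = (\cos(\pi x/2), \sin(\pi x/2))$, the placement of the label index on the last tensor, and all names are assumptions made for this example; the abstract itself does not specify them, and no training procedure is shown.

```python
import math


def feature(x):
    """Assumed local feature map for an input value x in [0, 1]."""
    return (math.cos(0.5 * math.pi * x), math.sin(0.5 * math.pi * x))


def mps_scores(tensors, pixels):
    """Contract an MPS with the product of local feature vectors.

    tensors[n][a][s][b] is the n-th core with left bond index a,
    physical index s in {0, 1}, and right bond index b. The first core
    has left bond dimension 1, and the right bond of the last core is
    used as the label index, so the result has one score per class.
    """
    vec = [1.0]  # trivial left boundary vector
    for core, x in zip(tensors, pixels):
        phi = feature(x)
        right_dim = len(core[0][0])
        new_vec = [0.0] * right_dim
        for a, left_value in enumerate(vec):
            for s in range(2):
                weight = left_value * phi[s]
                for b in range(right_dim):
                    new_vec[b] += weight * core[a][s][b]
        vec = new_vec
    return vec


def predict(tensors, pixels):
    """Return the index of the largest class score."""
    scores = mps_scores(tensors, pixels)
    return max(range(len(scores)), key=lambda label: scores[label])
```

The cost of one evaluation grows linearly with the number of inputs and polynomially with the bond dimension, which is what makes the tensor-train parameterization efficient to work with.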
Data driven modeling of self-similar dynamics
Tao, Ru-yi, Tao, Ning-ning, You, Yi-zhuang, Zhang, Jiang
Complex systems modeling is essential for understanding, predicting, and even controlling a complex system. Because such systems exhibit non-linearity, self-organization, emergence, and other complex behaviors, modeling them has always been challenging. In recent decades, data-driven approaches, led by machine learning, have shown significant advantages in many fields, which inspired us to improve the modeling of complex systems. On the other hand, self-similarity is a common feature of complex systems. From natural systems, such as the fractal structure of vegetation clusters in the Amazon rainforest and on the Tibetan plateau [1] and critical phenomena in atmospheric precipitation [2], to societal systems, such as network traffic [3] and avalanches of public opinion on social media [4], to neural systems, such as critical phenomena in the brain [5, 6, 7, 8], there is abundant evidence of scale-invariant properties in complex systems. Thus, we are motivated to integrate self-similarity as prior knowledge, aiming for data-driven multi-scale modeling of complex systems. Two aspects of modeling complex systems are network structure and dynamics.