Goto

Collaborating Authors

 ising


Performance Evaluation of Ising and QUBO Variable Encodings in Boltzmann Machine Learning

arXiv.org Artificial Intelligence

We compare Ising ({-1,+1}) and QUBO ({0,1}) encodings for Boltzmann machine learning under a controlled protocol that fixes the model, sampler, and step size. Exploiting the identity that the Fisher information matrix (FIM) equals the covariance of sufficient statistics, we visualize empirical moments from model samples and reveal systematic, representation-dependent differences. QUBO induces larger cross terms between first- and second-order statistics, creating more small-eigenvalue directions in the FIM and lowering spectral entropy. This ill-conditioning explains slower convergence under stochastic gradient descent (SGD). In contrast, natural gradient descent (NGD)-which rescales updates by the FIM metric-achieves similar convergence across encodings due to reparameterization invariance. Practically, for SGD-based training, the Ising encoding provides more isotropic curvature and faster convergence; for QUBO, centering/scaling or NGD-style preconditioning mitigates curvature pathologies. These results clarify how representation shapes information geometry and finite-time learning dynamics in Boltzmann machines and yield actionable guidelines for variable encoding and preprocessing.


Learning performance in inverse Ising problems with sparse teacher couplings

arXiv.org Machine Learning

In the teacher-student scenario under the assumption that the teacher's couplings are sparse and the student does not know the graphical structure, the learning curve and order parameters are assessed in the typical case using the replica and cavity methods from statistical mechanics. Our formulation is also applicable to a certain class of cost functions having locality; the standard likelihood does not belong to that class. The derived analytical formulas indicate that the perfect inference of the presence/absence of the teacher's couplings is possible in the thermodynamic limit taking the number of spins N as infinity while keeping the dataset size M proportional to N, as long as α M/N 2. Meanwhile, the formulas also show that the estimated coupling values corresponding to the truly existing ones in the teacher tend to be overestimated in the absolute value, manifesting the presence of estimation bias. These results are considered to be exact in the thermodynamic limit on locally treelike networks, such as the regular random or Erd os-R enyi graphs. Numerical simulation results fully support the theoretical predictions. Additional biases in the estimators on loopy graphs are also discussed. 1 Introduction Inference based on the classical Ising model is called the inverse Ising problem or Boltzmann machine learning, which is attracting more and more attention with the increasing interest in machine learning technologies.


From complex to simple : hierarchical free-energy landscape renormalized in deep neural networks

arXiv.org Machine Learning

We develop a statistical mechanical approach based on the replica method to study the solution space of deep neural networks. Specifically we analyze the configuration space of the synaptic weights in a simple feed-forward perceptron network within a Gaussian approximation for two scenarios : a setting with random inputs/outputs and a teacher-student setting. By increasing the strength of constraints, i. e. increasing the number of imposed patterns, successive 2nd order glass transition (random inputs/outputs) or 2nd order crystalline transition (teacher-student setting) take place place layer-by-layer starting next to the inputs/outputs boundaries going deeper into the bulk. For deep enough network the central part of the network remains in the liquid phase. We argue that in systems of finite width, weak bias field remain in the central part and plays the role of a symmetry breaking field which connects the opposite sides of the system. In the setting with random inputs/outputs, the successive glass transitions bring about a hierarchical free-energy landscape with ultra-metricity, which evolves in space: it is most complex close to the boundaries but becomes renormalized into progressively simpler one in deeper layers. These observations provide clues to understand why deep neural networks operate efficiently. Finally we present results of a set of numerical simulations to examine the theoretical predictions.