When the Left Foot Leads to the Right Path: Bridging Initial Prejudice and Trainability

Bassi, Alberto; Albert, Carlo; Lucchi, Aurelien; Baity-Jesi, Marco; Francazi, Emanuele

May-27-2025–arXiv.org Machine Learning

Understanding the statistical properties of deep neural networks (DNNs) at initialization is crucial for elucidating both their trainability and the intrinsic architectural biases they encode prior to data exposure. Mean-field (MF) analyses have demonstrated that the parameter distribution in randomly initialized networks dictates whether gradients vanish or explode. Concurrently, untrained DNNs were found to exhibit an initial-guessing bias (IGB), in which large regions of the input space are assigned to a single class. In this work, we prove a correspondence between IGB and previous MF theories, thereby connecting a network's prejudice toward specific classes with the conditions for fast and accurate learning. This connection yields a counter-intuitive conclusion: the initialization that optimizes trainability is necessarily biased rather than neutral. Furthermore, we extend the MF/IGB framework to multi-node activation functions, offering practical guidelines for designing initialization schemes that ensure stable optimization in architectures employing max- and average-pooling layers.
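The initial-guessing bias described in the abstract can be observed numerically. The following is a minimal sketch and not the authors' code: it assumes NumPy, a fully connected binary classifier with zero biases, i.i.d. standard Gaussian inputs, and weight variance `sigma_w2 / fan_in` (a He-style choice picked only for illustration). The function names `class0_fraction` and `mean_imbalance` are hypothetical. For each random initialization it measures the fraction of inputs assigned to class 0, then averages the distance of that fraction from 1/2 over several seeds.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def class0_fraction(activation, seed, depth=5, width=256, d_in=100,
                    n_inputs=1000, sigma_w2=2.0):
    """Fraction of random inputs that one untrained binary MLP assigns to class 0.

    Assumptions (illustrative only): Gaussian inputs, zero biases,
    weights drawn from N(0, sigma_w2 / fan_in).
    """
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((n_inputs, d_in))  # synthetic, unstructured inputs
    fan_in = d_in
    for _ in range(depth):
        W = rng.standard_normal((fan_in, width)) * np.sqrt(sigma_w2 / fan_in)
        h = activation(h @ W)
        fan_in = width
    W_out = rng.standard_normal((fan_in, 2)) * np.sqrt(sigma_w2 / fan_in)
    logits = h @ W_out
    return float(np.mean(np.argmax(logits, axis=1) == 0))


def mean_imbalance(activation, n_seeds=20, **kwargs):
    """Average |fraction - 1/2| over independent initializations (seeds 0..n_seeds-1)."""
    fracs = np.array([class0_fraction(activation, seed, **kwargs)
                      for seed in range(n_seeds)])
    return float(np.mean(np.abs(fracs - 0.5)))


if __name__ == "__main__":
    # An imbalance near 0 means class assignments are roughly even at
    # initialization; values approaching 0.5 mean most inputs go to one class.
    print("ReLU imbalance:", mean_imbalance(relu))
    print("tanh imbalance:", mean_imbalance(np.tanh))
```

In this setup the ReLU network, whose hidden activations have a nonzero mean, tends to show a markedly larger imbalance than the tanh network, whose activations are symmetric around zero. The sketch only illustrates the phenomenon; it does not reproduce the paper's mean-field analysis or its treatment of pooling layers.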

artificial intelligence, deep learning, machine learning


Country:
- North America
  - United States
    - Virginia (0.04)
    - California > Alameda County
      - Berkeley (0.04)
  - Canada > Ontario
    - Toronto (0.14)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Switzerland
    - Zürich > Zürich (0.14)
    - Basel-City > Basel (0.04)
    - Vaud > Lausanne (0.04)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
