Path-conditioned training: a principled way to rescale ReLU neural networks
Lebeurrier, Arthur, Vayer, Titouan, Gribonval, Rémi
Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two sets of weights related by a proper rescaling implement the same function, their training dynamics can be dramatically different. To offer a fresh perspective on exploiting this phenomenon, we build on the recent path-lifting framework, which provides a compact factorization of ReLU networks. We introduce a geometrically motivated criterion for rescaling neural network parameters, whose minimization leads to a conditioning strategy that aligns a kernel in the path-lifting space with a chosen reference. We derive an efficient algorithm to perform this alignment. In the context of random network initialization, we analyze how the architecture and the initialization scale jointly impact the output of the proposed method. Numerical experiments illustrate its potential to speed up training.
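The rescaling symmetry mentioned in the abstract can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it builds a small one-hidden-layer ReLU network in plain Python, multiplies each hidden neuron's incoming weights and bias by a positive factor, divides its outgoing weights by the same factor, and compares the two outputs. The names `forward`, `rescale`, and `lambdas`, the layer sizes, and the factors are all hypothetical choices made for this example.

```python
import random

def relu(v):
    return max(0.0, v)

def forward(W1, b1, W2, x):
    """One-hidden-layer ReLU network: W2 @ relu(W1 @ x + b1)."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

def rescale(W1, b1, W2, lambdas):
    """Scale neuron j's incoming weights and bias by lambdas[j] > 0
    and its outgoing weights by 1 / lambdas[j]."""
    W1s = [[lam * w for w in row] for row, lam in zip(W1, lambdas)]
    b1s = [lam * b for b, lam in zip(b1, lambdas)]
    W2s = [[w / lam for w, lam in zip(row, lambdas)] for row in W2]
    return W1s, b1s, W2s

# Hypothetical sizes and random weights, chosen only for illustration.
rng = random.Random(0)
d_in, d_hidden, d_out = 3, 4, 2
W1 = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(d_hidden)]
W2 = [[rng.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)]
x = [rng.gauss(0, 1) for _ in range(d_in)]

# One positive factor per hidden neuron (arbitrary values).
lambdas = [0.1, 2.0, 10.0, 0.5]
W1s, b1s, W2s = rescale(W1, b1, W2, lambdas)

y = forward(W1, b1, W2, x)
ys = forward(W1s, b1s, W2s, x)

# The parameters differ, but the outputs agree up to rounding error.
max_gap = max(abs(a - b) for a, b in zip(y, ys))
print(max_gap)
```

The invariance holds because ReLU is positively homogeneous: relu(λz) = λ·relu(z) for any λ > 0, so the factor applied before the activation is cancelled by its inverse afterwards. The function is unchanged, but the parameter vectors are not, and this is why gradient-based training can behave differently from the two starting points.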