Self Expanding Neural Networks
Mitchell, Rupert, Mundt, Martin, Kersting, Kristian
– arXiv.org Artificial Intelligence
The results of training a neural network depend heavily on the chosen architecture, and even a small change to the network's size typically requires restarting training. In contrast, we begin training with a small architecture, increase its capacity only as the problem demands, and avoid interfering with previous optimization while doing so. We introduce a natural gradient-based approach that intuitively expands both the width and depth of a neural network when doing so is likely to substantially reduce the hypothetical converged training loss. We prove an upper bound on the "rate" at which neurons are added, and a computationally cheap lower bound on the expansion score. We illustrate the benefits of such Self-Expanding Neural Networks in both classification and regression problems, including those where the appropriate architecture size is substantially uncertain a priori.
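The abstract describes the mechanism only at a high level: a score estimates how much adding capacity would reduce the hypothetical converged loss, and a neuron is added only when that score clears a threshold. Below is a minimal, illustrative Python sketch of such a threshold-based expansion decision. The score used here (a natural-gradient quadratic form g^T F^{-1} g with a diagonal Fisher approximation) and the threshold factor `tau` are assumptions for illustration, not the paper's actual criterion.

```python
import numpy as np

def expansion_score(grad, fisher_diag, eps=1e-8):
    """Proxy for attainable loss reduction: 0.5 * g^T F^{-1} g,
    with a diagonal Fisher approximation (illustrative, not the
    paper's exact score)."""
    return 0.5 * float(np.sum(grad ** 2 / (fisher_diag + eps)))

def should_expand(score_current, score_expanded, tau=2.0):
    """Add capacity only if the expanded network's score beats the
    current one's by a factor tau (tau is a made-up hyperparameter)."""
    return score_expanded > tau * score_current

# Toy demo with random gradients and Fisher diagonals for a current
# 64-parameter layer and a hypothetically widened 80-parameter layer.
rng = np.random.default_rng(0)
g_cur, f_cur = rng.normal(size=64), rng.uniform(0.5, 2.0, size=64)
g_exp, f_exp = rng.normal(size=80), rng.uniform(0.5, 2.0, size=80)
s_cur = expansion_score(g_cur, f_cur)
s_exp = expansion_score(g_exp, f_exp)
print("expand width:", should_expand(s_cur, s_exp))
```

Because the expanded score must exceed the current one by a multiplicative margin, a decision rule of this shape naturally limits how often capacity is added, which is the flavour of the bounded addition "rate" the abstract mentions.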
Jul-11-2023
- Country:
  - Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
  - Asia > Middle East > Israel > Haifa District > Haifa (0.04)
  - Europe > France > Hauts-de-France
  - Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
  - North America > Canada > British Columbia
  - North America > United States > Massachusetts > Suffolk County > Boston (0.04)
  - North America > United States > New York (0.04)
- Genre:
- Research Report (0.50)