Nonlinear random matrix theory for deep learning
Jeffrey Pennington, Pratik Worah
Neural Information Processing Systems
Neural network configurations with random weights play an important role in the analysis of deep learning. They define the initial loss landscape and are closely related to kernel and random feature methods. Despite the fact that these networks are built out of random matrices, the vast and powerful machinery of random matrix theory has so far found limited success in studying them. A main obstacle in this direction is that neural networks are nonlinear, which prevents the straightforward utilization of many of the existing mathematical results. In this work, we open the door for direct applications of random matrix theory to deep learning by demonstrating that the pointwise nonlinearities typically applied in neural networks can be incorporated into a standard method of proof in random matrix theory known as the moments method.
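To make the object of study concrete, below is a minimal sketch (not taken from the paper) of the kind of matrix the moments method is applied to: the Gram matrix of nonlinear random features Y = f(WX), where W and X are Gaussian random matrices and f is a pointwise nonlinearity. The dimensions, normalizations, and the choice of f = tanh here are illustrative assumptions.

```python
import numpy as np

# Illustrative sizes (assumed): input dim, feature dim, number of samples.
n0, n1, m = 500, 500, 1000
rng = np.random.default_rng(0)

X = rng.standard_normal((n0, m)) / np.sqrt(n0)   # random data matrix
W = rng.standard_normal((n1, n0)) / np.sqrt(n0)  # random weight matrix
f = np.tanh                                      # pointwise nonlinearity (assumed)

Y = f(W @ X)                # nonlinear random feature matrix
M = (Y.T @ Y) / n1          # Gram matrix; its eigenvalue spectrum is the object of study
eigvals = np.linalg.eigvalsh(M)

# Empirical spectral moments m_k = (1/dim) tr(M^k); the moments method
# characterizes the large-dimension limits of such quantities analytically.
for k in range(1, 5):
    print(f"m_{k} = {np.mean(eigvals**k):.4f}")
```

In this sketch the moments are estimated numerically from a single finite-size draw; the paper's contribution is to compute their limiting values analytically despite the nonlinearity f.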