Towards Theoretically Inspired Neural Initialization Optimization

Jan-13-2025, 20:17:47 GMT – Neural Information Processing Systems

Automated machine learning has been widely explored to reduce the human effort involved in designing neural architectures and searching for suitable hyperparameters. In the domain of neural initialization, however, similar automated techniques have rarely been studied. Most existing initialization methods are handcrafted and highly dependent on specific architectures. In this paper, we propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network. Specifically, GradCosine is the cosine similarity of sample-wise gradients with respect to the initialized parameters.
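To make the definition concrete, the following is a minimal sketch of a cosine similarity between sample-wise gradients at initialization. It is not the paper's implementation: the toy linear model with squared-error loss, the averaging over all sample pairs, and the function names `per_sample_gradient`, `cosine_similarity`, and `mean_pairwise_grad_cosine` are all assumptions made for illustration.

```python
import math
from itertools import combinations


def per_sample_gradient(weights, x, y):
    """Gradient of the squared error (w.x - y)^2 with respect to w for one sample."""
    residual = sum(w * xi for w, xi in zip(weights, x)) - y
    return [2.0 * residual * xi for xi in x]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either vector is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pairwise_grad_cosine(weights, samples):
    """Average cosine similarity over all distinct pairs of per-sample gradients.

    Assumption: averaging over pairs is an illustrative aggregation choice,
    not necessarily the one used in the paper.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to form a gradient pair")
    grads = [per_sample_gradient(weights, x, y) for x, y in samples]
    pairs = list(combinations(grads, 2))
    return sum(cosine_similarity(g, h) for g, h in pairs) / len(pairs)


# Hypothetical toy data: three samples with two features each.
samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0), ([1.0, 1.0], 2.0)]
init_weights = [0.0, 0.0]  # the "initialized parameters" being evaluated
score = mean_pairwise_grad_cosine(init_weights, samples)
```

In this sketch a score near 1 means the per-sample gradients point in nearly the same direction at the chosen initialization, while a score near 0 means they are close to orthogonal. For a real network the per-sample gradients would come from automatic differentiation rather than a closed-form expression.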

architecture, neural architecture, neural initialization optimization

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)