Graph Neural Networks for Learning Equivariant Representations of Neural Networks

Kofinas, Miltiadis, Knyazev, Boris, Zhang, Yan, Chen, Yunlu, Burghouts, Gertjan J., Gavves, Efstratios, Snoek, Cees G. M., Zhang, David W.

arXiv.org Machine Learning 

Neural networks that process the parameters of other neural networks find applications in domains as diverse as classifying implicit neural representations, generating neural network weights, and predicting generalization errors. However, existing approaches either overlook the inherent permutation symmetry in the neural network or rely on intricate weight-sharing patterns to achieve equivariance, while ignoring the impact of the network architecture itself. In this work, we propose to represent neural networks as computational graphs of parameters, which allows us to harness powerful graph neural networks and transformers that preserve permutation symmetry. Consequently, our approach enables a single model to learn from neural graphs with diverse architectures.

How can we design neural networks that themselves take neural network parameters as input? This would allow us to make inferences about neural networks, such as predicting their generalization error (Unterthiner et al., 2020), generating neural network weights (Schürholt et al., 2022a), and classifying or generating implicit neural representations (Dupont et al., 2022), without having to evaluate them on many different inputs.

For simplicity, let us consider a deep neural network with multiple hidden layers. As a naïve approach, we can simply concatenate all flattened weights and biases into one large feature vector, from which we can then make predictions as usual. However, this overlooks an important structure in the parameters: neurons in a layer can be reordered while maintaining exactly the same function (Hecht-Nielsen, 1990). Reordering the neurons of a layer means permuting the preceding and following weight matrices accordingly. Ignoring this permutation symmetry will typically cause the model to make different predictions for different orderings of the neurons in the input neural network, even though those orderings represent exactly the same function.

In general, accounting for symmetries in the input data improves learning efficiency and underpins the field of geometric deep learning (Bronstein et al., 2021). Recent studies (Navon et al., 2023; Zhou et al., 2023a) confirm the effectiveness of equivariant layers for parameter spaces (the space of neural network parameters) with specially designed weight-sharing patterns. These weight-sharing patterns, however, require manual adaptation to each new architectural design; as a result, a single such model can only process neural network parameters for one fixed architecture.
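To make the permutation symmetry concrete, the following NumPy sketch (illustrative only; the two-layer ReLU MLP, its dimensions, and all variable names are assumptions, not taken from the paper) permutes the hidden neurons of a small network. The same permutation is applied to the rows of the first weight matrix and its bias (the outputs of layer 1) and to the columns of the second weight matrix (the inputs of layer 2), and the check confirms that the computed function is unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny 2-layer MLP: y = W2 @ relu(W1 @ x + b1) + b2
d_in, d_hidden, d_out = 4, 8, 3
W1, b1 = rng.standard_normal((d_hidden, d_in)), rng.standard_normal(d_hidden)
W2, b2 = rng.standard_normal((d_out, d_hidden)), rng.standard_normal(d_out)

def mlp(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Permute the hidden neurons: permute the rows of W1 and b1, and the
# columns of W2, with the same permutation. The function is unchanged.
perm = rng.permutation(d_hidden)
W1_p, b1_p = W1[perm], b1[perm]
W2_p = W2[:, perm]

x = rng.standard_normal(d_in)
assert np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1_p, b1_p, W2_p, b2))
```

A flattened parameter vector changes under such a permutation even though the function does not, whereas a graph representation of the parameters (for example, one natural encoding places biases on nodes and weights on edges) lets permutation-equivariant graph networks respect this symmetry by construction.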
