Appendix: Online Learning in Contextual Bandits using Gated Linear Networks

Neural Information Processing Systems

We assume that our tree divides the bounded reward range $[r_{\min}, r_{\max}]$ uniformly into $2^d$ bins at each level $d \le D$. By labelling left branches of a node with a 0, and right branches with a 1, we can associate a unique binary string $b_{1:d}$ with any single internal ($d < D$) or leaf ($d = D$) node in the tree. The $d$th element, when it exists, is denoted $b_d$. The root node is denoted by the empty string $\epsilon$. We should note that even though this exponential term might initially seem discouraging, we set $D = 3$ in our experiments and observe no significant improvements for larger $D$. Algorithm 1, CTREE, performs regression utilizing a tree-based discretization, where nodes are composed of GLNs.
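The binary-string labelling above can be sketched in a few lines: at each level the current sub-range is halved, and the bit records which half contains the reward. This is a minimal illustration, not code from the paper; the function name `reward_to_path` and its parameters are assumptions.

```python
def reward_to_path(r: float, rmin: float, rmax: float, D: int) -> str:
    """Return the binary string b_{1:D} of the leaf whose bin contains r.

    At each level d the range [rmin, rmax] is split uniformly into 2^d bins;
    a 0 bit means "left half" and a 1 bit means "right half" of the
    current sub-range. The root corresponds to the empty string.
    """
    lo, hi = rmin, rmax
    path = ""
    for _ in range(D):
        mid = (lo + hi) / 2.0
        if r < mid:
            path += "0"   # left branch: keep the lower half
            hi = mid
        else:
            path += "1"   # right branch: keep the upper half
            lo = mid
    return path

# With D = 3 (as in the paper's experiments) there are 2^3 = 8 leaf bins:
print(reward_to_path(0.1, 0.0, 1.0, 3))  # -> "000"
print(reward_to_path(0.9, 0.0, 1.0, 3))  # -> "111"
```

Each prefix of the returned string names an internal node on the path from the root, matching the $b_{1:d}$ indexing in the text.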





'Hello, World!': Making GNNs Talk with LLMs

Kim, Sunwoo, Lee, Soo Yong, Yoo, Jaemin, Shin, Kijung

arXiv.org Artificial Intelligence

While graph neural networks (GNNs) have shown remarkable performance across diverse graph-related tasks, their high-dimensional hidden representations render them black boxes. In this work, we propose Graph Lingual Network (GLN), a GNN built on large language models (LLMs), with hidden representations in the form of human-readable text. Through careful prompt design, GLN incorporates not only the message passing module of GNNs but also advanced GNN techniques, including graph attention and initial residual connection. The comprehensibility of GLN's hidden representations enables an intuitive analysis of how node representations change (1) across layers and (2) under advanced GNN techniques, shedding light on the inner workings of GNNs. Furthermore, we demonstrate that GLN achieves strong zero-shot performance on node classification and link prediction, outperforming existing LLM-based baseline methods.





The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks

Lippl, Samuel, Abbott, L. F., Chung, SueYeon

arXiv.org Machine Learning

Understanding the asymptotic behavior of gradient-descent training of deep neural networks is essential for revealing inductive biases and improving network performance. We derive the infinite-time training limit of a mathematically tractable class of deep nonlinear neural networks, gated linear networks (GLNs), and generalize these results to gated networks described by general homogeneous polynomials. We study the implications of our results, focusing first on two-layer GLNs. We then apply our theoretical predictions to GLNs trained on MNIST and show how architectural constraints and the implicit bias of gradient descent affect performance. Finally, we show that our theory captures a substantial portion of the inductive bias of ReLU networks. By making the inductive bias explicit, our framework is poised to inform the development of more efficient, biologically plausible, and robust learning algorithms.