computation burden
Graph Decipher: A transparent dual-attention graph neural network to understand the message-passing mechanism for the node classification
Graph neural networks can be effectively applied to find solutions for many real-world problems across widely diverse fields. The success of graph neural networks is linked to the message-passing mechanism on the graph; however, the message-aggregating behavior is still not entirely clear in most algorithms. To improve transparency, we propose a new network called Graph Decipher, which investigates the message-passing mechanism by prioritizing two main components, the graph structure and node attributes, at the graph, feature, and global levels under the node classification task. However, the computation burden then becomes the most significant issue, because the relevance of both graph structure and node attributes is computed over the whole graph. To address this, graph feature filters extract only the relevant representative node attributes, allowing calculations to be performed in a category-oriented manner. Experiments on seven datasets show that Graph Decipher achieves state-of-the-art performance while imposing a substantially lower computation burden on the node classification task. Additionally, since our algorithm can explore representative node attributes by category, it is also applied to alleviate the imbalanced node classification problem on multi-class graph datasets.
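The abstract does not include code, so the following is a minimal, hypothetical NumPy sketch of what category-oriented attribute filtering could look like: for each class, attribute dimensions are ranked by how far their class-wise mean deviates from the global mean, and neighbor aggregation is restricted to the kept dimensions. The relevance score and the function names (`category_feature_filter`, `filtered_aggregate`) are illustrative assumptions, not the paper's actual graph feature filters.

```python
import numpy as np

def category_feature_filter(X, y, num_classes, top_k=50):
    """For each class, keep the top_k attribute dimensions whose class-wise
    mean deviates most from the global mean (a stand-in relevance score).

    X: (num_nodes, num_feats) node attribute matrix
    y: (num_nodes,) integer class labels
    Returns: dict mapping class id -> indices of kept attributes.
    """
    global_mean = X.mean(axis=0)
    filters = {}
    for c in range(num_classes):
        class_mean = X[y == c].mean(axis=0)
        relevance = np.abs(class_mean - global_mean)   # assumed relevance measure
        filters[c] = np.argsort(relevance)[::-1][:top_k]
    return filters

def filtered_aggregate(X, adj, feat_idx):
    """One mean-aggregation (message-passing) step over the kept attributes only."""
    Xc = X[:, feat_idx]                                # (num_nodes, top_k)
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    return adj @ Xc / deg                              # neighbor mean per node

# Toy usage: 6 nodes, 8 attributes, 2 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))
y = np.array([0, 0, 0, 1, 1, 1])
adj = (rng.random((6, 6)) < 0.4).astype(float)
filters = category_feature_filter(X, y, num_classes=2, top_k=3)
out = filtered_aggregate(X, adj, filters[0])
print(filters[0], out.shape)   # 3 kept attribute indices, aggregated features of shape (6, 3)
```

Restricting aggregation to a small, per-category subset of attributes is what keeps the per-graph relevance computation tractable in this sketch; the real method's filters may score and select attributes differently.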
Infinite Memory Transformer: Attending to Arbitrarily Long Contexts Without Increasing Computation Burden
When reading a novel, humans naturally remember relevant plot information even if it was presented many chapters earlier. Although today's transformer-based language models have made impressive progress in natural language processing, they struggle in this regard, as the compute required for modelling long-term memories grows quadratically with the length of the text and will eventually exceed the model's finite memory capacity. To overcome this limitation, a research team from Instituto de Telecomunicações, DeepMind, the Institute of Systems and Robotics, Instituto Superior Técnico and Unbabel has proposed the "∞-former" (infinite former) -- a transformer model equipped with unbounded long-term memory (LTM) that enables it to attend to arbitrarily long contexts. The team extends the vanilla transformer with a continuous LTM to enable the proposed ∞-former to access long-range context. The novel approach employs a continuous-space attention framework to attend over the LTM signal, in which the key matrix size depends on the number of basis functions rather than on the length of the context being attended to.
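To illustrate the key idea (a key matrix whose size depends on a fixed number of basis functions rather than on the context length), here is a hedged NumPy sketch: an arbitrarily long sequence of hidden states is compressed into coefficients over N Gaussian radial basis functions, and attention is then computed over a discretized reconstruction of that continuous signal. The basis choice, the least-squares fit, and the discretized softmax are simplifying assumptions; the actual ∞-former uses a continuous attention mechanism over the signal rather than sampling it.

```python
import numpy as np

def rbf_basis(t, num_basis, width=0.05):
    """Gaussian radial basis functions evaluated at positions t in [0, 1]."""
    centers = np.linspace(0.0, 1.0, num_basis)                               # (N,)
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))  # (L, N)

def compress_to_ltm(H, num_basis):
    """Fit coefficients B so that Phi @ B approximates the long context H.

    H: (L, d) hidden states of an arbitrarily long context.
    Returns B: (N, d) -- its size depends only on num_basis, not on L.
    """
    L = H.shape[0]
    Phi = rbf_basis(np.linspace(0.0, 1.0, L), num_basis)   # (L, N)
    B, *_ = np.linalg.lstsq(Phi, H, rcond=None)             # least-squares fit
    return B

def continuous_attention(query, B, num_basis, num_samples=256):
    """Attend over a discretized reconstruction of the continuous LTM signal."""
    t = np.linspace(0.0, 1.0, num_samples)
    values = rbf_basis(t, num_basis) @ B                     # (S, d) reconstructed signal
    scores = values @ query / np.sqrt(len(query))            # (S,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values                                  # (d,) attended summary

# Toy usage: a 10,000-step context compressed into 32 basis coefficients.
rng = np.random.default_rng(0)
H = rng.normal(size=(10_000, 16))
B = compress_to_ltm(H, num_basis=32)
out = continuous_attention(rng.normal(size=16), B, num_basis=32)
print(B.shape, out.shape)   # (32, 16) (16,) -- memory cost is fixed by num_basis
```

The point of the sketch is that after compression, the cost of attending to the memory is governed by the 32 basis coefficients, no matter whether the original context had ten thousand or ten million steps.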