AITopics | accelerating and structuring self-attention

SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection

Neural Information Processing Systems | Dec-24-2025, 14:11:48 GMT

Although self-attention is used across a wide variety of tasks, its cost is quadratic in the input length, which makes long inputs difficult to handle. In this paper, we present a method for accelerating and structuring self-attention: Sparse Adaptive Connection (SAC). In SAC, we regard the input sequence as a graph, and attention operations are performed between linked nodes. In contrast with previous self-attention models with pre-defined structures (edges), the model learns to construct attention edges to improve task-specific performance. In this way, the model is able to select the most salient nodes and reduce the quadratic complexity regardless of the sequence length. Based on SAC, we show that previous variants of self-attention models are its special cases. Through extensive experiments on neural machine translation, language modeling, graph representation learning and image classification, we demonstrate that SAC is competitive with state-of-the-art models while significantly reducing memory cost.
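The idea of attending only between linked nodes can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the edge list here is supplied by hand, whereas SAC learns which edges to construct, and the function name `edge_attention` and the `(target, source)` edge layout are assumptions made for illustration. The sketch only shows that the work scales with the number of edges rather than with the square of the sequence length.

```python
import numpy as np

def edge_attention(q, k, v, edges):
    """Scaled dot-product attention restricted to a given edge list.

    q, k, v : arrays of shape (n, d), one row per node (token).
    edges   : integer array of shape (E, 2); each row is (target, source),
              meaning node `target` attends to node `source`.
              (Hypothetical layout, chosen for this sketch.)
    """
    n, d = q.shape
    tgt, src = edges[:, 0], edges[:, 1]

    # One score per edge instead of one per node pair: O(E), not O(n^2).
    scores = np.einsum("ij,ij->i", q[tgt], k[src]) / np.sqrt(d)

    # Numerically stable softmax over the incoming edges of each target.
    max_per_tgt = np.full(n, -np.inf)
    np.maximum.at(max_per_tgt, tgt, scores)
    exp = np.exp(scores - max_per_tgt[tgt])
    denom = np.zeros(n)
    np.add.at(denom, tgt, exp)
    weights = exp / denom[tgt]

    # Weighted sum of source values; nodes with no edges keep a zero output.
    out = np.zeros_like(v, dtype=float)
    np.add.at(out, tgt, weights[:, None] * v[src])
    return out

# Usage: 4 tokens, each attending to itself and to its left neighbour.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 3)) for _ in range(3))
edges = np.array([(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3)])
print(edge_attention(q, k, v, edges).shape)
```

With an edge list containing every (target, source) pair, the function reduces to ordinary dense self-attention, which mirrors the abstract's remark that earlier variants are special cases of an edge-based view.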

accelerating and structuring self-attention, name change, sparse adaptive connection

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.98)

Supplementary Materials for SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection | 1 Datasets

Neural Information Processing Systems | Aug-16-2025, 08:24:24 GMT

For the transductive setup, we used the three standard citation network benchmarks: Cora, Citeseer and Pubmed (Sen et al., 2008). We followed the transductive setup adopted in (Yang et al., ...). Cora contains 2708 nodes, 5429 edges, 7 classes and 1433 features per node. Citeseer contains 3327 nodes, 4732 edges, 6 classes and 3703 features per node. Critically, testing graphs remain completely unobserved during training. The average number of nodes per graph is 2372.
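As a rough illustration of how sparse these citation graphs are, the short sketch below compares the listed edge counts with the number of ordered node pairs that dense, all-pairs attention would score. The helper name `pair_stats` is hypothetical, and counting ordered pairs including self-pairs is an assumption made for simplicity; the node and edge counts are the ones quoted above.

```python
def pair_stats(num_nodes, num_edges):
    """Return (dense pair count, edges as a fraction of dense pairs)."""
    dense_pairs = num_nodes * num_nodes  # all ordered pairs, self-pairs included
    return dense_pairs, num_edges / dense_pairs

# Node and edge counts as quoted in the dataset description.
for name, n, e in [("Cora", 2708, 5429), ("Citeseer", 3327, 4732)]:
    dense, frac = pair_stats(n, e)
    print(f"{name}: {dense} dense pairs vs {e} edges (fraction {frac:.5f})")
```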

accelerating and structuring self-attention, arxiv preprint arxiv, node

Country:

North America > United States > Illinois (0.07)
North America > Canada (0.06)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.57)

Review for NeurIPS paper: SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection

Neural Information Processing Systems | May-31-2025, 19:07:03 GMT

Weaknesses: My main concern is about the computational cost of the proposed method. The method requires running an LSTM on each token on every layer (or even every head) sequentially. Compared to the parallel processing of Transformers, I would expect this sequential computation to be quite slow. All those factors should affect the computation speed in a negative way. Given that computational efficiency is the goal of the paper, the authors must discuss them in detail.

accelerating and structuring self-attention, neurips paper, sparse adaptive connection

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Review for NeurIPS paper: SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection

Neural Information Processing Systems | May-31-2025, 19:06:55 GMT

This paper addresses the quadratic bottleneck in the Transformer architecture. It proposes a Sparse Adaptive Connection (SAC) model that learns to predict sparse connections (attention links) between inputs, with attention performed only on those predicted links. The proposed method is competitive with state-of-the-art models on WMT, LM and image classification tasks while significantly reducing memory cost. Overall, three of the four reviewers seem to have liked the paper, although they had some concerns (below), while one reviewer (R3) recommended a weak reject. A weakness pointed out by R2 and R3 is that only accuracy is reported and speed is not, which seems necessary to support the title "Accelerating Self-Attention". The authors promised to add more details about computational efficiency and memory cost in the final version, and I urge them to do so.

accelerating and structuring self-attention, reviewer, sparse adaptive connection

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.60)

SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection

Neural Information Processing Systems | Oct-11-2024, 07:30:39 GMT

accelerating and structuring self-attention, sac, sparse adaptive connection

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.43)
