Review for NeurIPS paper: SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection

May-31-2025, 19:07:03 GMT – Neural Information Processing Systems

Weaknesses: My main concern is the computational cost of the proposed method. The method requires running an LSTM for each token at every layer (or even every head), sequentially. Compared with the parallel processing of Transformers, I would expect this sequential computation to be quite slow. All of these factors (per-token, per-layer, and possibly per-head sequential execution) should reduce computation speed. Given that computational efficiency is the goal of the paper, the authors must discuss them in detail.
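To make this concern concrete, here is a back-of-the-envelope sketch that counts dependent (non-parallelizable) steps under the worst case described above. It is not the authors' implementation: the function names, the `steps_per_token` parameter, and all numbers are hypothetical, and the sketch assumes the LSTM calls cannot be batched across tokens, layers, or heads.

```python
def dense_attention_depth(num_layers: int) -> int:
    """Dependent steps for a standard Transformer.

    Assumption: all tokens and heads within a layer are processed in
    parallel, so the only sequential dependency is layer after layer.
    """
    return num_layers


def lstm_predictor_depth(
    num_layers: int,
    num_tokens: int,
    steps_per_token: int,
    num_heads: int = 1,
    per_head: bool = False,
) -> int:
    """Dependent LSTM steps if the predictor runs fully sequentially.

    Assumptions (hypothetical, not taken from the paper):
    - the LSTM takes `steps_per_token` steps for each token,
    - tokens are visited one after another within a layer,
    - if `per_head` is True, the whole pass is repeated for every head.
    """
    heads = num_heads if per_head else 1
    return num_layers * heads * num_tokens * steps_per_token


if __name__ == "__main__":
    # Hypothetical configuration chosen only for illustration.
    layers, tokens, heads, steps = 6, 512, 8, 4

    print("dense attention:", dense_attention_depth(layers))
    print("LSTM per layer :", lstm_predictor_depth(layers, tokens, steps))
    print(
        "LSTM per head  :",
        lstm_predictor_depth(layers, tokens, steps, num_heads=heads, per_head=True),
    )
```

Under these assumptions the sequential step count grows with the product of layers, tokens, and (possibly) heads, whereas dense attention grows with layers alone. Real wall-clock time depends on how much of the LSTM work can be batched, which is exactly what the authors should report.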
