AITopics | spikformer

QKFormer: Hierarchical Spiking Transformer using Q-K Attention

Neural Information Processing Systems, Mar-18-2026, 12:24:15 GMT

Spiking Transformers, which integrate Spiking Neural Networks (SNNs) with Transformer architectures, have attracted significant attention due to their potential for low energy consumption and high performance. However, there remains a substantial gap in performance between SNNs and Artificial Neural Networks (ANNs). To narrow this gap, we have developed QKFormer, a direct training spiking transformer with the following features: i) the novel spike-form Q-K attention module efficiently models the token or channel attention through binary vectors and enables the construction of larger models.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

179f5dcdeedc149443ebd3ba70811dbd-Paper-Conference.pdf

Neural Information Processing Systems, Feb-19-2026, 02:06:05 GMT

It is shown that QKFormer achieves significantly superior performance over existing state-of-the-art SNN models on various mainstream datasets.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Spiking Token Mixer: An Event-Driven Friendly Former Structure for Spiking Neural Networks

Neural Information Processing Systems, Feb-18-2026, 13:05:43 GMT

Compared to the clock-driven synchronous chip, the event-driven asynchronous chip achieves much lower energy consumption but only supports some specific network operations. Recently, a series of SNN projects have achieved tremendous success, significantly improving the SNN's performance. However, event-driven asynchronous chips do not support some of the proposed structures, making it impossible to integrate these SNNs into asynchronous hardware.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.46)
Energy (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)

2f55a8b7b1c2c6312eb86557bb9a2bd5-Paper-Conference.pdf

Neural Information Processing Systems, Feb-10-2026, 14:24:21 GMT

Spiking neural networks (SNNs) represent a promising approach to developing artificial neural networks that are both energy-efficient and biologically plausible.

artificial intelligence, machine learning, neural network, (18 more...)

Neural Information Processing Systems

Country: Africa > Rwanda > Kigali > Kigali (0.04)

Genre: Research Report (0.88)

Industry: Energy (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Spiking Token Mixer: An Event-Driven Friendly Former Structure for Spiking Neural Networks

Neural Information Processing Systems, Oct-10-2025, 20:09:13 GMT

Compared to the clock-driven synchronous chip, the event-driven asynchronous chip achieves much lower energy consumption but only supports some specific network operations. Recently, a series of SNN projects have achieved tremendous success, significantly improving the SNN's performance. However, event-driven asynchronous chips do not support some of the proposed structures, making it impossible to integrate these SNNs into asynchronous hardware.

architecture, module, neural network, (15 more...)

Neural Information Processing Systems

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.46)
Energy (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)

Advancing Spiking Neural Networks for Sequential Modeling with Central Pattern Generators

Neural Information Processing Systems, Oct-9-2025, 22:24:57 GMT

Spiking neural networks (SNNs) represent a promising approach to developing artificial neural networks that are both energy-efficient and biologically plausible.

cpg-pe, information, neural network, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County (0.04)
Asia > China (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Energy > Power Industry (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

QKFormer: Hierarchical Spiking Transformer using Q-K Attention

Neural Information Processing Systems, Oct-9-2025, 19:35:02 GMT

As the architecture of the transformers is essential to the model's performance [

experiment, qkformer, spikformer, (17 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Cognitive Science (0.67)

STAS: Spatio-Temporal Adaptive Computation Time for Spiking Transformers

Kang, Donghwa, Kim, Doohyun, Ko, Sang-Ki, Lee, Jinkyu, Kang, Brent ByungHoon, Baek, Hyeongboo

arXiv.org Artificial Intelligence, Aug-21-2025

Spiking neural networks (SNNs) offer energy efficiency over artificial neural networks (ANNs) but suffer from high latency and computational overhead due to their multi-timestep operational nature. While various dynamic computation methods have been developed to mitigate this by targeting spatial, temporal, or architecture-specific redundancies, they remain fragmented. While the principles of adaptive computation time (ACT) offer a robust foundation for a unified approach, its application to SNN-based vision Transformers (ViTs) is hindered by two core issues: the violation of its temporal similarity prerequisite and a static architecture fundamentally unsuited for its principles. To address these challenges, we propose STAS (Spatio-Temporal Adaptive computation time for Spiking transformers), a framework that co-designs the static architecture and dynamic computation policy. STAS introduces an integrated spike patch splitting (I-SPS) module to establish temporal stability by creating a unified input representation, thereby solving the architectural problem of temporal dissimilarity. This stability, in turn, allows our adaptive spiking self-attention (A-SSA) module to perform two-dimensional token pruning across both spatial and temporal axes. Implemented on spiking Transformer architectures and validated on CIFAR-10, CIFAR-100, and ImageNet, STAS reduces energy consumption by up to 45.9%, 43.8%, and 30.1%, respectively, while simultaneously improving accuracy over SOTA models.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2508.14138

Genre: Research Report (0.64)

Industry: Energy (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

MSVIT: Improving Spiking Vision Transformer Using Multi-scale Attention Fusion

Hua, Wei, Zhou, Chenlin, Wu, Jibin, Chua, Yansong, Shu, Yangyang

arXiv.org Artificial Intelligence, Jun-19-2025

The combination of Spiking Neural Networks (SNNs) with Vision Transformer architectures has garnered significant attention due to their potential for energy-efficient and high-performance computing paradigms. However, a substantial performance gap still exists between SNN-based and ANN-based transformer architectures. While existing methods propose spiking self-attention mechanisms that are successfully combined with SNNs, the overall architectures proposed by these methods suffer from a bottleneck in effectively extracting features from different image scales. In this paper, we address this issue and propose MSVIT. This novel spike-driven Transformer architecture firstly uses multi-scale spiking attention (MSSA) to enhance the capabilities of spiking attention blocks. We validate our approach across various main datasets. The experimental results show that MSVIT outperforms existing SNN-based models, positioning itself as a state-of-the-art solution among SNN-transformer architectures. The codes are available at https://github.com/Nanhu-AI-Lab/MSViT.

machine learning, msvit, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.14719

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

QKFormer: Hierarchical Spiking Transformer using Q-K Attention

Neural Information Processing Systems, May-26-2025, 17:22:03 GMT

Spiking Transformers, which integrate Spiking Neural Networks (SNNs) with Transformer architectures, have attracted significant attention due to their potential for low energy consumption and high performance. However, there remains a substantial gap in performance between SNNs and Artificial Neural Networks (ANNs). To narrow this gap, we have developed QKFormer, a direct training spiking transformer with the following features: i) linear complexity and high energy efficiency: the novel spike-form Q-K attention module efficiently models the token or channel attention through binary vectors and enables the construction of larger models. It is shown that QKFormer achieves significantly superior performance over existing state-of-the-art SNN models on various mainstream datasets. To the best of our knowledge, this is the first time that directly trained SNNs have exceeded 85% accuracy on ImageNet-1K.

artificial intelligence, hierarchical spiking transformer, machine learning, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
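
The QKFormer abstract above describes attention that models token importance through binary vectors at linear cost. The following is a minimal sketch of how such a mechanism could look, not the authors' implementation. The function name `qk_token_attention`, the `threshold` parameter, and the plain thresholding used in place of a real spiking neuron are all assumptions made for illustration.

```python
from typing import List

# N tokens x D channels, every entry is a spike: 0 or 1.
Spikes = List[List[int]]


def qk_token_attention(q: Spikes, k: Spikes, threshold: int = 1) -> Spikes:
    """Mask the rows of K with a binary token-attention vector derived from Q.

    Hypothetical illustration: a simple threshold stands in for the
    spiking neuron that a real SNN would use.
    """
    if len(q) != len(k):
        raise ValueError("q and k must have the same number of tokens")
    # Summing each row of Q costs O(N * D) additions; no N x N matrix is built.
    row_sums = [sum(row) for row in q]
    # One attention bit per token: fire when the row sum reaches the threshold.
    attn = [1 if s >= threshold else 0 for s in row_sums]
    # Apply the binary vector as a mask: a token's K features survive only if its bit is 1.
    return [[a * x for x in row] for a, row in zip(attn, k)]


q = [[0, 0, 0], [1, 0, 1], [0, 1, 0]]
k = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
out = qk_token_attention(q, k, threshold=2)
print(out)  # [[0, 0, 0], [0, 1, 1], [0, 0, 0]]
```

Because the attention is a single binary vector of length N rather than an N x N score matrix, the cost in this sketch grows linearly with the number of tokens, and the masking step needs no multiplications on real hardware since every operand is 0 or 1.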