QAMA: Scalable Quantum Annealing Multi-Head Attention Operator for Deep Learning
Du, Peng, Shi, Jinjing, Wang, Wenxuan, Ma, Yin, Wen, Kai, Li, Xuelong
arXiv.org Artificial Intelligence
Attention mechanisms underpin modern deep learning, but their quadratic time and space complexity limits scalability for long sequences. To address this, Quantum Annealing Multi-Head Attention (QAMA) is proposed, a novel drop-in operator that reformulates attention as an energy-based Hamiltonian optimization problem. In this framework, token interactions are encoded into binary quadratic terms, and quantum annealing is employed to search for low-energy configurations that correspond to effective attention patterns. Unlike classical sparse or approximate attention methods that rely on hand-crafted heuristics, QAMA allows sparsity structures to emerge naturally from the optimization process. Theoretically, computational complexity is analysed through single-spin-flip dynamics, yielding time-to-solution runtime bounds that depend on the spectral properties of the annealing Hamiltonian. Empirically, evaluation on both natural language and vision benchmarks shows that, across tasks, accuracy deviates by at most 2.7 points from standard multi-head attention, while the number of qubits required scales only linearly with sequence length. Visualizations further reveal that the Hamiltonian penalty terms induce meaningful and interpretable sparsity across heads. Finally, deployment on a coherent Ising machine validates the feasibility of running QAMA on real quantum hardware, showing tangible inference-time reductions compared with classical implementations. These results highlight QAMA as a pioneering and scalable step toward integrating quantum optimization devices into deep neural architectures, providing a seamlessly integrable and hardware-compatible alternative to conventional attention mechanisms. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
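The abstract's core idea can be illustrated with a small classical sketch. This is not the paper's implementation: all function names, the penalty form, and the use of simulated (rather than quantum) annealing are assumptions for illustration. Per query row, attention scores become the linear terms of a QUBO, and a quadratic penalty `lam * (sum(x) - k)^2` encourages exactly `k` keys to be selected; single-spin-flip annealing, as referenced in the abstract's complexity analysis, then searches for a low-energy binary mask:

```python
import numpy as np

def attention_row_qubo(scores, k, lam=10.0):
    """QUBO for one query row: which k keys to attend to.

    Minimizes E(x) = -scores^T x + lam * (sum(x) - k)^2 over x in {0,1}^n,
    written as x^T Q x (constant lam*k^2 dropped, using x_i^2 = x_i).
    """
    n = len(scores)
    Q = np.full((n, n), lam)                    # off-diagonal penalty coupling
    np.fill_diagonal(Q, -np.asarray(scores, float) + lam * (1 - 2 * k))
    return Q

def spin_flip_anneal(Q, steps=2000, t0=2.0, t1=0.01, seed=0):
    """Single-spin-flip simulated annealing with geometric cooling."""
    rng = np.random.default_rng(seed)
    n = Q.shape[0]
    x = rng.integers(0, 2, n)
    for s in range(steps):
        T = t0 * (t1 / t0) ** (s / steps)       # geometric temperature schedule
        i = rng.integers(n)
        d = 1 - 2 * x[i]                        # +1 if turning bit on, -1 if off
        # Energy change of flipping bit i for symmetric Q:
        # dE = d * (Q_ii + 2 * sum_{j != i} Q_ij x_j)
        dE = d * (Q[i, i] + 2 * (Q[i] @ x - Q[i, i] * x[i]))
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            x[i] ^= 1                           # accept the flip
    return x, float(x @ Q @ x)

def sparse_attention_mask(scores, k, restarts=20):
    """Best-of-restarts binary mask over keys for one query row."""
    Q = attention_row_qubo(scores, k)
    mask, _ = min((spin_flip_anneal(Q, seed=s) for s in range(restarts)),
                  key=lambda r: r[1])
    return mask
```

On a toy score row such as `[3.0, 1.0, 5.0, 2.0]` with `k=2`, the low-energy mask selects the two highest-scoring keys, mirroring how sparsity emerges from the optimization rather than from a hand-crafted rule; on real annealing hardware the same QUBO matrix would be handed to the device instead of the classical loop.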
Oct-14-2025