Quantum Complex-Valued Self-Attention Model
Fu Chen, Qinglin Zhao, Li Feng, Longfei Tang, Yangbin Lin, Haitao Huang
arXiv.org Artificial Intelligence
Abstract: Self-attention has revolutionized classical machine learning, yet existing quantum self-attention models underutilize the potential of quantum states due to oversimplified or incomplete mechanisms. To address this limitation, we introduce the Quantum Complex-Valued Self-Attention Model (QCSAM), the first framework to leverage complex-valued similarities, which capture amplitude and phase relationships between quantum states more comprehensively. To achieve this, QCSAM extends the Linear Combination of Unitaries (LCU) technique into a Complex LCU (CLCU) framework, enabling precise complex-valued weighting of quantum states and supporting quantum multi-head attention. Experiments on MNIST and Fashion-MNIST show that QCSAM outperforms recent quantum self-attention models, including QKSAN, QSAN, and GQHAN. With only 4 qubits, QCSAM achieves 100% and 99.2% test accuracies on MNIST and Fashion-MNIST, respectively. Furthermore, we evaluate scalability across 3-8 qubits and 2-4 class tasks, while ablation studies validate the advantages of complex-valued attention weights over real-valued alternatives.

Corresponding author: Qinglin Zhao (e-mail: qlzhao@must.edu.mo). Fu Chen, Qinglin Zhao, Li Feng, and Haitao Huang are with the Faculty of Innovation Engineering, Macau University of Science and Technology, Macau 999078, China.

INTRODUCTION

The self-attention mechanism, as a key component of deep learning architectures, has significantly changed the ways in which data is processed and features are learned [1]-[3]. By generating adaptive attention weights, self-attention not only highlights key features in the data but also integrates global contextual information, thereby improving the expressive power and computational efficiency of deep learning systems. For instance, in natural language processing [4]-[6], self-attention has enhanced language understanding and generation by capturing long-range dependencies and contextual information; in computer vision [7]-[9], it allows models to focus on key regions within images to optimize feature extraction; and in recommender systems [10], [11], it improves the accuracy of capturing user behavior and preferences, thereby enhancing the effectiveness of personalized recommendations. Large-scale models such as GPT-4 [12] have further exploited the potential of self-attention, allowing them to address multimodal tasks such as visual question answering, image captioning, and cross-modal reasoning. These developments demonstrate that self-attention is a fundamental mechanism of modern deep learning.
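To make the complex-valued weighting described in the abstract concrete, the following is a minimal classical NumPy sketch of the idea, not the paper's circuit-level CLCU implementation: tokens are assumed to be encoded as normalized complex state vectors, the inner product <q_i|k_j> is taken as a complex attention weight, and each output is a complex-weighted linear combination of the value states. The helper `random_state` and all dimensions are illustrative assumptions, not names from the paper.

```python
import numpy as np

# Minimal classical sketch of complex-valued self-attention over
# quantum state vectors (an assumption-laden illustration, not the
# paper's CLCU circuit construction).

rng = np.random.default_rng(0)

def random_state(dim: int) -> np.ndarray:
    """Sample a random pure state as a normalized complex vector."""
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return psi / np.linalg.norm(psi)

n_tokens, dim = 4, 2**3  # illustrative: 4 tokens, 3-qubit states

queries = np.stack([random_state(dim) for _ in range(n_tokens)])
keys    = np.stack([random_state(dim) for _ in range(n_tokens)])
values  = np.stack([random_state(dim) for _ in range(n_tokens)])

# Complex similarity matrix: alpha[i, j] = <q_i | k_j>.
# Its modulus carries amplitude overlap; its argument carries the
# relative phase between the two states.
alpha = queries.conj() @ keys.T            # shape (n_tokens, n_tokens)

# Complex-weighted linear combination of value states, then
# renormalization back to unit vectors.
outputs = alpha @ values
outputs /= np.linalg.norm(outputs, axis=1, keepdims=True)

print(np.round(alpha, 3))    # complex attention weights
print(np.round(outputs, 3))  # attended, renormalized state vectors
```

A real-valued variant would replace `alpha` with `np.abs(alpha)` or its real part, discarding the relative-phase information; the ablation studies mentioned in the abstract contrast exactly these two choices. On quantum hardware, realizing such complex-weighted combinations of states is what the paper's CLCU extension of the LCU technique is for.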
Apr-7-2025