
 Zhao, Qinglin


Quantum Complex-Valued Self-Attention Model

arXiv.org Artificial Intelligence

Self-attention has revolutionized classical machine learning, yet existing quantum self-attention models underutilize quantum states' potential due to oversimplified or incomplete mechanisms. To address this limitation, we introduce the Quantum Complex-Valued Self-Attention Model (QCSAM), the first framework to leverage complex-valued similarities, which captures amplitude and phase relationships between quantum states more comprehensively. To achieve this, QCSAM extends the Linear Combination of Unitaries (LCUs) into the Complex LCUs (CLCUs) framework, enabling precise complex-valued weighting of quantum states and supporting quantum multi-head attention. Experiments on MNIST and Fashion-MNIST show that QCSAM outperforms recent quantum self-attention models, including QKSAN, QSAN, and GQHAN. With only 4 qubits, QCSAM achieves 100% and 99.2% test accuracies on MNIST and Fashion-MNIST, respectively. Furthermore, we evaluate scalability across 3-8 qubits and 2-4 class tasks, while ablation studies validate the advantages of complex-valued attention weights over real-valued alternatives.

INTRODUCTION

The self-attention mechanism, as a key component of deep learning architectures, has significantly impacted the ways in which data is processed and features are learned [1]-[3]. By generating adaptive attention weights, self-attention not only highlights key features in the data but also integrates global contextual information, thereby improving the expressive power and computational efficiency of deep learning systems. For instance, in natural language processing [4]-[6], self-attention has enhanced language understanding and generation by capturing long-range dependencies and contextual information; in computer vision [7]-[9], it allows models to focus on key regions within images to optimize feature extraction; and in recommender systems [10], [11], it improves the accuracy of capturing user behavior and preferences, thereby enhancing the effectiveness of personalized recommendations. Large-scale models such as GPT-4 [12] have further exploited the potential of self-attention, allowing them to address multimodal tasks such as visual question answering, image captioning, and cross-modal reasoning. These developments demonstrate that the self-attention mechanism is a fundamental component of modern deep learning.

Corresponding author: Qinglin Zhao (e-mail: qlzhao@must.edu.mo). Fu Chen, Qinglin Zhao, Li Feng and Haitao Huang are with the Faculty of Innovation Engineering, Macau University of Science and Technology, 999078, China.
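
The central idea of the abstract is a similarity that keeps phase as well as amplitude. The NumPy sketch below illustrates that idea classically on simulated state vectors: it computes full complex overlaps <q|k> instead of phase-discarding fidelities |<q|k>|^2, then mixes value states with complex coefficients. The softmax-over-magnitude weighting and the final renormalization are my own illustrative choices, not the paper's CLCU circuit construction.

```python
import numpy as np

def random_state(n_qubits, rng):
    """A random pure state as a normalized complex amplitude vector."""
    v = rng.normal(size=2**n_qubits) + 1j * rng.normal(size=2**n_qubits)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
n_qubits, seq_len = 3, 4
queries = [random_state(n_qubits, rng) for _ in range(seq_len)]
keys = [random_state(n_qubits, rng) for _ in range(seq_len)]
values = np.stack([random_state(n_qubits, rng) for _ in range(seq_len)])

# Complex-valued similarity: the full overlap <q|k>, which retains both
# amplitude and phase, unlike the fidelity |<q|k>|^2, which discards phase.
S = np.array([[np.vdot(q, k) for k in keys] for q in queries])

# One simple way to turn complex similarities into mixing weights:
# softmax over magnitudes, with the phases re-attached as complex factors.
mag = np.exp(np.abs(S))
W = mag / mag.sum(axis=1, keepdims=True)
coeffs = W * np.exp(1j * np.angle(S))   # complex attention weights

out = coeffs @ values                                    # weighted combination
out = out / np.linalg.norm(out, axis=1, keepdims=True)   # renormalize states
print(out.shape)                                         # (4, 8)
```

On quantum hardware this linear combination of states would be realized by a circuit construction such as the CLCUs the paper proposes; the sketch only shows what quantity the attention weights encode.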


Quantum Mixed-State Self-Attention Network

arXiv.org Artificial Intelligence

The rapid advancement of quantum computing has increasingly highlighted its potential in machine learning, particularly for natural language processing (NLP) tasks. Quantum machine learning (QML) leverages the unique capabilities of quantum computing to offer novel perspectives and methodologies for complex data processing and pattern recognition challenges. This paper introduces the Quantum Mixed-State Self-Attention Network (QMSAN), which integrates the principles of quantum computing with classical machine learning algorithms, especially self-attention networks, to enhance efficiency and effectiveness in handling NLP tasks. The QMSAN model employs a quantum attention mechanism based on mixed states, enabling efficient direct estimation of the similarity between queries and keys within the quantum domain and leading to more effective attention weight acquisition. Additionally, we propose an innovative quantum positional encoding scheme, implemented through fixed quantum gates within the quantum circuit, to enhance the model's accuracy. Experimental validation on various datasets demonstrates that QMSAN outperforms existing quantum and classical models in text classification, achieving significant performance improvements. QMSAN not only significantly reduces the number of parameters but also exceeds classical self-attention networks in performance, showcasing its strong capability in data representation and information extraction. Furthermore, our study investigates the model's robustness in different quantum noise environments, showing that QMSAN remains commendably robust under low noise.
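
To make "similarity between mixed states" concrete, the sketch below builds two mixed states by depolarizing random pure states and computes the overlap Tr(rho_q rho_k), a standard density-matrix similarity that quantum circuits can estimate directly (e.g., via swap-test-style routines). Whether QMSAN estimates exactly this quantity is not stated in the abstract, so treat that choice, and the depolarizing-noise construction, as assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4  # dimension of a 2-qubit state

def random_pure(d, rng):
    """A random pure state as a normalized complex vector."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

def depolarize(rho, p):
    """Blend a state with the maximally mixed state I/d to make it mixed."""
    return (1 - p) * rho + p * np.eye(len(rho)) / len(rho)

psi_q, psi_k = random_pure(d, rng), random_pure(d, rng)
rho_q = depolarize(np.outer(psi_q, psi_q.conj()), 0.2)  # mixed query state
rho_k = depolarize(np.outer(psi_k, psi_k.conj()), 0.2)  # mixed key state

# Tr(rho_q rho_k): an overlap between density matrices, the kind of
# similarity a quantum circuit can estimate without full tomography.
sim = np.real(np.trace(rho_q @ rho_k))
print(f"mixed-state similarity: {sim:.4f}")
```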


Deep Joint Source-Channel Coding for Efficient and Reliable Cross-Technology Communication

arXiv.org Artificial Intelligence

Cross-technology communication (CTC) is a promising technique that enables direct communications among incompatible wireless technologies without needing hardware modification. However, it has not been widely adopted in real-world applications due to its inefficiency and unreliability. To address this issue, this paper proposes a deep joint source-channel coding (DJSCC) scheme to enable efficient and reliable CTC. The proposed scheme builds a neural-network-based encoder and decoder at the sender side and the receiver side, respectively, to achieve two critical tasks simultaneously: 1) compressing the messages to the point where only their essential semantic meanings are preserved; 2) ensuring the robustness of the semantic meanings when they are transmitted across incompatible technologies. The scheme incorporates existing CTC coding algorithms as domain knowledge to guide the encoder-decoder pair to learn the characteristics of CTC links better. Moreover, the scheme constructs shared semantic knowledge for the encoder and decoder, allowing semantic meanings to be converted into very few bits for cross-technology transmissions, thus further improving the efficiency of CTC. Extensive simulations verify that the proposed scheme can reduce the transmission overhead by up to 97.63% and increase the structural similarity index measure by up to 734.78%, compared with the state-of-the-art CTC scheme.
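
The essential pipeline is encoder, unreliable channel, decoder, trained end to end so that compression and channel robustness are handled jointly. In the paper both ends are trained neural networks; in the minimal sketch below they are replaced by fixed random linear maps purely to show the data flow, and the additive-noise channel is a stand-in for a real CTC link, so all names and the channel model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy DJSCC pipeline: encoder -> noisy cross-technology link -> decoder.
msg_dim, code_dim = 32, 8          # compress 32 features into 8 channel symbols
W_enc = rng.normal(size=(code_dim, msg_dim)) / np.sqrt(msg_dim)
W_dec = np.linalg.pinv(W_enc)      # stand-in for the learned decoder network

def channel(x, snr_db=10.0):
    """Additive-noise model of a lossy cross-technology link."""
    noise_power = np.mean(x**2) / (10 ** (snr_db / 10))
    return x + rng.normal(scale=np.sqrt(noise_power), size=x.shape)

msg = rng.normal(size=msg_dim)     # "semantic" feature vector to transmit
recv = W_dec @ channel(W_enc @ msg)

# With untrained linear maps the reconstruction error is large; joint
# training of encoder and decoder is what drives this error down.
print(np.mean((msg - recv) ** 2))
```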


Quantum Generative Diffusion Model

arXiv.org Artificial Intelligence

This paper introduces the Quantum Generative Diffusion Model (QGDM), a fully quantum-mechanical model for generating quantum state ensembles, inspired by Denoising Diffusion Probabilistic Models. QGDM features a diffusion process that introduces timestep-dependent noise into quantum states, paired with a denoising mechanism trained to reverse this contamination. This model efficiently evolves a completely mixed state into a target quantum state post-training. Our comparative analysis with Quantum Generative Adversarial Networks demonstrates QGDM's superiority, with fidelity metrics exceeding 0.99 in numerical simulations involving up to 4 qubits. Additionally, we present a Resource-Efficient version of QGDM (RE-QGDM), which minimizes the need for auxiliary qubits while maintaining impressive generative capabilities for tasks involving up to 8 qubits. These results showcase the proposed models' potential for tackling challenging quantum generation problems.
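The forward half of such a model gradually destroys a target state with timestep-dependent noise until only the completely mixed state remains, mirroring how classical DDPMs diffuse data toward an easy-to-sample prior. The sketch below uses a generic depolarizing channel with a linear noise schedule to show this forward process; QGDM's actual noise channel and schedule are design choices of the paper, so this is an assumption-laden illustration, and the trained denoiser that reverses it is omitted.

```python
import numpy as np

def depolarizing_step(rho, gamma):
    """One forward-diffusion step: partially depolarize the state."""
    d = len(rho)
    return (1 - gamma) * rho + gamma * np.eye(d) / d

rng = np.random.default_rng(3)
d = 4
v = rng.normal(size=d) + 1j * rng.normal(size=d)
v /= np.linalg.norm(v)
rho = np.outer(v, v.conj())        # target pure state to be diffused

# Timestep-dependent noise schedule pushing rho toward I/d.
for t, gamma in enumerate(np.linspace(0.05, 0.5, 10), start=1):
    rho = depolarizing_step(rho, gamma)
    purity = np.real(np.trace(rho @ rho))
    print(f"t={t:2d}  purity={purity:.3f}")  # 1.0 = pure, 1/d = fully mixed
```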


A Sparse Cross Attention-based Graph Convolution Network with Auxiliary Information Awareness for Traffic Flow Prediction

arXiv.org Artificial Intelligence

Deep graph convolution networks (GCNs) have recently shown excellent performance in traffic prediction tasks. However, they face some challenges. First, few existing models consider the influence of auxiliary information, e.g., weather and holidays, which may result in a poor grasp of the spatial-temporal dynamics of traffic data. Second, both the construction of a dynamic adjacency matrix and regular graph convolution operations have quadratic computational complexity, which restricts the scalability of GCN-based models. To address these challenges, this work proposes a deep encoder-decoder model entitled AIMSAN. It contains an auxiliary information-aware module (AIM) and a sparse cross attention-based graph convolution network (SAN). The former learns multi-attribute auxiliary information and obtains embedded representations of it at different time-window sizes. The latter uses a cross-attention mechanism to construct dynamic adjacency matrices by fusing traffic data and embedded auxiliary data. Then, SAN applies diffusion GCN to the traffic data to mine rich spatial-temporal dynamics. Furthermore, AIMSAN exploits the spatial sparseness of traffic nodes to reduce the quadratic computational complexity. Experimental results on three public traffic datasets demonstrate that the proposed method outperforms its counterparts on various performance indices. Specifically, the proposed method is competitive with state-of-the-art algorithms while saving 35.74% of GPU memory usage, 42.25% of training time, and 45.51% of validation time on average.
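
The two mechanisms described above, cross-attention between traffic and auxiliary embeddings to build a dynamic adjacency matrix, then sparsification to tame the quadratic cost, can be sketched compactly. Below, top-k row pruning stands in for AIMSAN's sparseness-aware design and a single attention-weighted mixing step stands in for diffusion GCN; both simplifications, and all array names, are my own assumptions rather than the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(4)
n_nodes, d = 6, 16

traffic = rng.normal(size=(n_nodes, d))   # per-node traffic embeddings
aux     = rng.normal(size=(n_nodes, d))   # embedded weather/holiday features

# Cross-attention between traffic (queries) and auxiliary data (keys)
# yields a data-dependent dense adjacency matrix.
scores = traffic @ aux.T / np.sqrt(d)
A = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# Sparsify: keep only the top-k neighbors per node, echoing the use of
# spatial sparseness to cut the quadratic cost.
k = 2
drop_idx = np.argsort(A, axis=1)[:, :-k]      # indices of the smallest entries
A_sparse = A.copy()
np.put_along_axis(A_sparse, drop_idx, 0.0, axis=1)
A_sparse /= A_sparse.sum(axis=1, keepdims=True)  # renormalize rows

out = A_sparse @ traffic                  # one graph-convolution-like mixing step
print(out.shape)                          # (6, 16)
```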


MODMA dataset: a Multi-modal Open Dataset for Mental-disorder Analysis

arXiv.org Artificial Intelligence

According to the World Health Organization, the number of patients with mental disorders, especially depression, has grown rapidly, and depression has become a leading contributor to the global burden of disease. However, the current common practice of depression diagnosis is based on interviews and clinical rating scales administered by doctors, which is both labor-intensive and time-consuming. One important reason is the lack of physiological indicators for mental disorders. With the rise of tools such as data mining and artificial intelligence, using physiological data to explore potential physiological indicators of mental disorders and to create new applications for mental disorder diagnosis has become a hot research topic. However, good-quality physiological data from patients with mental disorders are hard to acquire. We present a multi-modal open dataset for mental-disorder analysis. The dataset includes EEG and audio data from clinically depressed patients and matched normal controls. All our patients were carefully diagnosed and selected by professional psychiatrists in hospitals. The EEG dataset includes not only data collected with a traditional 128-electrode elastic cap but also data from a novel wearable 3-electrode EEG collector designed for pervasive applications. The 128-electrode EEG signals of 53 subjects were recorded both in the resting state and under stimulation; the 3-electrode EEG signals of 55 subjects were recorded in the resting state; the audio data of 52 subjects were recorded during interviewing, reading, and picture description. We encourage other researchers in the field to use the dataset for testing their methods of mental-disorder analysis.
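
Since the dataset spans several modalities, subject groups, and recording conditions, a small metadata record makes its structure easier to work with in code. The sketch below is one hypothetical way to organize recordings: the class, field names, values, and file paths are all my own inventions for illustration, not the dataset's official schema.

```python
from dataclasses import dataclass

@dataclass
class ModmaRecording:
    """Hypothetical metadata record for one MODMA recording."""
    subject_id: str
    group: str       # "depressed" or "control"
    modality: str    # "eeg_128", "eeg_3", or "audio"
    condition: str   # e.g. "resting", "stimulation", "interview"
    path: str        # illustrative path, not a real dataset file

recordings = [
    ModmaRecording("S001", "depressed", "eeg_128", "resting",
                   "eeg128/S001_rest.raw"),
    ModmaRecording("S001", "depressed", "audio", "interview",
                   "audio/S001_interview.wav"),
]

# Filtering by modality keeps analysis code for each data type separate.
eeg_only = [r for r in recordings if r.modality.startswith("eeg")]
print([r.path for r in eeg_only])
```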