Ran, Shi-Ju
Universal scaling laws in quantum-probabilistic machine learning by tensor network towards interpreting representation and generalization powers
Bai, Sheng-Chen, Ran, Shi-Ju
Interpreting the representation and generalization powers has been a long-standing issue in the field of machine learning (ML) and artificial intelligence. This work contributes to uncovering the emergence of universal scaling laws in quantum-probabilistic ML. We take the generative tensor network (GTN) in the form of a matrix product state as an example and show that with an untrained GTN (such as a random TN state), the negative logarithmic likelihood (NLL) $L$ generally increases linearly with the number of features $M$, i.e., $L \simeq k M + const$. This is a consequence of the so-called ``catastrophe of orthogonality,'' which states that quantum many-body states tend to become exponentially orthogonal to each other as $M$ increases. We reveal that while gaining information through training, the linear scaling law is suppressed by a negative quadratic correction, leading to $L \simeq \beta M - \alpha M^2 + const$. The scaling coefficients exhibit logarithmic relationships with the number of training samples and the number of quantum channels $\chi$. The emergence of the quadratic correction term in NLL for the testing (training) set can be regarded as evidence of the generalization (representation) power of GTN. Over-parameterization can be identified by the deviation in the values of $\alpha$ between training and testing sets while increasing $\chi$. We further investigate how orthogonality in the quantum feature map relates to the satisfaction of quantum probabilistic interpretation, as well as to the representation and generalization powers of GTN. The unveiling of universal scaling laws in quantum-probabilistic ML would be a valuable step toward establishing a white-box ML scheme interpreted within the quantum probabilistic framework.
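The linear law $L \simeq k M + const$ for an untrained state can be illustrated with a deliberately simplified toy. The sketch below is not the GTN of the abstract: it assumes a random product state in place of a random matrix product state, a cosine/sine feature map on features in $[0, 1]$, and uniformly random samples. All function names are hypothetical. Because the overlap of product states factorizes over sites, the NLL is a sum of $M$ per-site terms and grows linearly with $M$. In this toy the slope averages to $2\ln 2$ over random angles.

```python
import math
import random


def toy_nll(num_features, num_samples=200, seed=0):
    """Average NLL of random samples under a random product state (toy model)."""
    rng = random.Random(seed)
    # One random real qubit state (cos t, sin t) per feature: an "untrained" state.
    thetas = [rng.uniform(0.0, math.pi) for _ in range(num_features)]
    total = 0.0
    for _ in range(num_samples):
        for theta in thetas:
            x = rng.random()  # assumed feature value in [0, 1)
            # Overlap of the feature state (cos(pi x / 2), sin(pi x / 2)) with the site state.
            overlap = math.cos(0.5 * math.pi * x - theta)
            total -= math.log(max(overlap * overlap, 1e-300))
    return total / num_samples


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


feature_counts = [50, 100, 150, 200]
nll_values = [toy_nll(m) for m in feature_counts]
print("estimated slope k:", fit_slope(feature_counts, nll_values))
```

The toy captures only the linear term; the negative quadratic correction reported in the abstract arises from training and is not modeled here.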
Universal replication of chaotic characteristics by classical and quantum machine learning
Bai, Sheng-Chen, Ran, Shi-Ju
Replicating chaotic characteristics of non-linear dynamics by machine learning (ML) has recently drawn wide attention. In this work, we propose that an ML model, trained to predict the state one step ahead from the several latest historic states, can accurately replicate the bifurcation diagram and the Lyapunov exponents of discrete dynamic systems. The characteristics for different values of the hyper-parameters are captured universally by a single ML model, whereas previous works trained the ML model independently for each fixed value of the hyper-parameters. Our benchmarks on the one- and two-dimensional Logistic maps show that a variational quantum circuit can reproduce the long-term characteristics with higher accuracy than the long short-term memory (a well-recognized classical ML model). Our work reveals an essential difference between ML for chaotic characteristics and ML for standard tasks, from the perspective of the relation between performance and model complexity. Our results suggest that the quantum circuit model exhibits potential advantages in mitigating over-fitting and in achieving higher accuracy and stability.
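As a reference point for the quantities named above, the following minimal sketch computes the Lyapunov exponent of the one-dimensional Logistic map $x_{t+1} = r x_t (1 - x_t)$ directly from the exact map, with no ML model involved. The function name, the initial value, and the iteration counts are assumptions chosen for illustration.

```python
import math


def logistic_lyapunov(r, x0=0.2, burn_in=1000, steps=20000):
    """Estimate the Lyapunov exponent of x -> r * x * (1 - x)."""
    x = x0
    # Discard the transient so the average is taken on the attractor.
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(steps):
        # |f'(x)| = |r * (1 - 2x)|; the floor guards against log(0).
        derivative = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(derivative, 1e-300))
        x = r * x * (1.0 - x)
    return total / steps


for r in (2.5, 3.2, 3.9):
    print(f"r = {r}: Lyapunov exponent ~ {logistic_lyapunov(r):.4f}")
```

A negative exponent indicates a stable fixed point or cycle, while a positive one indicates chaos. Sweeping $r$ over a grid gives the curve that a learned model would be compared against.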
Tensor networks for interpretable and efficient quantum-inspired machine learning
Ran, Shi-Ju, Su, Gang
It is a critical challenge to simultaneously achieve high interpretability and efficiency with the current schemes of deep machine learning (ML). The tensor network (TN), a well-established mathematical tool originating from quantum mechanics, has shown unique advantages in developing efficient ``white-box'' ML schemes. Here, we give a brief review of the inspiring progress made in TN-based ML. On one hand, the interpretability of TN ML rests on a solid theoretical foundation based on quantum information and many-body physics. On the other hand, high efficiency follows from the powerful TN representations and the advanced computational techniques developed in quantum many-body physics. With the fast development of quantum computers, TN is expected to give rise to novel schemes runnable on quantum hardware, heading towards ``quantum artificial intelligence'' in the near future.
Persistent Ballistic Entanglement Spreading with Optimal Control in Quantum Spin Chains
Lu, Ying, Shi, Pei, Wang, Xiao-Han, Hu, Jie, Ran, Shi-Ju
Entanglement propagation provides a key route to understanding quantum many-body dynamics in and out of equilibrium. In this work, we uncover that the ``variational entanglement-enhancing'' field (VEEF) robustly induces a persistent ballistic spreading of entanglement in quantum spin chains. The VEEF is time dependent, and is optimally controlled to maximize the bipartite entanglement entropy (EE) of the final state. The resulting linear growth persists until the EE reaches the genuine saturation $\tilde{S} = - \log_{2} 2^{-\frac{N}{2}}=\frac{N}{2}$, with $N$ the total number of spins. The EE satisfies $S(t) = v t$ for times $t \leq \frac{N}{2v}$, with $v$ the velocity. These results are in sharp contrast with the behavior without VEEF, where the EE generally approaches a sub-saturation value known as the Page value $\tilde{S}_{P} =\tilde{S} - \frac{1}{2\ln{2}}$ in the long-time limit, and the entanglement growth deviates from being linear before the Page value is reached. The dependence of the velocity on the interactions is explored, with $v \simeq 2.76$, $4.98$, and $5.75$ for the spin chains with Ising, XY, and Heisenberg interactions, respectively. We further show that nonlinear growth of the EE emerges in the presence of long-range interactions.
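The formulas quoted above can be evaluated directly. The sketch below computes the saturation value $\tilde{S} = N/2$, the Page value $\tilde{S}_{P} = \tilde{S} - 1/(2\ln 2)$, and the saturation time $N/(2v)$ for the three quoted velocities. The chain length $N = 10$ and the function names are hypothetical, chosen only to make the arithmetic concrete.

```python
import math


def saturation_entropy(num_spins):
    """Genuine saturation of the half-chain EE in bits: N / 2."""
    return num_spins / 2.0


def page_value(num_spins):
    """Page value in bits: N / 2 - 1 / (2 ln 2)."""
    return saturation_entropy(num_spins) - 1.0 / (2.0 * math.log(2.0))


def saturation_time(num_spins, velocity):
    """Time at which S(t) = v t reaches N / 2."""
    return num_spins / (2.0 * velocity)


# Velocities quoted in the abstract; N = 10 is an assumed example size.
velocities = {"Ising": 2.76, "XY": 4.98, "Heisenberg": 5.75}
num_spins = 10

print("saturation:", saturation_entropy(num_spins))
print("Page value:", round(page_value(num_spins), 4))
for name, v in velocities.items():
    print(name, "saturation time:", round(saturation_time(num_spins, v), 4))
```

The gap between the two entropies is $1/(2\ln 2) \approx 0.72$ bits, independent of $N$.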
Quantum compiling with a variational instruction set for accurate and fast quantum computing
Lu, Ying, Zhou, Peng-Fei, Fei, Shao-Ming, Ran, Shi-Ju
The quantum instruction set (QIS) is defined as the set of quantum gates that are physically realizable by controlling the qubits in quantum hardware. Compiling quantum circuits into products of the gates in a properly defined QIS is a fundamental step in quantum computing. We here propose the quantum variational instruction set (QuVIS), formed by flexibly designed multi-qubit gates, for higher speed and accuracy of quantum computing. The control of the qubits for realizing the gates in a QuVIS is achieved variationally using the fine-grained time optimization algorithm. Significant reductions in both the error accumulation and the time cost are demonstrated in realizing the swaps of multiple qubits and quantum Fourier transformations, compared with compiling by a standard QIS such as the quantum microinstruction set (QuMIS, formed by several one- and two-qubit gates including one-qubit rotations and controlled-NOT gates). With the same requirement on quantum hardware, the time cost for QuVIS is reduced to less than one half of that for QuMIS. Simultaneously, the error is suppressed algebraically as the depth of the compiled circuit is reduced. As a general compiling approach with high flexibility and efficiency, QuVIS can be defined for different quantum circuits and adapted to quantum hardware with different interactions.
Compressing neural network by tensor network with exponentially fewer variational parameters
Qing, Yong, Zhou, Peng-Fei, Li, Ke, Ran, Shi-Ju
A neural network (NN) designed for challenging machine learning tasks is in general a highly nonlinear mapping that contains a massive number of variational parameters. The high complexity of an NN, if unbounded or unconstrained, might unpredictably cause severe issues including over-fitting, loss of generalization power, and unbearable hardware cost. In this work, we propose a general compression scheme that significantly reduces the number of variational parameters of an NN by encoding them into multi-layer tensor networks (TN's) that contain exponentially fewer free parameters. The superior compression performance of our scheme is demonstrated on several widely recognized NN's (FC-2, LeNet-5, and VGG-16) and datasets (MNIST and CIFAR-10), surpassing the state-of-the-art method based on shallow tensor networks. For instance, about 10 million parameters in the three convolutional layers of VGG-16 are compressed into TN's with just $632$ parameters, while the testing accuracy on CIFAR-10 is surprisingly improved from $81.14\%$ by the original NN to $84.36\%$ after compression. Our work suggests TN as an exceptionally efficient mathematical structure for representing the variational parameters of NN's, one that exploits compressibility better than simple multi-way arrays.
Intelligent diagnostic scheme for lung cancer screening with Raman spectra data by tensor network machine learning
An, Yu-Jia, Bai, Sheng-Chen, Cheng, Lin, Li, Xiao-Guang, Wang, Cheng-en, Han, Xiao-Dong, Su, Gang, Ran, Shi-Ju, Wang, Cong
Artificial intelligence (AI) has had a tremendous impact on the biomedical sciences, from academic research to clinical applications such as biomarker detection and diagnosis, optimization of treatment, and identification of new therapeutic targets in drug discovery. However, contemporary AI technologies, particularly deep machine learning (ML), suffer severely from non-interpretability, which might uncontrollably lead to incorrect predictions. Interpretability is particularly crucial to ML for clinical diagnosis, as the consumers must gain a necessary sense of security and trust from firm grounds or convincing interpretations. In this work, we propose a tensor-network (TN)-ML method to reliably predict lung cancer patients and their stages by screening Raman spectra data of volatile organic compounds (VOCs) in exhaled breath, which are generally suitable as biomarkers and are considered an ideal basis for non-invasive lung cancer screening. The prediction of TN-ML is based on the mutual distances of the breath samples mapped to the quantum Hilbert space. Thanks to the quantum probabilistic interpretation, the certainty of the predictions can be quantitatively characterized. The accuracy on the samples with high certainty is almost 100$\%$. The incorrectly classified samples exhibit obviously lower certainty, and thus can be identified as anomalies, which will be handled by human experts to guarantee high reliability. Our work sheds light on shifting ``AI for biomedical sciences'' from the conventional non-interpretable ML schemes to interpretable human-ML interactive approaches, for the purpose of high accuracy and reliability.
Residual Matrix Product State for Machine Learning
Meng, Ye-Ming, Zhang, Jing, Zhang, Peng, Gao, Chao, Ran, Shi-Ju
The tensor network, which originates from quantum physics, is emerging as an efficient tool for classical and quantum machine learning. Nevertheless, there still exists a considerable accuracy gap between tensor networks and the sophisticated neural network models for classical machine learning. In this work, we combine the ideas of the matrix product state (MPS), the simplest tensor network structure, and the residual neural network, and propose the residual matrix product state (ResMPS). The ResMPS can be treated as a network whose layers map the "hidden" features to the outputs (e.g., classifications), and the variational parameters of the layers are functions of the features of the samples (e.g., pixels of images). This differs from a neural network, where the layers map the features to the output in a feed-forward manner. The ResMPS can be equipped with non-linear activations and dropout layers, and outperforms the state-of-the-art tensor network models in terms of efficiency, stability, and expression power. Besides, ResMPS is interpretable from the perspective of polynomial expansion, where the factorization and exponential machines naturally emerge. Our work contributes to connecting and hybridizing neural and tensor networks, which is crucial to further enhancing our understanding of the working mechanisms and improving the performance of both models.
Quantum Compressed Sensing with Unsupervised Tensor Network Machine Learning
Ran, Shi-Ju, Sun, Zheng-Zhi, Fei, Shao-Ming, Su, Gang, Lewenstein, Maciej
We propose tensor-network compressed sensing (TNCS) for compressing and communicating classical information via quantum states trained by unsupervised tensor network (TN) machine learning. The main task of TNCS is to reconstruct the full classical information from a generative TN state as accurately as possible, while knowing as small a part of the classical information as possible. In applications to the datasets of hand-written digits and fashion images, we train the generative TN (a matrix product state) on the training set, and show that the images in the testing set can be reconstructed from a small number of pixels. Related issues, including the applications of TNCS to quantum encrypted communication, are discussed.
Generative Tensor Network Classification Model for Supervised Machine Learning
Sun, Zheng-Zhi, Peng, Cheng, Liu, Ding, Ran, Shi-Ju, Su, Gang
The tensor network (TN) has recently triggered extensive interest in developing machine-learning models in the quantum many-body Hilbert space. Here we propose a generative TN classification (GTNC) approach for supervised learning. The strategy is to train a generative TN for each class of samples to construct the classifiers. The classification is implemented by comparing the distances in the many-body Hilbert space. The numerical experiments with GTNC show impressive performance on the MNIST and Fashion-MNIST datasets. The testing accuracy is competitive with the state-of-the-art convolutional neural network and higher than those of the naive Bayes classifier (a generative classifier) and the support vector machine. Moreover, GTNC is more efficient than the existing TN models, which are in general discriminative. By investigating the distances in the many-body Hilbert space, we find that (a) the samples are naturally clustered in such a space; and (b) bounding the bond dimensions of the TN's to finite values corresponds to removing redundant information in the image recognition. These two characteristics make GTNC an adaptive and universal model of excellent performance.
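The strategy of one generative state per class, followed by a comparison in Hilbert space, can be sketched with a heavily simplified stand-in. The code below is not GTNC itself. It assumes product states (bond dimension one) in place of trained generative TN's, a cosine/sine feature map on features in $[0, 1]$, a per-site normalized mean as the "training" rule, and the log-fidelity as the closeness measure. All names and the tiny dataset are hypothetical.

```python
import math


def feature_map(x):
    """Map a feature in [0, 1] to a real qubit state (assumed feature map)."""
    return (math.cos(0.5 * math.pi * x), math.sin(0.5 * math.pi * x))


def train_product_state(samples):
    """Per-site normalized mean of mapped features (stand-in for a trained TN)."""
    state = []
    for site in range(len(samples[0])):
        c = sum(feature_map(s[site])[0] for s in samples)
        d = sum(feature_map(s[site])[1] for s in samples)
        norm = math.hypot(c, d)
        state.append((c / norm, d / norm))
    return state


def log_fidelity(sample, state):
    """Log of the squared overlap between the mapped sample and a product state."""
    total = 0.0
    for x, (a, b) in zip(sample, state):
        c, s = feature_map(x)
        total += math.log(max((a * c + b * s) ** 2, 1e-300))
    return total


def classify(sample, class_states):
    """Pick the class whose generative state is closest (largest fidelity)."""
    return max(class_states, key=lambda label: log_fidelity(sample, class_states[label]))


# Hypothetical three-pixel "images" for two classes.
training = {
    "dark": [[0.1, 0.0, 0.2], [0.0, 0.1, 0.1]],
    "bright": [[0.9, 1.0, 0.8], [1.0, 0.9, 0.9]],
}
class_states = {label: train_product_state(s) for label, s in training.items()}
print(classify([0.05, 0.1, 0.0], class_states))
```

A product state carries no correlations between features, so this toy only conveys the decision rule; the bond dimension of the TN is what allows GTNC to capture such correlations.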