Plinge, Axel
Guided-SPSA: Simultaneous Perturbation Stochastic Approximation assisted by the Parameter Shift Rule
Periyasamy, Maniraman, Plinge, Axel, Mutschler, Christopher, Scherer, Daniel D., Mauerer, Wolfgang
The study of variational quantum circuits (VQCs) has received significant attention from the quantum computing community in recent years. These hybrid algorithms, utilizing both classical and quantum components, are well-suited for noisy intermediate-scale quantum (NISQ) devices. Though estimating exact gradients with the parameter-shift rule to optimize VQCs is realizable on NISQ devices, the approach does not scale well to larger problem sizes: the number of circuit evaluations required for gradient estimation grows linearly with the number of parameters in the VQC. On the other hand, techniques that approximate the gradients of VQCs, such as the simultaneous perturbation stochastic approximation (SPSA), do not scale with the number of parameters but struggle with instability and often attain suboptimal solutions. In this work, we introduce a novel gradient estimation approach called Guided-SPSA, which meaningfully combines the parameter-shift rule and SPSA-based gradient approximation. Guided-SPSA reduces the number of circuit evaluations required during training by 15% to 25% while finding solutions of similar or better quality than the parameter-shift rule. Guided-SPSA outperforms standard SPSA in all scenarios and outperforms the parameter-shift rule in scenarios such as suboptimal initialization of the parameters. We demonstrate numerically the performance of Guided-SPSA on different paradigms of quantum machine learning, such as regression, classification, and reinforcement learning.
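To make the idea concrete, here is a minimal NumPy sketch of such a blended estimator. The choice of which parameters receive exact shifts, and the number k, are illustrative placeholders; the paper's actual scheme for splitting and annealing the ratio differs.

```python
import numpy as np

def parameter_shift_grad(f, theta, idx, shift=np.pi / 2):
    """Exact partial derivative of f at theta[idx] via the parameter-shift rule."""
    e = np.zeros_like(theta)
    e[idx] = shift
    return (f(theta + e) - f(theta - e)) / (2 * np.sin(shift))

def spsa_grad(f, theta, eps=0.1, rng=np.random.default_rng()):
    """SPSA: two circuit evaluations approximate the full gradient vector."""
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    return (f(theta + eps * delta) - f(theta - eps * delta)) / (2 * eps) * delta

def guided_spsa_grad(f, theta, k, rng=np.random.default_rng()):
    """Blended estimate: exact shifts for k parameters, SPSA for the rest.
    (Illustrative split; the paper anneals this ratio during training.)"""
    grad = spsa_grad(f, theta, rng=rng)
    for i in rng.choice(len(theta), size=k, replace=False):
        grad[i] = parameter_shift_grad(f, theta, i)
    return grad
```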
Warm-Start Variational Quantum Policy Iteration
Meyer, Nico, Murauer, Jakob, Popov, Alexander, Ufrecht, Christian, Plinge, Axel, Mutschler, Christopher, Scherer, Daniel D.
Reinforcement learning is a powerful framework for determining optimal behavior in highly complex decision-making scenarios. This objective can be achieved using policy iteration, which requires solving a typically large linear system of equations. We propose the variational quantum policy iteration (VarQPI) algorithm, realizing this step with a NISQ-compatible quantum-enhanced subroutine. Its scalability is supported by an analysis of the structure of generic reinforcement learning environments, laying the foundation for potential quantum advantage with utility-scale quantum computers. Furthermore, we introduce the warm-start initialization variant (WS-VarQPI), which significantly reduces the resource overhead. The algorithm solves a large FrozenLake environment with an underlying 256x256-dimensional linear system, indicating its practical robustness.
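For orientation, policy evaluation is exactly the linear-system step that VarQPI replaces with a variational quantum subroutine. Below is a compact classical reference sketch, assuming a tabular environment with transition tensor P of shape (actions, states, states) and reward matrix r of shape (states, actions); these conventions are illustrative, not taken from the paper.

```python
import numpy as np

def policy_evaluation(P_pi, r_pi, gamma=0.9):
    """Solve (I - gamma * P_pi) v = r_pi; VarQPI realizes this solve with a
    NISQ-compatible variational subroutine, warm-started in WS-VarQPI."""
    return np.linalg.solve(np.eye(P_pi.shape[0]) - gamma * P_pi, r_pi)

def policy_iteration(P, r, gamma=0.9):
    """Classical policy iteration over a tabular MDP."""
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)
    while True:
        P_pi = P[policy, np.arange(n_states)]       # (S, S) rows under the policy
        r_pi = r[np.arange(n_states), policy]
        v = policy_evaluation(P_pi, r_pi, gamma)
        q = r + gamma * np.einsum("ast,t->sa", P, v)
        new_policy = q.argmax(axis=1)               # greedy improvement
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy
```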
Comprehensive Library of Variational LSE Solvers
Meyer, Nico, Röhn, Martin, Murauer, Jakob, Plinge, Axel, Mutschler, Christopher, Scherer, Daniel D.
Linear systems of equations can be found in various mathematical domains, as well as in the field of machine learning. By employing noisy intermediate-scale quantum devices, variational solvers promise to accelerate finding solutions for large systems. Although there is a wealth of theoretical research on these algorithms, only fragmentary implementations exist. To fill this gap, we have developed the variational-lse-solver framework, which realizes existing approaches from the literature and introduces several enhancements. The user-friendly interface is designed for researchers who work at the abstraction level of identifying and developing end-to-end applications.
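As a point of reference, variational LSE solvers of this kind typically minimize a global cost that vanishes exactly when A|x(θ)⟩ is proportional to |b⟩. A classical NumPy sketch of that cost, assuming a normalized b; in the variational setting the overlaps are estimated on the quantum device rather than computed directly:

```python
import numpy as np

def global_lse_cost(A, b, x):
    """Cost 1 - |<b|A|x>|^2 / <x|A^dag A|x>: zero iff A|x> is parallel to |b>.
    In a variational solver, |x> = |x(theta)> is prepared by a parameterized
    circuit and this quantity is estimated from measurements."""
    Ax = A @ x
    return 1.0 - np.abs(np.vdot(b, Ax)) ** 2 / np.vdot(Ax, Ax).real
```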
Qiskit-Torch-Module: Fast Prototyping of Quantum Neural Networks
Meyer, Nico, Ufrecht, Christian, Periyasamy, Maniraman, Plinge, Axel, Mutschler, Christopher, Scherer, Daniel D., Maier, Andreas
Quantum computer simulation software is an integral tool for the research efforts in the quantum computing community. An important aspect is the efficiency of respective frameworks, especially for training variational quantum algorithms. Focusing on the widely used Qiskit software environment, we develop the qiskit-torch-module. It improves runtime performance by two orders of magnitude over comparable libraries, while facilitating low-overhead integration with existing codebases. Moreover, the framework provides advanced tools for integrating quantum neural networks with PyTorch. The pipeline is tailored for single-machine compute systems, which constitute a widely employed setup in day-to-day research efforts.
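The kind of workflow such an integration enables can be illustrated with a toy torch.nn.Module whose forward pass evaluates an analytically simulated single-qubit circuit, so PyTorch's autograd and optimizers apply directly. This is a generic illustration, not the qiskit-torch-module API:

```python
import torch

class ToyQuantumLayer(torch.nn.Module):
    """Stand-in for a quantum neural network layer: a single-qubit circuit of
    alternating RY(x) data encodings and trainable RY(w) rotations, simulated
    analytically as <Z> = cos(sum of all rotation angles)."""
    def __init__(self, n_layers=3):
        super().__init__()
        self.weights = torch.nn.Parameter(0.1 * torch.randn(n_layers))

    def forward(self, x):
        angle = torch.zeros_like(x)
        for w in self.weights:
            angle = angle + x + w
        return torch.cos(angle)

model = torch.nn.Sequential(ToyQuantumLayer(), torch.nn.Linear(1, 1))
x = torch.linspace(-1, 1, 8).reshape(-1, 1)
loss = torch.nn.functional.mse_loss(model(x), torch.sin(3 * x))
loss.backward()  # gradients flow through the 'quantum' layer as usual
```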
C-MCTS: Safe Planning with Monte Carlo Tree Search
Parthasarathy, Dinesh, Kontes, Georgios, Plinge, Axel, Mutschler, Christopher
The Constrained Markov Decision Process (CMDP) formulation makes it possible to solve safety-critical decision-making tasks that are subject to constraints. While CMDPs have been extensively studied in the reinforcement learning literature, little attention has been given to sampling-based planning algorithms such as Monte Carlo tree search (MCTS) for solving them. Previous approaches act conservatively with respect to costs, as they avoid constraint violations by relying on Monte Carlo cost estimates that suffer from high variance. We propose Constrained MCTS (C-MCTS), which estimates cost using a safety critic that is trained with temporal-difference learning in an offline phase prior to agent deployment. During deployment, the critic limits exploration by pruning unsafe trajectories within MCTS. C-MCTS satisfies cost constraints while operating closer to the constraint boundary, achieving higher rewards than previous work. As a byproduct, the planner is also more efficient with respect to the number of planning steps. Most importantly, under model mismatch between the planner and the real world, C-MCTS is less susceptible to cost violations than previous work.
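The pruning mechanism at the heart of C-MCTS can be summarized in a few lines; the critic below is an arbitrary placeholder callable, standing in for the TD-trained safety critic from the paper:

```python
def safe_actions(state, actions, cost_critic, cost_budget):
    """Keep only actions whose estimated cost-to-go fits the remaining budget;
    within MCTS, branches failing this check are pruned before simulation."""
    return [a for a in actions if cost_critic(state, a) <= cost_budget]

# Toy usage with a placeholder critic: costs 0.0, 0.6, 1.2, 1.8 for actions 0..3.
critic = lambda state, action: 0.6 * action
print(safe_actions(0, [0, 1, 2, 3], critic, cost_budget=1.0))  # -> [0, 1]
```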
Quantum Natural Policy Gradients: Towards Sample-Efficient Reinforcement Learning
Meyer, Nico, Scherer, Daniel D., Plinge, Axel, Mutschler, Christopher, Hartmann, Michael J.
Reinforcement learning is a growing field of AI in which intelligent behavior is learned automatically through trial and error in interaction with the environment. However, this learning process is often costly. Using variational quantum circuits as function approximators can potentially reduce this cost. To this end, we propose the quantum natural policy gradient (QNPG) algorithm -- a second-order gradient-based routine that takes advantage of an efficient approximation of the quantum Fisher information matrix. We experimentally demonstrate that QNPG outperforms first-order gradient-based training on contextual bandits environments regarding convergence speed and stability, and moreover reduces the sample complexity. Furthermore, we provide evidence for the practical feasibility of our approach by training on a 12-qubit hardware device.
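The core update behind such a second-order routine is the natural-gradient step with the quantum Fisher information in place of the classical one; a small NumPy sketch (the damping term is a common stabilization choice, not specific to the paper):

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=0.1, damping=1e-3):
    """theta <- theta + lr * F^{-1} grad, with Tikhonov damping for numerical
    stability. In QNPG, `fisher` is an efficiently computable approximation of
    the quantum Fisher information matrix of the parameterized circuit."""
    F = fisher + damping * np.eye(len(theta))
    return theta + lr * np.linalg.solve(F, grad)
```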
Quantum Policy Gradient Algorithm with Optimized Action Decoding
Meyer, Nico, Scherer, Daniel D., Plinge, Axel, Mutschler, Christopher, Hartmann, Michael J.
Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose a specific action decoding procedure for a quantum policy gradient approach. We introduce a novel quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.
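To illustrate what such a post-processing step looks like, the sketch below estimates a policy from measured bitstrings via a decoding function; the parity decoding is one simple 'global' choice, whereas the paper optimizes this mapping with the proposed quality measure:

```python
import numpy as np

def policy_from_samples(bitstrings, n_actions, decode):
    """Estimate pi(a|s) from measurement samples; `decode` maps a bitstring
    to an action index and is the classical post-processing being optimized."""
    counts = np.zeros(n_actions)
    for b in bitstrings:
        counts[decode(b)] += 1
    return counts / counts.sum()

# Example: two actions, decoded by the parity of all measured qubits.
parity = lambda bits: sum(bits) % 2
samples = [(0, 1, 1), (1, 1, 1), (0, 0, 0), (1, 0, 1)]
print(policy_from_samples(samples, n_actions=2, decode=parity))  # -> [0.75 0.25]
```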
Batch Quantum Reinforcement Learning
Periyasamy, Maniraman, Hölle, Marc, Wiedmann, Marco, Scherer, Daniel D., Plinge, Axel, Mutschler, Christopher
Training deep reinforcement learning (DRL) agents is often a time-consuming process, as a large number of samples and environment interactions are required. This effect is even amplified in the case of batch RL, where the agent is trained without environment interactions, solely based on a set of previously collected data. Novel approaches based on quantum computing suggest an advantage over classical approaches in terms of sample efficiency. To investigate this advantage, we propose a batch RL algorithm that leverages variational quantum circuits (VQCs) as function approximators within the discrete batch-constrained deep Q-learning (BCQ) algorithm. Additionally, we present a novel data re-uploading scheme based on cyclically shifting the order of the input variables in the data-encoding layers. We show the efficiency of our algorithm on the OpenAI Gym CartPole environment and compare its performance to classical neural-network-based discrete BCQ.
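The cyclic shifting scheme itself is straightforward to state in code; a minimal sketch of the reordering applied per encoding layer (the surrounding circuit construction is omitted):

```python
import numpy as np

def cyclic_encoding_orders(x, n_layers):
    """Input order per data-encoding layer, cyclically shifted so that every
    feature is encoded at every qubit position across the layers."""
    return [np.roll(x, -layer) for layer in range(n_layers)]

# e.g. [x0, x1, x2] over 3 layers -> [x0,x1,x2], [x1,x2,x0], [x2,x0,x1]
print(cyclic_encoding_orders(np.array([0, 1, 2]), n_layers=3))
```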
An Empirical Comparison of Optimizers for Quantum Machine Learning with SPSA-based Gradients
Wiedmann, Marco, Hölle, Marc, Periyasamy, Maniraman, Meyer, Nico, Ufrecht, Christian, Scherer, Daniel D., Plinge, Axel, Mutschler, Christopher
Variational quantum algorithms (VQAs) have attracted a lot of attention from the quantum computing community over the last few years. Their hybrid quantum-classical nature with relatively shallow quantum circuits makes them a promising platform for demonstrating the capabilities of NISQ devices. Although the classical machine learning community focuses on gradient-based parameter optimization, finding near-exact gradients for variational quantum circuits (VQCs) with the parameter-shift rule introduces a large sampling overhead. Therefore, gradient-free optimizers have gained popularity in quantum machine learning circles. Among the most promising candidates is the simultaneous perturbation stochastic approximation (SPSA) algorithm, due to its low computational cost and inherent noise resilience. We introduce a novel approach that uses the approximated gradient from SPSA in combination with state-of-the-art gradient-based classical optimizers. We demonstrate numerically that this outperforms both standard SPSA and the parameter-shift rule in terms of convergence rate and absolute error in simple regression tasks. The improvement of our novel approach over SPSA with stochastic gradient descent is even amplified when shot and hardware noise are taken into account. We also demonstrate that error mitigation does not significantly affect our results.
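The combination can be sketched compactly: an SPSA gradient estimate (two circuit evaluations) is fed into a standard classical optimizer, here Adam as one representative; the hyperparameters are generic defaults, not tuned values from the paper:

```python
import numpy as np

def spsa_grad(f, theta, eps=0.1, rng=np.random.default_rng()):
    """Two evaluations of the cost f yield a stochastic full-gradient estimate."""
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    return (f(theta + eps * delta) - f(theta - eps * delta)) / (2 * eps) * delta

def adam_step(theta, grad, state, lr=0.01, b1=0.9, b2=0.999, tol=1e-8):
    """One Adam update driven by the SPSA estimate instead of exact gradients."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat, v_hat = m / (1 - b1 ** t), v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + tol), (m, v, t)

# Usage: state = (np.zeros_like(theta), np.zeros_like(theta), 0)
#        theta, state = adam_step(theta, spsa_grad(cost, theta), state)
```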
A Survey on Quantum Reinforcement Learning
Meyer, Nico, Ufrecht, Christian, Periyasamy, Maniraman, Scherer, Daniel D., Plinge, Axel, Mutschler, Christopher
With recent advances in the fabrication and control of hardware for quantum information processing, the possibilities of merging quantum computing (QC) with machine learning (ML) have received a huge amount of attention within the growing research community. Reinforcement learning (RL) is the third learning paradigm besides supervised and unsupervised learning. In this survey article, we provide an overview of so-called quantum reinforcement learning (QRL) algorithms. We understand these as quantum-assisted approaches that solve a particular task, be it classical or quantum in nature, by employing quantum resources, either in simulation or in experiment. In order to keep this contribution as self-contained as possible, we provide the necessary background before venturing into the QRL literature. We start out with a brief recap of the essentials of the RL paradigm in the fully classical setting in Sec. 2. Further, in Sec. 3 we provide a quick introduction to QC and variational quantum circuits (VQCs). Readers familiar with either of these topics may safely skip the respective sections. In Sec. 4 we turn our attention to the emerging field of QRL, starting with a quick overview of the literature.