Recent advancements in quantum computing have driven the scientific community's quest to solve a certain class of complex problems for which quantum computers would be better suited than traditional supercomputers. To improve the efficiency with which quantum computers can solve these problems, scientists are investigating the use of artificial intelligence approaches. In a new study, scientists at the U.S. Department of Energy's (DOE) Argonne National Laboratory have developed a new algorithm based on reinforcement learning to find the optimal parameters for the Quantum Approximate Optimization Algorithm (QAOA), which allows a quantum computer to solve certain combinatorial problems such as those that arise in materials design, chemistry and wireless communications. "It's a bit like having a self-driving car in traffic; the algorithm can detect when it needs to make adjustments in the'dials' it uses to do the computation." "Combinatorial optimization problems are those for which the solution space gets exponentially larger as you expand the number of decision variables," said Argonne computer scientist Prasanna Balaprakash.
Quantum computing is a computational paradigm with the potential to outperform classical methods for a variety of problems. Proposed recently, the Quantum Approximate Optimization Algorithm (QAOA) is considered as one of the leading candidates for demonstrating quantum advantage in the near term. QAOA is a variational hybrid quantum-classical algorithm for approximately solving combinatorial optimization problems. The quality of the solution obtained by QAOA for a given problem instance depends on the performance of the classical optimizer used to optimize the variational parameters. In this paper, we formulate the problem of finding optimal QAOA parameters as a learning task in which the knowledge gained from solving training instances can be leveraged to find high-quality solutions for unseen test instances. To this end, we develop two machine-learning-based approaches. Our first approach adopts a reinforcement learning (RL) framework to learn a policy network to optimize QAOA circuits. Our second approach adopts a kernel density estimation (KDE) technique to learn a generative model of optimal QAOA parameters. In both approaches, the training procedure is performed on small-sized problem instances that can be simulated on a classical computer; yet the learned RL policy and the generative model can be used to efficiently solve larger problems. Extensive simulations using the IBM Qiskit Aer quantum circuit simulator demonstrate that our proposed RL- and KDE-based approaches reduce the optimality gap by factors up to 30.15 when compared with other commonly used off-the-shelf optimizers.
Quantum computing exploits basic quantum phenomena such as state superposition and entanglement to perform computations. The Quantum Approximate Optimization Algorithm (QAOA) is arguably one of the leading quantum algorithms that can outperform classical state-of-the-art methods in the near term. QAOA is a hybrid quantum-classical algorithm that combines a parameterized quantum state evolution with a classical optimization routine to approximately solve combinatorial problems. The quality of the solution obtained by QAOA within a fixed budget of calls to the quantum computer depends on the performance of the classical optimization routine used to optimize the variational parameters. In this work, we propose an approach based on reinforcement learning (RL) to train a policy network that can be used to quickly find high-quality variational parameters for unseen combinatorial problem instances. The RL agent is trained on small problem instances which can be simulated on a classical computer, yet the learned RL policy is generalizable and can be used to efficiently solve larger instances. Extensive simulations using the IBM Qiskit Aer quantum circuit simulator demonstrate that our trained RL policy can reduce the optimality gap by a factor up to 8.61 compared with other off-the-shelf optimizers tested.
In recent years, quantum devices have become available that enable researchers--for the first time--to use real quantum hardware to begin to solve scientific problems. However, in the near term, the number and quality of qubits (the basic unit of quantum information) for quantum computers are expected to remain limited, making it difficult to use these machines for practical applications. A hybrid quantum and classical approach may be the answer to tackling this problem with existing quantum hardware. Researchers at the U.S. Department of Energy's (DOE) Argonne National Laboratory and Los Alamos National Laboratory, along with researchers at Clemson University and Fujitsu Laboratories of America, have developed hybrid algorithms to run on quantum machines and have demonstrated them for practical applications using IBM quantum computers (see below for description of Argonne's role in the IBM Q Hub at Oak Ridge National Laboratory [ORNL]) and a D-Wave quantum computer. "This approach will enable researchers to use near-term quantum computers to solve applications that support the DOE mission. For example, it can be applied to find community structures in metabolic networks or a microbiome," says Yuri Alexeev, principal project specialist, Computational Science division The team's work is presented in an article entitled "A Hybrid Approach for Solving Optimization Problems on Small Quantum Computers" that appears in the June 2019 issue of the Institute of Electrical and Electronics Engineers (IEEE) Computer Magazine.
Quantum hardware and quantum-inspired algorithms are becoming increasingly popular for combinatorial optimization. However, these algorithms may require careful hyperparameter tuning for each problem instance. We use a reinforcement learning agent in conjunction with a quantum-inspired algorithm to solve the Ising energy minimization problem, which is equivalent to the Maximum Cut problem. The agent controls the algorithm by tuning one of its parameters with the goal of improving recently seen solutions. We propose a new Rescaled Ranked Reward (R3) method that enables stable single-player version of self-play training that helps the agent to escape local optima. The training on any problem instance can be accelerated by applying transfer learning from an agent trained on randomly generated problems. Our approach allows sampling high-quality solutions to the Ising problem with high probability and outperforms both baseline heuristics and a black-box hyperparameter optimization approach.