HCQA: Hybrid Classical-Quantum Agent for Generating Optimal Quantum Sensor Circuits

Alomari, Ahmad, Kumar, Sathish A. P.

arXiv.org Artificial Intelligence

Abstract--This study proposes an HCQA for designing optimal Quantum Sensor Circuits (QSCs) to address complex quantum physics problems. The HCQA integrates computational intelligence techniques by leveraging a Deep Q-Network (DQN) for learning and policy optimization, enhanced by a quantum-based action selection mechanism driven by the Q-values. Measurement of the circuit yields probabilistic action outcomes, allowing the agent to generate optimal QSCs by selecting sequences of gates that maximize the Quantum Fisher Information (QFI) while minimizing the number of gates. This computational intelligence-driven HCQA enables the automated generation of entangled quantum states, specifically squeezed states, with high QFI sensitivity for quantum state estimation and control. This work highlights the synergy between AI-driven learning and quantum computation, illustrating how intelligent agents can autonomously discover optimal quantum circuit designs for enhanced sensing and estimation tasks.

Impact Statement--The HCQA introduces a hybrid AI-quantum framework for generating optimal QSCs, contributing to foundational advances in quantum metrology and intelligent quantum control. By integrating a DQN with quantum-based action selection, the HCQA learns to construct quantum circuits that achieve high QFI with reduced gate complexity. This approach demonstrates how reinforcement learning can guide quantum circuit synthesis in a goal-directed, data-efficient manner. While this work is demonstrated on a simplified two-qubit, noise-free simulation, it provides a proof of concept for how intelligent agents can autonomously learn and optimize QSCs. Technologically, this contributes to the growing field of Quantum Reinforcement Learning (QRL) and supports future exploration of scalable, noise-resilient extensions.
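The objective described above, maximizing QFI while penalizing gate count, can be made concrete for pure states. A minimal illustrative sketch, not the paper's implementation: for a pure state under a phase rotation exp(-i*theta*G), the QFI reduces to four times the variance of the generator G. The generator choice (a collective J_z), the GHZ-like test state, and the `gate_penalty` reward shaping below are all assumptions for illustration.

```python
import numpy as np

# Illustrative sketch (not the paper's code): for a pure state |psi> and
# phase generator G, QFI = 4 * (<G^2> - <G>^2).
def qfi_pure_state(psi, G):
    """QFI of pure state psi w.r.t. a rotation exp(-i * theta * G)."""
    mean_G = np.vdot(psi, G @ psi).real
    mean_G2 = np.vdot(psi, G @ G @ psi).real
    return 4.0 * (mean_G2 - mean_G**2)

# Two-qubit collective spin generator J_z = (Z ⊗ I + I ⊗ Z) / 2
Z = np.diag([1.0, -1.0])
I = np.eye(2)
Jz = 0.5 * (np.kron(Z, I) + np.kron(I, Z))

# Entangled GHZ-like state (|00> + |11>)/sqrt(2) -- maximal QFI under Jz
psi = np.zeros(4)
psi[0] = 1.0 / np.sqrt(2)
psi[3] = 1.0 / np.sqrt(2)
qfi = qfi_pure_state(psi, Jz)  # 4.0 for this state and generator

def reward(qfi_value, n_gates, gate_penalty=0.1):
    # Hypothetical reward shaping: high QFI, fewer gates.
    return qfi_value - gate_penalty * n_gates
```

A product state such as |00> would give QFI = 0 under this generator, so a reward of this form pushes the agent toward entangling gate sequences.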


GPA: Grover Policy Agent for Generating Optimal Quantum Sensor Circuits

Alomari, Ahmad, Kumar, Sathish A. P.

arXiv.org Artificial Intelligence

This study proposes a GPA for designing optimal Quantum Sensor Circuits (QSCs) to address complex quantum physics problems. The GPA consists of two parts: the Quantum Policy Evaluation (QPE) and the Quantum Policy Improvement (QPI). The QPE performs phase estimation to generate the search space, while the QPI utilizes Grover search and amplitude amplification techniques to efficiently identify an optimal policy that generates optimal QSCs. The GPA generates QSCs by selecting sequences of gates that maximize the Quantum Fisher Information (QFI) while minimizing the number of gates. The QSCs generated by the GPA are capable of producing entangled quantum states, specifically squeezed states. High QFI indicates increased sensitivity to parameter changes, making the circuit useful for quantum state estimation and control tasks. Evaluation of the GPA on a QSC that consists of two qubits and a sequence of R_x, R_y, and S gates demonstrates its efficiency in generating optimal QSCs with a QFI of 1. Compared to existing quantum agents, the GPA achieves higher QFI with fewer gates, demonstrating a more efficient and scalable approach to the design of QSCs. This work illustrates the potential computational power of quantum agents for solving quantum physics problems.
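The amplitude amplification step the abstract attributes to the QPI can be illustrated with a small classical simulation of Grover's algorithm. This is a generic sketch, not the paper's implementation: the "marked" index stands in for the optimal policy, and the oracle is assumed to be given.

```python
import numpy as np

# Generic Grover-search sketch (illustrative, not the GPA's actual circuit):
# amplify the amplitude of one marked index in a uniform superposition.
def grover_search(n_items, marked, n_iters=None):
    state = np.full(n_items, 1.0 / np.sqrt(n_items))  # uniform superposition
    if n_iters is None:
        # Optimal iteration count ~ (pi/4) * sqrt(N)
        n_iters = int(np.floor(np.pi / 4 * np.sqrt(n_items)))
    for _ in range(n_iters):
        state[marked] *= -1.0              # oracle: phase-flip marked item
        state = 2.0 * state.mean() - state  # diffusion: reflect about mean
    return state

# 16 candidate policies, policy 3 marked as optimal (hypothetical indices)
amps = grover_search(16, marked=3)
```

After roughly (pi/4)*sqrt(16) = 3 iterations, the marked policy's measurement probability rises from 1/16 to above 0.95, which is the quadratic speedup over exhaustive policy search that the abstract alludes to.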


Emphatic Algorithms for Deep Reinforcement Learning

Jiang, Ray, Zahavy, Tom, Xu, Zhongwen, White, Adam, Hessel, Matteo, Blundell, Charles, van Hasselt, Hado

arXiv.org Machine Learning

Off-policy learning allows us to learn about possible policies of behavior from experience generated by a different behavior policy. Temporal difference (TD) learning algorithms can become unstable when combined with function approximation and off-policy sampling - this is known as the ''deadly triad''. The emphatic temporal difference (ETD($\lambda$)) algorithm ensures convergence in the linear case by appropriately weighting the TD($\lambda$) updates. In this paper, we extend the use of emphatic methods to deep reinforcement learning agents. We show that naively adapting ETD($\lambda$) to popular deep reinforcement learning algorithms, which use forward view multi-step returns, results in poor performance. We then derive new emphatic algorithms for use in the context of such algorithms, and we demonstrate that they provide noticeable benefits in small problems designed to highlight the instability of TD methods. Finally, we observe improved performance when applying these algorithms at scale on classic Atari games from the Arcade Learning Environment.
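The emphatic weighting mentioned above can be sketched for the linear case. A minimal single-step sketch of ETD($\lambda$), assuming a linear value function w @ phi; variable names (`F` for the follow-on trace, `M` for the emphasis) follow common convention but the function and its defaults are illustrative, not the paper's code.

```python
import numpy as np

# Illustrative linear ETD(lambda) step: the follow-on trace F accumulates
# importance-weighted history, and the emphasis M reweights the TD update.
def etd_step(w, e, F, phi, phi_next, reward, rho, rho_prev,
             gamma=0.99, lam=0.9, alpha=0.01, interest=1.0):
    """One emphatic TD(lambda) update for a linear value function w @ phi."""
    F = rho_prev * gamma * F + interest           # follow-on trace
    M = lam * interest + (1.0 - lam) * F          # emphasis
    e = rho * (gamma * lam * e + M * phi)         # emphatic eligibility trace
    delta = reward + gamma * w @ phi_next - w @ phi  # TD error
    w = w + alpha * delta * e
    return w, e, F

# One on-policy step (rho = 1) from zero-initialized traces and weights
w, e, F = etd_step(np.zeros(2), np.zeros(2), 0.0,
                   np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                   reward=1.0, rho=1.0, rho_prev=1.0)
```

With on-policy data (rho = 1) the emphasis starts at M = 1 and the update reduces to ordinary TD($\lambda$); it is only under off-policy sampling (rho != 1) that F reweights states by how much the target policy visits them, which is what restores convergence in the deadly-triad setting.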