References [1] Qiskit: An open-source framework for quantum computing, 2019

Feb-10-2026, 02:59:28 GMT–Neural Information Processing Systems

If the threshold ξ was never reached during an entire episode of placing L gates, a reward of −5 is issued. The extreme reward values ±5 are crucial for the performance of the agent. Given this figure of merit, a circuit with a smaller number of gates yields a higher discounted sum of rewards. This could be achieved, e.g., by using automated postprocessing methods to optimize the circuits (e.g., a Qiskit Terra transpiler [1]). For instance, in all cases we analyzed, the vast majority of rotation gates used by the agent are RY gates.
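The link between gate count and discounted return can be illustrated with a small sketch. It is not the authors' implementation: the discount factor `gamma`, the helper names, and the simplification that every step before the final one earns zero reward are assumptions made only for illustration. The only ingredient taken from the text is a terminal reward of magnitude 5.

```python
def discounted_return(rewards, gamma=0.9):
    """Return the sum of gamma**t * r_t over one episode.

    gamma=0.9 is a hypothetical discount factor, not a value from the source.
    """
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


def episode_rewards(num_gates, terminal_reward=5.0):
    """Build a toy reward sequence for an episode that places num_gates gates.

    Assumption: intermediate steps earn zero reward and only the last
    gate placement earns the terminal reward.
    """
    return [0.0] * (num_gates - 1) + [terminal_reward]


# Two successful episodes that reach the threshold with different gate counts.
short_return = discounted_return(episode_rewards(4))
long_return = discounted_return(episode_rewards(10))

print(f"4 gates:  {short_return:.3f}")
print(f"10 gates: {long_return:.3f}")
```

Under these assumptions the terminal reward is discounted by `gamma**(num_gates - 1)`, so the episode that succeeds with fewer gates ends with the larger discounted sum.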

