Quantum Reinforcement Learning by Adaptive Non-local Observables