Quantum Natural Policy Gradients: Towards Sample-Efficient Reinforcement Learning