Learning when to observe: A frugal reinforcement learning framework for a high-cost world