Reinforcement Learningwith State Observation Costsin Action-Contingent Noiselessly Observable Markov Decision Processes