Beyond Optimism: Exploration With Partially Observable Rewards