Learning Policies in Partially Observable MDPs with Abstract Actions Using Value Iteration