Reinforcement Learning of Flexible Policies for Symbolic Instructions with Adjustable Mapping Specifications