Efficient Reinforcement Learning of Task Planners for Robotic Palletization through Iterative Action Masking Learning