Fitted Q-iteration in continuous action-space MDPs