A pragmatic policy learning approach to account for users' fatigue in repeated auctions