Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards
