Reviews: Regret Bounds for Online Portfolio Selection with a Cardinality Constraint

Neural Information Processing Systems 

Summary The paper studies the online portfolio selection problem under cardinality constraints and provides two algorithms that achieve sublinear regret. One algorithm handles the full information setting and the other algorithm handles the bandit feedback setting. Furthermore, the paper provides lower bounds for both the full information and bandit feedback settings. The approach that both algorithms take is to split the problem into two learning problems. One problem is to learn the optimal combination of assets and the other problem is to learn the optimal portfolio. To learn the optimal combination of assets a version of either the multiplicative weights algorithm (full information) or exp3 (bandit feedback) is used.