Reinforcement Learning
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper focuses on l_1-regularized multi-task feature RL, integrating multi-task feature learning (MTFL) with Fitted Q-learning. Clarity: The paper is mostly well written. Regarding the format, the font size is not right. A suggestion on the nuclear norm: it is usually denoted ||\cdot||_*, whereas the paper denotes it ||\cdot||_1. There is a mistake in Assumption 5. Judging from the context, I think line 291 is right and line 299 is mistakenly written, and thus the formulations in Equations (5) and (6) are wrong: U should be U^{-1}.
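For reference, the notational point about the nuclear norm can be made concrete with the standard definitions. The matrix symbol W and the singular values \sigma_i(W) are illustrative names and are not taken from the reviewed paper. The subscript 1 on a matrix norm is commonly read as the entrywise l_1 norm, which is a different quantity from the nuclear norm and is presumably why the reviewer prefers the subscript *.

```latex
% Nuclear (trace) norm: sum of the singular values of W
\|W\|_{*} = \sum_{i=1}^{\operatorname{rank}(W)} \sigma_i(W)

% Entrywise l_1 norm: sum of the absolute values of the entries of W
\|W\|_{1} = \sum_{i,j} |W_{ij}|
```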
Bayesian Optimization for Iterative Learning Vu Nguyen
The performance of deep (reinforcement) learning systems crucially depends on the choice of hyperparameters. Their tuning is notoriously expensive, typically requiring an iterative training process to run for numerous steps to convergence. Traditional tuning algorithms only consider the final performance of hyperparameters acquired after many expensive iterations and ignore intermediate information from earlier training steps. In this paper, we present a Bayesian optimization (BO) approach which exploits the iterative structure of learning algorithms for efficient hyperparameter tuning. We propose to learn an evaluation function compressing learning progress at any stage of the training process into a single numeric score according to both training success and stability. Our BO framework then balances the benefit of assessing a hyperparameter setting over additional training steps against the computation cost. We further increase model efficiency by selectively including scores from different training steps for any evaluated hyperparameter set. We demonstrate the efficiency of our algorithm by tuning hyperparameters for the training of deep reinforcement learning agents and convolutional neural networks. Our algorithm outperforms all existing baselines in identifying optimal hyperparameters in minimal time.
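The two ideas in the abstract, compressing a learning curve into one score and weighing expected benefit against training cost, can be illustrated with a small sketch. It is not the paper's implementation. The logistic weighting, the gain-per-cost ratio, and every name in it (`logistic_weights`, `compress_curve`, `cost_aware_pick`, `midpoint`, `growth`) are assumptions chosen for illustration. In particular, the weighting here is fixed by hand, whereas the abstract says the evaluation function is learned.

```python
import math


def logistic_weights(n_steps, midpoint, growth):
    """Hypothetical weighting: later training steps receive larger weights."""
    return [1.0 / (1.0 + math.exp(-growth * (t - midpoint)))
            for t in range(n_steps)]


def compress_curve(rewards, midpoint=None, growth=1.0):
    """Compress a learning curve into a single weighted-average score.

    Assumption: a curve that ends high and stays high should score better
    than one that peaks early and collapses, so late steps count more.
    """
    n = len(rewards)
    if n == 0:
        raise ValueError("rewards must contain at least one step")
    if midpoint is None:
        midpoint = n / 2.0
    weights = logistic_weights(n, midpoint, growth)
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)


def cost_aware_pick(candidates):
    """Pick the candidate with the best expected gain per unit of cost.

    candidates: list of (name, expected_gain, cost) tuples with cost > 0.
    The ratio is an illustrative stand-in for a cost-aware acquisition rule.
    """
    return max(candidates, key=lambda c: c[1] / c[2])


if __name__ == "__main__":
    rising = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
    falling = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
    print(compress_curve(rising), compress_curve(falling))
    print(cost_aware_pick([("long_run", 1.0, 10.0), ("short_run", 0.5, 2.0)]))
```

With these weights, a curve that improves late scores higher than one with the same values in reverse order, and the cheaper run is preferred whenever its gain per unit of cost is larger.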