Review for NeurIPS paper: On Warm-Starting Neural Network Training
Neural Information Processing Systems
Weaknesses: The paper evaluates only on CIFAR and SVHN, and I worry that the phenomenon may not extend to other methods and tasks. Warm-starting, in the context of the authors' problem setup, seems to be essentially the same as fine-tuning with more data. The phenomenon does not seem to occur on more sophisticated computer-vision tasks, where fine-tuning from datasets like ImageNet leads to similar or better performance with much faster convergence. Although the label space differs in many fine-tuning setups, one can imagine extending the existing setup to cover common and more realistic problems. The paper is written to motivate re-using weights in the continual/online learning setting, but splitting a dataset into two sets (training on one, then fine-tuning on both) seems to me a somewhat toy-like and unconventional continual learning setting. In online/continual learning there is a distribution shift as new data arrives, but here the dataset appears to be split at random, meaning that in expectation the two sets have the same distribution.
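To make the last point concrete, here is a minimal sketch of the difference between a random split and a split with distribution shift. It assumes synthetic, balanced labels over 10 classes rather than the actual CIFAR/SVHN data, and all helper names (`random_split`, `class_ordered_split`, `total_variation`) are hypothetical illustrations, not anything from the paper. It compares the class distributions of the two halves using total variation distance.

```python
import random
from collections import Counter


def class_fractions(labels, num_classes):
    """Return the empirical fraction of each class in `labels`."""
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in range(num_classes)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def random_split(labels, rng):
    """Split labels into two halves uniformly at random (assumed to mirror the paper's setup)."""
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    half = len(idx) // 2
    return [labels[i] for i in idx[:half]], [labels[i] for i in idx[half:]]


def class_ordered_split(labels):
    """Split so that each half sees different classes (a hypothetical shifted setting)."""
    ordered = sorted(labels)
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]


NUM_CLASSES = 10
rng = random.Random(0)
# Synthetic balanced labels; a stand-in for a real dataset's label vector.
labels = [i % NUM_CLASSES for i in range(10000)]

first, second = random_split(labels, rng)
tv_random = total_variation(
    class_fractions(first, NUM_CLASSES), class_fractions(second, NUM_CLASSES)
)

first_s, second_s = class_ordered_split(labels)
tv_shift = total_variation(
    class_fractions(first_s, NUM_CLASSES), class_fractions(second_s, NUM_CLASSES)
)

print(f"TV distance, random split:        {tv_random:.3f}")
print(f"TV distance, class-ordered split: {tv_shift:.3f}")
```

Under these assumptions the random split should give two halves with nearly identical class distributions, while the class-ordered split gives disjoint ones. An evaluation closer to the continual learning setting would need something like the latter, or a shift in the inputs themselves.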
Jan-22-2025, 20:48:46 GMT