Reviews: Visualizing the PHATE of Neural Networks
–Neural Information Processing Systems
Update after author response: Taking on faith the results the authors report in their author response (namely ability to identify generalization performance using only the training set, results on CIFAR10 and white noise datasets, and the quantitative evaluation of the task-switching), I would raise my score to a 6 (actually if they did achieve everything they claimed in the author response, I would be inclined to give it a 7, but I'd need to see all the results for that). Originality: I think the originality is fairly high. Although the PHATE algorithm exists in the literature, the Multislice kernel is novel, and the idea of visualizing the learning dynamics of the hidden neurons to ascertain things like catastrophic forgetting or poor generalization is (to my knowledge) novel. Quality: I think the Experiments sections could be substantially improved: (1) For the experiments on continual learning, from looking at Figure 3 it is not obvious to me that Adagrad does better than Rehearsal for the "Domain" learning setting, or that Adagrad outperforms Adam at class learning. Adam apparently does the best at task learning, but again, I wouldn't have guessed from the trajectories.
Neural Information Processing Systems
Jan-26-2025, 14:52:53 GMT