Review for NeurIPS paper: Supervised Contrastive Learning
–Neural Information Processing Systems
Weaknesses: I have some concerns regarding the training cost. Because the proposed method trains on a "multiviewed batch" that is 2x the size of the standard batch used with the cross-entropy loss, its training cost is 2x that of the baseline. Spending more compute (together with hyperparameter tuning) could benefit the baselines as well. The results would be more convincing if the comparison were performed under a similar compute budget, e.g. by training the proposed method for half as many epochs as the baseline. Although the paper claims state-of-the-art performance, this is largely due to a well-tuned baseline setting with AutoAugment, a large number (700) of epochs, cosine LR decay (mentioned only in the supplementary material, and it is not clear whether it is used in the baseline), etc.
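To make the compute-matching suggestion concrete, the arithmetic can be sketched as below. This is only an illustration under the assumption stated above that cost scales linearly with the number of views processed per sample; the function names, the dataset size, and the use of 700 as the baseline epoch count are hypothetical choices for the example, not details confirmed by the paper.

```python
def sample_passes(epochs: int, dataset_size: int, views_per_sample: int = 1) -> int:
    """Total forward/backward sample passes, used here as a rough proxy for compute."""
    return epochs * dataset_size * views_per_sample


def compute_matched_epochs(baseline_epochs: int, views_per_sample: int = 2) -> int:
    """Epochs for a multiview method that match a single-view baseline's budget.

    Assumes cost is proportional to the number of views per sample (hypothetical
    simplification; it ignores projection heads, loss cost, and fine-tuning stages).
    """
    if baseline_epochs <= 0 or views_per_sample <= 0:
        raise ValueError("epochs and views_per_sample must be positive")
    return baseline_epochs // views_per_sample


# Illustrative numbers only: a 700-epoch single-view baseline and two views per sample.
baseline_epochs = 700
matched_epochs = compute_matched_epochs(baseline_epochs, views_per_sample=2)
dataset_size = 50_000  # hypothetical dataset size
baseline_cost = sample_passes(baseline_epochs, dataset_size, views_per_sample=1)
multiview_cost = sample_passes(matched_epochs, dataset_size, views_per_sample=2)
```

Under this simplification, halving the epoch count for the two-view method gives it the same number of sample passes as the single-view baseline, which is the kind of budget-matched comparison requested above.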
Feb-6-2025, 21:47:34 GMT