

BYOL





Neural Information Processing Systems

First, a random patch of the image is selected and resized to 224 × 224 with a random horizontal flip, followed by a color distortion, consisting of a random sequence of brightness, contrast, saturation, and hue adjustments, and an optional grayscale conversion. Finally, Gaussian blur and solarization are applied to the patches.

Optimization. We use the LARS optimizer [70] with a cosine decay learning rate schedule [71], without restarts, over 1000 epochs, with a warm-up period of 10 epochs. We set the base learning rate to 0.2, scaled linearly [72] with the batch size (LearningRate = 0.2 × BatchSize/256). For the target network, the exponential moving average parameter τ starts from τ_base = 0.996 and is increased to one during training.
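The learning-rate scaling rule and the target-network update above can be sketched as follows. The linear scaling formula is taken directly from the text; the cosine form of the τ schedule is an assumption (the text only gives the start value τ_base = 0.996 and the fact that τ increases to one), and the helper names are hypothetical.

```python
import math

def scaled_lr(batch_size, base_lr=0.2):
    # Linear scaling rule from the text: LearningRate = 0.2 * BatchSize / 256.
    return base_lr * batch_size / 256

def target_tau(step, total_steps, tau_base=0.996):
    # EMA coefficient for the target network: starts at tau_base and is
    # increased to one over training. The cosine ramp used here is an
    # assumption; the excerpt only states the endpoints.
    return 1.0 - (1.0 - tau_base) * (math.cos(math.pi * step / total_steps) + 1) / 2

def ema_update(target_params, online_params, tau):
    # Per-parameter update: target <- tau * target + (1 - tau) * online.
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]
```

For example, `scaled_lr(4096)` gives a base rate of 3.2, and `target_tau` returns 0.996 at step 0 and 1.0 at the final step.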





Collaborative Unlabeled Data Optimization

Shang, Xinyi, Sun, Peng, Liu, Fengyuan, Lin, Tao

arXiv.org Artificial Intelligence

This paper pioneers a novel data-centric paradigm to maximize the utility of unlabeled data, tackling a critical question: How can we enhance the efficiency and sustainability of deep learning training by optimizing the data itself? We begin by identifying three key limitations in existing model-centric approaches, all rooted in a shared bottleneck: knowledge extracted from data is locked to model parameters, hindering its reusability and scalability. To this end, we propose CoOpt, a highly efficient, parallelized framework for collaborative unlabeled data optimization, thereby effectively encoding knowledge into the data itself. By distributing unlabeled data and leveraging publicly available task-agnostic models, CoOpt facilitates scalable, reusable, and sustainable training pipelines. Extensive experiments across diverse datasets and architectures demonstrate its efficacy and efficiency, achieving 13.6% and 6.8% improvements on Tiny-ImageNet and ImageNet-1K, respectively, with training speedups of 1.94× and 1.2×.


Unsupervised Object-Level Representation Learning from Scene Images

Neural Information Processing Systems

Our key insight is to leverage image-level self-supervised pre-training as the prior to discover object-level semantic correspondence, thus realizing object-level representation learning from scene images.


A ImageNet-Texture

Neural Information Processing Systems

See Figures 7 and 8 for examples of the ImageNet-Texture dataset and their counterparts in the original ImageNet dataset. Shape is often less well defined in these classes, for example in window screen and rapeseed.

B.1 Comparison of two ways to apply α in the NCE loss

Since the denominator normalizes the three kinds of pairs equally, we only pay attention to the numerator. Because of the exponential tail, it applies an exponentially larger weight to the negatives that are harder. Our patch-based augmentation is also closely related to some of the self-supervised learning methods that solve jigsaw puzzles as the pretext task. All of our models are trained on 4 GTX 1080 Ti GPUs.
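The exponential-tail effect described above can be illustrated with a minimal sketch. The excerpt does not show where α enters the loss, so this only demonstrates the stated claim that, because negative similarities enter through exp(s/T), harder negatives (higher similarity) receive exponentially larger weight; the helper name and temperature value are hypothetical.

```python
import math

def nce_negative_weights(neg_sims, temperature=0.07):
    # In an NCE/InfoNCE-style loss, each negative similarity s contributes
    # exp(s / T). A harder negative (larger s) therefore gets an
    # exponentially larger weight than an easier one.
    return [math.exp(s / temperature) for s in neg_sims]
```

For instance, with similarities 0.1 (easy) and 0.9 (hard) at temperature 0.07, the hard negative's weight exceeds the easy one's by a factor of exp(0.8/0.07), several orders of magnitude.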