percent top-1 accuracy
New DeepMind Approach 'Bootstraps' Self-Supervised Learning of Image Representations
The Cambridge Dictionary defines "bootstrap" as: "to improve your situation or become more successful, without help from others or without advantages that others have." While a machine learning algorithm's strength depends heavily on the quality of the data it is fed, an algorithm that can do the work required to improve itself should become even stronger. A team of researchers from DeepMind and Imperial College recently set out to prove that in the arena of computer vision. In the updated paper Bootstrap Your Own Latent – A New Approach to Self-Supervised Learning, the researchers release the source code and checkpoint for their new "BYOL" approach to self-supervised image representation learning, along with new theoretical and experimental insights. In computer vision, learning good image representations is critical because it allows for efficient training on downstream tasks. Image representation learning relies on neural networks that have been trained to produce such representations.
Facebook & Inria Propose High-Performance Self-Supervised Technique for CV Tasks
Researchers from Facebook and the French National Institute for Research in Digital Science and Technology (Inria) have developed a new technique for self-supervised training of convolutional networks used for image classification and other computer vision tasks. The proposed method surpasses supervised techniques on most transfer tasks and outperforms previous self-supervised approaches. "Our approach allows researchers to train efficient, high-performance image classification models with no annotations or metadata," the researchers write in a Facebook blog post. "More broadly, we believe that self-supervised learning is key to building more flexible and useful AI." Recent improvements in self-supervised training methods have established them as a serious alternative to traditional supervised training. Self-supervised approaches, however, are significantly slower to train than their supervised counterparts.
Google Brain's SimCLRv2 Achieves New SOTA in Semi-Supervised Learning
Following on the February release of its contrastive learning framework SimCLR, the same team of Google Brain researchers guided by Turing Award honouree Dr. Geoffrey Hinton has presented SimCLRv2, an upgraded approach that boosts the SOTA results by 21.6 percent. The updated framework takes the "unsupervised pretrain, supervised fine-tune" paradigm popular in natural language processing and applies it to image recognition. In the pretraining phase, the model learns from unlabelled data in a task-agnostic way, which means it has no prior classification knowledge. The researchers find that using a deep and wide neural network can be more label-efficient and greatly improve accuracy. Unlike SimCLR, whose largest model is ResNet-50, SimCLRv2's largest model is a 152-layer ResNet with channels that are three times wider and with selective kernels.
Billion-scale semi-supervised learning for state-of-the-art image and video classification
Accurate image and video classification is important for a wide range of computer vision applications, from identifying harmful content, to making products more accessible to the visually impaired, to helping people more easily buy and sell things on products like Marketplace. Facebook AI is developing alternative ways to train our AI systems so that we can do more with less labeled training data overall, and also deliver accurate results even when large, high-quality labeled data sets are simply not available. Today, we are sharing details on a versatile new model training technique that delivers state-of-the-art accuracy for image and video classification systems. This approach, which we call semi-weak supervision, is a new way to combine the merits of two different training methods: semi-supervised learning and weakly supervised learning. It opens the door to creating more accurate, efficient production classification models by using a teacher-student model training paradigm and billion-scale weakly supervised data sets.
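To make the teacher-student idea more concrete, here is a minimal Python sketch of one step such a pipeline might include: using a teacher's class scores on unlabeled images to pick the most confident examples per class as pseudo-labeled pretraining data for a student. The excerpt does not spell out the selection rule, so the top-k-per-class choice, the function names, and the toy scores below are all assumptions made for illustration, not Facebook AI's implementation.

```python
from collections import defaultdict


def select_top_k_per_class(teacher_scores, k):
    """Return {class_index: [example indices]} for the k best examples per class.

    teacher_scores is a list of per-example score lists (one score per class).
    It stands in for a teacher model's outputs on unlabeled images (assumption).
    """
    ranked = defaultdict(list)
    for example_idx, scores in enumerate(teacher_scores):
        for class_idx, score in enumerate(scores):
            ranked[class_idx].append((score, example_idx))

    selected = {}
    for class_idx, pairs in ranked.items():
        # Highest score first; ties are broken by the lower example index.
        pairs.sort(key=lambda pair: (-pair[0], pair[1]))
        selected[class_idx] = [example_idx for _, example_idx in pairs[:k]]
    return selected


def build_student_pretraining_set(teacher_scores, k):
    """Flatten the selection into sorted (example_index, pseudo_label) pairs."""
    selected = select_top_k_per_class(teacher_scores, k)
    return sorted(
        (example_idx, class_idx)
        for class_idx, example_indices in selected.items()
        for example_idx in example_indices
    )


# Hypothetical teacher scores for four unlabeled images and two classes.
toy_scores = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
    [0.3, 0.7],
]

pseudo_labeled = build_student_pretraining_set(toy_scores, k=1)
print(pseudo_labeled)
```

In a full pipeline under these assumptions, the student would first be trained on the pseudo-labeled pairs and then fine-tuned on the smaller set of human-labeled images. Note that an image can be selected for more than one class if it ranks in the top k for each.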