The success of deep learning over the last decade, particularly in computer vision, has depended greatly on large training data sets. Even though progress in this area has boosted performance on many tasks such as object detection, recognition, and segmentation, the main bottleneck for future improvement is the need for more labeled data. Self-supervised learning is among the best alternatives for learning useful representations from the data. In this article, we will briefly review the self-supervised learning methods in the literature and discuss the findings of a recent self-supervised learning paper from ICLR 2020. We may assume that most learning problems can be tackled by having clean labeling and more data obtained in an unsupervised way.
Here are the most tweeted papers that were uploaded onto arXiv during July 2020. Results are powered by Arxiv Sanity Preserver. Abstract: Massive language models are the core of modern NLP modeling and have been shown to encode impressive amounts of commonsense and factual information. However, that knowledge exists only within the latent parameters of the model, inaccessible to inspection and interpretation, and even worse, factual information memorized from the training corpora is likely to become stale as the world changes. Knowledge stored as parameters will also inevitably exhibit all of the biases inherent in the source materials.
Minirhizotron technology is widely used for studying the development of roots. Such systems collect visible-wavelength color imagery of plant roots in situ by scanning an imaging system within a clear tube driven into the soil. Automated analysis of root systems could facilitate new scientific discoveries that would be critical to address the world's pressing food, resource, and climate issues. A key component of automated analysis of plant roots from imagery is the automated pixel-level segmentation of roots from their surrounding soil. Supervised learning techniques appear to be an appropriate tool for the challenge due to varying local soil and root conditions. However, the lack of enough annotated training data is a major limitation due to the error-prone and time-consuming manual labeling process. In this paper, we investigate the use of deep neural networks based on the U-net architecture for automated, precise pixel-wise root segmentation in minirhizotron imagery.
Data scarcity is a perennial problem when applying deep learning (DL) to medical imaging. In vision tasks related to natural images, DL practitioners often have access to astoundingly large annotated data sets on which they can train. However, due to privacy concerns and the expense of creating them, access to large annotated data sets is rare in medical imaging. The natural follow-up question is: How can practitioners in the field of medical imaging best use DL given limited data? In this article, I'll discuss one approach to stretch the use of available data, called self-supervised learning.
Learning a sequence of tasks is a long-standing challenge in machine learning. This setting applies to learning systems that observe examples of a range of tasks at different points in time. A learning system should become more knowledgeable as more related tasks are learned. Although the problem of learning sequentially was acknowledged for the first time decades ago, the research in this area has been rather limited. Research in transfer learning, multitask learning, metalearning and deep learning has studied some challenges of these kinds of systems. Recent research in lifelong machine learning and continual learning has revived interest in this problem. We propose Proficiente, a full framework for long-term learning systems. Proficiente relies on knowledge transferred between hypotheses learned with Support Vector Machines. The first component of the framework is focused on transferring forward selectively from a set of existing hypotheses or functions representing knowledge acquired during previous tasks to a new target task. A second component of Proficiente is focused on transferring backward, a novel ability of long-term learning systems that aim to exploit knowledge derived from recent tasks to encourage refinement of existing knowledge. We propose a method that transfers selectively from a task learned recently to existing hypotheses representing previous tasks. The method encourages retention of existing knowledge whilst refining. We analyse the theoretical properties of the proposed framework. Proficiente is accompanied by an agnostic metric that can be used to determine if a long-term learning system is becoming more knowledgeable. We evaluate Proficiente in both synthetic and real-world datasets, and demonstrate scenarios where knowledgeable supervised learning systems can be achieved by means of transfer.
Self-supervised learning could lead to the creation of AI that's more human-like in its reasoning, according to Turing Award winners Yoshua Bengio and Yann LeCun. Bengio, director at the Montreal Institute for Learning Algorithms, and LeCun, Facebook VP and chief AI scientist, spoke candidly about this and other research trends during a session at the International Conference on Learning Representations (ICLR) 2020, which took place online. Supervised learning entails training an AI model on a labeled data set, and LeCun thinks it'll play a diminishing role as self-supervised learning comes into wider use. Instead of relying on annotations, self-supervised learning algorithms generate labels from data by exposing relationships among the data's parts, a step believed to be critical to achieving human-level intelligence. "Most of what we learn as humans and most of what animals learn is in a self-supervised mode, not a reinforcement mode. It's basically observing the world and interacting with it a little bit, mostly by observation in a test-independent way," said LeCun.
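To make "generating labels from data" concrete, here is a minimal sketch of one possible pretext task: hide one part of a sequence and use the hidden part as the label. The masking scheme, the function name `make_masked_examples`, and the `[MASK]` placeholder are illustrative assumptions, not something described in the session.

```python
def make_masked_examples(tokens, mask_token="[MASK]"):
    """Build (context, target) training pairs from unlabeled data.

    Illustrative assumption: each position is hidden in turn, and the
    hidden item becomes the label the model would have to predict.
    """
    tokens = list(tokens)
    examples = []
    for i, target in enumerate(tokens):
        # The context is the original sequence with position i masked out.
        context = tokens[:i] + [mask_token] + tokens[i + 1:]
        examples.append((context, target))
    return examples


if __name__ == "__main__":
    for context, target in make_masked_examples(["the", "cat", "sat"]):
        print(context, "->", target)
```

No human annotation is involved: every label comes from the data itself, which is the property the paragraph above describes.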
Cross entropy is the most widely used loss function for supervised training of image classification models. In this paper, we propose a novel training methodology that consistently outperforms cross entropy on supervised learning tasks across different architectures and data augmentations. We modify the batch contrastive loss, which has recently been shown to be very effective at learning powerful representations in the self-supervised setting. We are thus able to leverage label information more effectively than cross entropy. Clusters of points belonging to the same class are pulled together in embedding space, while simultaneously pushing apart clusters of samples from different classes. In addition to this, we leverage key ingredients such as large batch sizes and normalized embeddings, which have been shown to benefit self-supervised learning. On both ResNet-50 and ResNet-200, we outperform cross entropy by over 1%, setting a new state of the art number of 78.8% among methods that use AutoAugment data augmentation. The loss also shows clear benefits for robustness to natural corruptions on standard benchmarks on both calibration and accuracy. Compared to cross entropy, our supervised contrastive loss is more stable to hyperparameter settings such as optimizers or data augmentations.
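The pull-together, push-apart behavior described in this abstract can be sketched in a few lines. The code below is a simplified, pure-Python illustration of a supervised batch contrastive loss over L2-normalized embeddings. The function names, the temperature value of 0.1, and the choice to skip anchors with no same-class partner are assumptions made for this sketch, not details taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Average, over anchors, of the negative log-probability of each
    same-class sample among all other samples in the batch.

    Assumptions for this sketch: embeddings are non-zero, and anchors
    without any same-class partner are skipped.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n) if a != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(points, [0, 0, 1, 1]))  # labels match clusters
    print(supervised_contrastive_loss(points, [0, 1, 0, 1]))  # labels cut across clusters
```

The loss is small when samples with the same label already sit close together and large when same-label samples are far apart, so minimizing it pulls each class into a tight cluster.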
Self-supervised learning is one of those recent ML methods that have caused a ripple effect in the data science community, yet have so far been flying under the radar as far as the Entrepreneurs and Fortunes of the world are concerned; the general public has yet to hear about the idea, though many in the AI community consider it progressive. The paradigm holds immense potential for enterprises too, as it can help tackle deep learning's most daunting issue: data/sample inefficiency and the resulting costly training. Yann LeCun said that if intelligence was a cake, unsupervised learning would be the cake, supervised learning would be the icing on the cake and reinforcement learning would be the cherry on the cake. "We know how to make the icing and the cherry, but we don't know how to make the cake." He said that unsupervised learning won't progress much, that there appears to be a massive conceptual disconnect regarding how exactly it should work, and that it was the dark issue of ...
One of the greatest obstacles in the adoption of deep neural networks for new applications is that training the network typically requires a large number of manually labeled training samples. We empirically investigate the scenario where one has access to large amounts of unlabeled data but requires labeling only a single prototypical sample per class in order to train a deep network (i.e., one-shot semi-supervised learning). Specifically, we investigate the recent results reported in FixMatch for one-shot semi-supervised learning to understand the factors that affect and impede high accuracies and reliability for one-shot semi-supervised learning of CIFAR-10. For example, we discover that one barrier to one-shot semi-supervised learning for high-performance image classification is the unevenness of class accuracy during training. These results point to solutions that might enable more widespread adoption of one-shot semi-supervised training methods for new applications.
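One simple way to observe the unevenness of class accuracy mentioned above is to track accuracy per class rather than only overall. The sketch below is a generic diagnostic, not the authors' code; the helper names `per_class_accuracy` and `accuracy_spread` are hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return {class: fraction of that class's samples predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return {c: correct[c] / total[c] for c in sorted(total)}


def accuracy_spread(acc):
    """Gap between the best and worst class; 0.0 means perfectly even."""
    return max(acc.values()) - min(acc.values())


if __name__ == "__main__":
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 0, 1, 0, 0, 0]
    acc = per_class_accuracy(y_true, y_pred)
    print(acc, accuracy_spread(acc))
```

In this toy case the overall accuracy is 50%, which hides the fact that one class is never predicted correctly; the per-class view makes that visible.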
Machine learning algorithms, however effective, are known to be vulnerable in adversarial scenarios where a malicious user may inject manipulated instances. In this work we focus on evasion attacks, where a model is trained in a safe environment and exposed to attacks at test time. The attacker aims at finding a minimal perturbation of a test instance that changes the model outcome. We propose a model-agnostic strategy that builds a robust ensemble by training its base models on feature-based partitions of the given dataset. Our algorithm guarantees that the majority of the models in the ensemble cannot be affected by the attacker. We evaluated the proposed strategy on decision tree ensembles, and we also propose an approximate certification method for tree ensembles that efficiently assesses the minimal accuracy of a forest on a given dataset, avoiding the costly computation of evasion attacks. Experimental evaluation on publicly available datasets shows that the proposed strategy outperforms state-of-the-art adversarial learning algorithms against evasion attacks.
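The majority guarantee can be illustrated with a small sketch. It assumes, beyond what the abstract states, that the attacker can modify at most `budget` features and that the features are split into `2 * budget + 1` disjoint groups, so a perturbation can reach at most `budget` base models. The nearest-centroid base model and all names below are illustrative stand-ins, not the paper's tree-based method.

```python
from collections import Counter


def partition_features(num_features, budget):
    """Split feature indices round-robin into 2*budget + 1 disjoint groups."""
    k = 2 * budget + 1
    if num_features < k:
        raise ValueError("need at least 2*budget + 1 features")
    return [list(range(g, num_features, k)) for g in range(k)]


class CentroidModel:
    """Toy nearest-centroid classifier that reads only its own feature group."""

    def __init__(self, features):
        self.features = features
        self.centroids = {}

    def fit(self, X, y):
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [
                sum(r[f] for r in rows) / len(rows) for f in self.features
            ]
        return self

    def predict(self, x):
        def dist(label):
            centroid = self.centroids[label]
            return sum((x[f] - c) ** 2 for f, c in zip(self.features, centroid))
        return min(sorted(self.centroids), key=dist)


def fit_ensemble(X, y, budget):
    """Train one base model per feature group."""
    groups = partition_features(len(X[0]), budget)
    return [CentroidModel(g).fit(X, y) for g in groups]


def predict_votes(models, x):
    return [m.predict(x) for m in models]


def majority(votes):
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    X = [[0.0] * 6, [0.2] * 6, [1.0] * 6, [0.8] * 6]
    y = [0, 0, 1, 1]
    models = fit_ensemble(X, y, budget=1)
    attacked = [100.0] + [0.0] * 5  # only feature 0 is perturbed
    print(predict_votes(models, attacked), majority(predict_votes(models, attacked)))
```

Because each feature belongs to exactly one group, changing one feature can flip at most one of the three votes here, so the majority vote is unchanged. With more than two classes a vote can tie, which this sketch does not handle specially.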