We find that the majority of source code learning research approaches performance improvement from a model-centric standpoint: the dataset is held constant while new models are proposed to achieve better results on it. Improving the dataset itself is another option. One can identify and exclude noisy labels, or, when the correct labels are known, correct them. By identifying noisy samples, we hope to improve the dataset's quality and, in turn, the source code model's performance.
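As a minimal sketch of the noisy-label idea, one common heuristic is to flag samples where a trained model confidently disagrees with the given label. The probabilities, labels, and threshold below are made-up illustrative values, not a dataset or method from the text.

```python
import numpy as np

# Hypothetical setup: predicted class probabilities from some trained model
# and the (possibly noisy) labels of a small two-class dataset.
probs = np.array([
    [0.90, 0.10],   # confidently class 0
    [0.20, 0.80],   # class 1, but below our confidence threshold
    [0.95, 0.05],   # confidently class 0
    [0.15, 0.85],   # class 1, below threshold
])
labels = np.array([0, 1, 1, 1])  # sample at index 2 is mislabelled

# Flag samples where a confident prediction disagrees with the label.
pred = probs.argmax(axis=1)
confident = probs.max(axis=1) > 0.9          # threshold is an assumption
suspected_noisy = confident & (pred != labels)

# Keep only the samples that were not flagged.
clean_idx = np.where(~suspected_noisy)[0]
```

In practice such filtering is usually done with cross-validated predictions rather than a single model, so that each sample is scored by a model that never trained on it.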
Self-Supervised Learning has become an exciting direction in the AI community. Recent work includes:

- Predicting What You Already Know Helps: Provable Self-Supervised Learning
- For Self-Supervised Learning, Rationality Implies Generalization, Provably
- Can Pretext-Based Self-Supervised Learning Be Boosted by Downstream Data?
- FAIR Self-Supervision Benchmark [pdf] [repo]: various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches.
A Neural Network (NN) is a black box for many people: we know that it works, but we don't understand how. This article will demystify that belief by working through some examples of how a neural network really operates. If some terms are unclear, I have already written two articles covering the basics: article 1 and article 2. In this first example, let us consider a simple case: a dataset with 3 features, a target variable, and just one training example (in practice we would never have so little data, but it makes a very good starting point). Fact 1: The structure of the data influences the architecture chosen for modelling.
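To make Fact 1 concrete, here is a minimal sketch of that setup: one training example with 3 features fed through a single neuron, so the data's shape directly dictates that the layer needs exactly 3 weights. The weight and bias values are chosen by hand purely for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])     # one training example, 3 features
w = np.array([0.5, -0.25, 0.1])   # 3 weights, one per input feature
b = 0.2                           # bias term

# Forward pass of a single neuron: linear combination, then activation.
z = w @ x + b                     # 0.5 - 0.5 + 0.3 + 0.2 = 0.5
y_hat = 1.0 / (1.0 + np.exp(-z))  # sigmoid squashes z into (0, 1)
```

Because the example has 3 features, the input layer must have 3 weights; change the data's structure and the architecture must change with it.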
In the past decade, research and development in AI have skyrocketed, especially after the results of the ImageNet competition in 2012. The focus was largely on supervised learning methods, which require huge amounts of labeled data to train systems for specific use cases. In this article, we will explore Self-Supervised Learning (SSL), a hot research topic in the machine learning community. SSL is an evolving machine learning technique poised to solve the challenges posed by over-dependence on labeled data. For many years, building intelligent systems with machine learning methods has depended largely on good-quality labeled data; consequently, the cost of high-quality annotated data is a major bottleneck in the overall training process.
Batch Normalization (BN or BatchNorm) is a technique that normalizes layer inputs by re-centering and re-scaling them. This is done by computing the mean and standard deviation of each input channel across the whole batch, normalizing the inputs (check this video), and finally scaling and shifting the result through two learnable parameters, γ (scale) and β (shift). Batch Normalization is quite effective, but the real reasons behind its effectiveness remain unclear. Initially, as proposed by Sergey Ioffe and Christian Szegedy in their 2015 article, the purpose of BN was to mitigate internal covariate shift (ICS), defined as "the change in the distribution of network activations due to the change in network parameters during training". Indeed, one reason to normalize inputs is to get stable training; unfortunately, while this may hold at the beginning, as the network trains and the weights move away from their initial values there is no guarantee of stability.
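The normalize-then-scale-and-shift steps described above can be sketched in a few lines of numpy. This is only the training-time forward pass on a toy batch (no running statistics, no backward pass); the input values are made up for illustration.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each channel across the batch, then scale and shift.

    x: (batch, channels); gamma, beta: learnable per-channel parameters.
    eps avoids division by zero for near-constant channels.
    """
    mean = x.mean(axis=0)                 # per-channel mean over the batch
    var = x.var(axis=0)                   # per-channel variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta           # gamma scales, beta shifts

# Toy batch of 3 examples with 2 channels.
x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
out = batch_norm(x, gamma=np.ones(2), beta=np.zeros(2))
# With gamma=1 and beta=0, each channel ends up with ~zero mean, ~unit variance.
```

Note that γ and β let the network undo the normalization if that is what minimizes the loss, which is why BN does not constrain what distribution the layer ultimately sees.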
Nowadays, Machine Learning and Deep Learning methods have become the state-of-the-art approach to solving data classification tasks. In order to use those methods, it is necessary to acquire and label a considerable amount of data; however, this is not straightforward in some fields, since data annotation is time-consuming and might require expert knowledge. This challenge can be tackled by means of semi-supervised learning methods that take advantage of both labelled and unlabelled data. In this work, we present new semi-supervised learning methods based on techniques from Topological Data Analysis (TDA), a field that is gaining importance for analysing large amounts of data with high variety and dimensionality. In particular, we have created two semi-supervised learning methods following two different topological approaches.
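To illustrate the semi-supervised setting itself (not the TDA-based methods this work proposes), here is a toy pseudo-labelling sketch: a nearest-centroid rule fitted on the few labelled points assigns labels to the unlabelled ones. All data values are invented for the example.

```python
import numpy as np

# Six 1-D points; y = -1 marks the unlabelled ones.
X = np.array([[0.0], [0.2], [1.0], [1.2], [0.1], [1.1]])
y = np.array([0, 0, 1, 1, -1, -1])

labeled = y != -1
# Class centroids computed from the labelled points only.
centroids = np.array([X[labeled & (y == c)].mean(axis=0) for c in (0, 1)])

# Assign every point to its nearest class centroid (pseudo-labels).
dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
pseudo = dists.argmin(axis=1)

# Keep the true labels where we have them, pseudo-labels elsewhere.
y_full = np.where(labeled, y, pseudo)
```

Real semi-supervised methods (including the topological ones described here) are far more sophisticated, but they share this core idea of propagating label information from labelled to unlabelled data.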
After presenting SimCLR, a contrastive self-supervised learning framework, I decided to demonstrate another well-known method, called BYOL. Bootstrap Your Own Latent (BYOL) is a new algorithm for self-supervised learning of image representations. Unlike contrastive methods, it does not explicitly use negative samples, i.e., images from the batch other than the positive pair. As a result, BYOL is claimed to require smaller batch sizes, which makes it an attractive choice.
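The absence of negatives shows up directly in BYOL's objective: it is just a regression loss between the L2-normalized online-network prediction and the target-network projection of the other view (equivalent to 2 − 2·cosine similarity). Below is a toy numpy sketch of that loss alone, not the full method with its two networks and momentum updates.

```python
import numpy as np

def byol_loss(p_online, z_target):
    """BYOL regression loss: MSE between L2-normalized online predictions
    and target projections. Only positive pairs appear -- no negatives."""
    p = p_online / np.linalg.norm(p_online, axis=1, keepdims=True)
    z = z_target / np.linalg.norm(z_target, axis=1, keepdims=True)
    return np.mean(np.sum((p - z) ** 2, axis=1))

# Identical representations give zero loss; orthogonal ones give 2.
a = np.array([[1.0, 0.0]])
b = np.array([[0.0, 1.0]])
```

In the full algorithm this loss is symmetrized over the two views, and the target network's weights are an exponential moving average of the online network's, which is what prevents collapse without negatives.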
The winners of the 2022 International Conference on Learning Representations (ICLR) outstanding paper awards have been announced. There are seven outstanding paper winners and three honourable mentions. The award winners will be presenting their work at the conference, which is taking place virtually this week.

Analytic-DPM: an analytic estimate of the optimal reverse variance in diffusion probabilistic models
Fan Bao, Chongxuan Li, Jun Zhu, Bo Zhang
Abstract: Diffusion probabilistic models (DPMs) represent a class of powerful generative models. Despite their success, the inference of DPMs is expensive, since it generally needs to iterate over thousands of timesteps.
I recently started a new newsletter focused on AI education, and it already has over 50,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) AI-focused newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Most machine learning (ML) papers we see these days focus on advancing new techniques and methods in areas such as natural language or computer vision. ML research is advancing at a frantic pace despite the lack of a fundamental theory of machine intelligence.
In this hands-on tutorial, we will provide you with a reimplementation of the SimCLR self-supervised learning method for pretraining robust feature extractors. The method is fairly general and can be applied to any vision dataset, as well as to different downstream tasks. In a previous tutorial, I wrote a bit of background on the self-supervised learning arena. Time to get into your first project by running SimCLR on STL10, a small dataset with 100K unlabelled images. Code is available on GitHub.
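Before diving into the full pipeline, it helps to see SimCLR's core ingredient in isolation: the NT-Xent (normalized temperature-scaled cross-entropy) loss, which pulls the two augmented views of each image together and pushes them away from every other image in the batch. Below is a standalone numpy sketch for a batch of 2N embeddings where rows i and i+N are the two views of image i; the embeddings and temperature are illustrative values, not ones from the tutorial's code.

```python
import numpy as np

def nt_xent(z, temperature=0.5):
    """NT-Xent loss over 2N embeddings; rows i and i+N are positive pairs."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # project to unit sphere
    sim = z @ z.T / temperature                        # pairwise cosine / tau
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    n = len(z) // 2
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # Cross-entropy with each row's positive pair as the target class.
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return np.mean(logsumexp - sim[np.arange(2 * n), targets])

# Matched views identical, negatives orthogonal -> a low loss.
z = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
```

The temperature controls how sharply the softmax concentrates on the hardest negatives; the original paper tunes it per setup, and 0.5 here is just a common default.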