 Unsupervised or Indirectly Supervised Learning


Automate Data Cleaning with Unsupervised Learning

#artificialintelligence

I like working with textual data. As in Computer Vision, NLP now offers many readily accessible resources and open-source projects that we can download or use directly. Some of them are really cool and let us speed up our projects and take them to another level. The most important thing we must not forget is that these tools aren't magic. Some of them claim high performance, but they deliver little unless we set them up to do their best.


Types of Machine Learning in Finance. How to Apply Them

#artificialintelligence

Businesses and financial institutions are facing a profound challenge. They're gathering much more data, from their apps, social media, IoT sensors, etc., than they could possibly process and act upon. Lacking analytical capabilities and trained staff, they end up throwing significant value out the window and failing to monetize an important asset at their disposal. This has been a problem ever since the technology for generating data outpaced the tools for mining it. And for a few years now, it has been getting worse.


Revisiting Self-Training for Neural Sequence Generation

arXiv.org Machine Learning

Self-training is one of the earliest and simplest semi-supervised methods. The key idea is to augment the original labeled dataset with unlabeled data paired with the model's prediction (i.e. pseudo-parallel data). While self-training has been extensively studied on classification problems, in complex sequence generation tasks (e.g. machine translation) it is still unclear how self-training works due to the compositionality of the target space. In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks. Through careful examination of the performance gains, we find that the perturbation on the hidden states (i.e. dropout) is critical for self-training to benefit from the pseudo-parallel data, which acts as a regularizer and forces the model to yield close predictions for similar unlabeled inputs. Such an effect helps the model correct some incorrect predictions on unlabeled data. To further encourage this mechanism, we propose to inject noise into the input space, resulting in a "noisy" version of self-training. Empirical study on standard machine translation and text summarization benchmarks shows that noisy self-training is able to effectively utilize unlabeled data and improve the performance of the supervised baseline by a large margin.
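The loop described in this abstract can be illustrated on a much simpler problem. The sketch below is not the paper's sequence-generation setup: it swaps in a toy nearest-centroid classifier, and the names `fit_centroids`, `predict`, and `noisy_self_training`, as well as the Gaussian input noise, are assumptions made for illustration. It only shows the pattern of pseudo-labeling clean unlabeled inputs and then retraining on noised copies of them.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    # "Train" the toy model: one mean vector per class.
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(centroids, X):
    # Assign each point to the class of its nearest centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def noisy_self_training(X_lab, y_lab, X_unlab, n_classes,
                        rounds=3, noise_std=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Supervised baseline trained on the labeled data only.
    centroids = fit_centroids(X_lab, y_lab, n_classes)
    for _ in range(rounds):
        # Pseudo-label the clean unlabeled inputs with the current model.
        pseudo = predict(centroids, X_unlab)
        # Perturb the inputs (hypothetical Gaussian noise) before retraining.
        X_noisy = X_unlab + rng.normal(0.0, noise_std, X_unlab.shape)
        X_all = np.concatenate([X_lab, X_noisy])
        y_all = np.concatenate([y_lab, pseudo])
        centroids = fit_centroids(X_all, y_all, n_classes)
    return centroids
```

With two well-separated clusters and a single labeled point per class, `noisy_self_training` should recover centroids close to the cluster means. Setting `noise_std=0` gives plain self-training.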


Optimal Transport, CycleGAN, and Penalized LS for Unsupervised Learning in Inverse Problems

arXiv.org Machine Learning

Byeongsu Sim (1), Gyutaek Oh (2), Sungjun Lim (2), Jong Chul Ye (1,2). (1) Department of Mathematical Sciences, KAIST, Daejeon, Republic of Korea; (2) Department of Bio and Brain Engineering, KAIST, Daejeon, Republic of Korea.

Abstract: The penalized least squares (PLS) is a classic approach to inverse problems, where a regularization term is added to stabilize the solution. Optimal transport (OT) is another mathematical framework for computer vision tasks by providing means to transport one measure to another at minimal cost. Cycle-consistent generative adversarial network (cycleGAN) is a recent extension of GAN to learn target distributions with less mode collapsing behavior. Although similar in that no supervised training is required, the algorithms look different, so the mathematical relationship between these approaches is not clear. In this article, we provide an important advance to unveil the missing link. Specifically, we reveal that a cycle-GAN architecture can be derived as a dual formulation of the optimal transport problem, if the PLS with a deep learning penalty is used as a transport cost between the two probability measures from measurements and unknown images. This suggests that cycleGAN can be considered as stochastic generalization of classical PLS approaches. Our derivation is so general that various types of cycleGAN architecture can be easily derived by merely changing the transport cost.
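For orientation, the two classical problems named in this abstract can be written in their textbook forms. The notation is assumed here and is not taken from the paper: $y$ is the measurement, $H$ the forward operator, $R$ the regularizer with weight $\lambda$, $\mu$ and $\nu$ are the two probability measures, $c$ is the transport cost, and $\Pi(\mu,\nu)$ is the set of joint measures with marginals $\mu$ and $\nu$. The paper's specific transport cost is not reproduced.

```latex
% Penalized least squares (generic form)
\hat{x} = \arg\min_{x} \; \tfrac{1}{2}\,\lVert y - Hx \rVert^{2} + \lambda\, R(x)

% Optimal transport (Kantorovich form)
\min_{\pi \in \Pi(\mu,\nu)} \int c(x, y)\, \mathrm{d}\pi(x, y)
```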


Unsupervised Domain Adaptation through Self-Supervision

arXiv.org Machine Learning

This paper addresses unsupervised domain adaptation, the setting where labeled training data is available on a source domain, but the goal is to have good performance on a target domain with only unlabeled data. Like much of previous work, we seek to align the learned representations of the source and target domains while preserving discriminability. The way we accomplish alignment is by learning to perform auxiliary self-supervised task(s) on both domains simultaneously. Each self-supervised task brings the two domains closer together along the direction relevant to that task. Training this jointly with the main task classifier on the source domain is shown to successfully generalize to the unlabeled target domain. The presented objective is straightforward to implement and easy to optimize. We achieve state-of-the-art results on four out of seven standard benchmarks, and competitive results on segmentation adaptation. We also demonstrate that our method composes well with another popular pixel-level adaptation method. Visual distribution shifts are fundamental to our constantly evolving world. We humans face them all the time, e.g. when we navigate a foreign city, read text in a new font, or recognize objects in an environment we have never encountered before. These real-world challenges to the human visual perception have direct parallels in computer vision.
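The abstract above says alignment comes from training an auxiliary self-supervised task on both domains at once. The excerpt does not say which tasks are used, so the sketch below assumes rotation prediction, a common choice, purely for illustration. The function names are hypothetical, and images are assumed to be square 2-D arrays.

```python
import numpy as np

def make_rotation_task(images, seed=0):
    # Rotate each square image by k * 90 degrees; k is the free label.
    rng = np.random.default_rng(seed)
    k = rng.integers(0, 4, size=len(images))
    rotated = np.stack([np.rot90(img, k=int(ki)) for img, ki in zip(images, k)])
    return rotated, k

def build_joint_batch(source_images, target_images, seed=0):
    # Mix both domains into one self-supervised batch; no target labels needed.
    both = np.concatenate([source_images, target_images])
    return make_rotation_task(both, seed=seed)
```

A shared feature extractor would then be trained on this joint batch alongside the main classifier, which sees labeled source data only. That training loop is omitted here.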


GAN Lab: Play with Generative Adversarial Networks in Your Browser!

#artificialintelligence

Darker green means that samples in that region are more likely to be real; darker purple, more likely to be fake. As a GAN approaches the optimum, the whole heatmap will become more gray overall, signalling that the discriminator can no longer easily distinguish fake examples from the real ones. In a GAN, its two networks influence each other as they iteratively update themselves. A great use for GAN Lab is to use its visualization to learn how the generator incrementally updates to improve itself to generate fake samples that are increasingly more realistic. The generator does it by trying to fool the discriminator.
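A small numerical sketch can make the "gray heatmap" observation concrete. It uses the standard GAN cross-entropy losses, with the commonly used non-saturating form for the generator. The function names and the clipping constant are assumptions for illustration and are not taken from GAN Lab's code.

```python
import numpy as np

EPS = 1e-12  # hypothetical clipping constant to keep log() finite

def discriminator_loss(d_real, d_fake):
    # d_real, d_fake: discriminator probabilities that samples are real.
    d_real = np.clip(d_real, EPS, 1 - EPS)
    d_fake = np.clip(d_fake, EPS, 1 - EPS)
    return -(np.log(d_real).mean() + np.log(1 - d_fake).mean())

def generator_loss(d_fake):
    # Non-saturating generator loss: reward fakes that look real.
    d_fake = np.clip(d_fake, EPS, 1 - EPS)
    return -np.log(d_fake).mean()
```

When the discriminator outputs 0.5 everywhere (the gray heatmap), `discriminator_loss` equals 2 ln 2, about 1.386. A discriminator that still separates real from fake scores lower than that.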


PDE-Inspired Algorithms for Semi-Supervised Learning on Point Clouds

arXiv.org Machine Learning

Given a data set and a subset of labels the problem of semi-supervised learning on point clouds is to extend the labels to the entire data set. In this paper we extend the labels by minimising the constrained discrete $p$-Dirichlet energy. Under suitable conditions the discrete problem can be connected, in the large data limit, with the minimiser of a weighted continuum $p$-Dirichlet energy with the same constraints. We take advantage of this connection by designing numerical schemes that first estimate the density of the data and then apply PDE methods, such as pseudo-spectral methods, to solve the corresponding Euler-Lagrange equation. We prove that our scheme is consistent in the large data limit for two methods of density estimation: kernel density estimation and spline kernel density estimation.
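As a rough illustration of label extension by energy minimisation, the sketch below evaluates a discrete $p$-Dirichlet energy on a weighted graph and solves the constrained problem for the special case $p = 2$, where it reduces to a linear system. This is not the paper's scheme, which estimates the data density and solves a continuum PDE. The normalisation of the energy and the function names are assumptions.

```python
import numpy as np

def p_dirichlet_energy(W, f, p=2.0):
    # Sum of w_ij * |f_i - f_j|^p over unordered pairs (assumed normalisation).
    diff = np.abs(f[:, None] - f[None, :])
    return 0.5 * np.sum(W * diff ** p)

def harmonic_extension(W, labeled_idx, labels):
    # Minimise the p = 2 energy subject to f = labels on the labeled nodes.
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    labels = np.asarray(labels, dtype=float)
    L = np.diag(W.sum(axis=1)) - W  # graph Laplacian
    unl = np.setdiff1d(np.arange(n), labeled_idx)
    f = np.zeros(n)
    f[labeled_idx] = labels
    # Stationarity on unlabeled nodes: L_uu f_u = -L_ul f_l.
    rhs = -L[np.ix_(unl, labeled_idx)] @ labels
    f[unl] = np.linalg.solve(L[np.ix_(unl, unl)], rhs)
    return f
```

On a path graph whose two end nodes are labeled 0 and 1, the result is a linear interpolation between the ends.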


Unsupervised data to content transformation with histogram-matching cycle-consistent generative adversarial networks

#artificialintelligence

The segmentation of images is a common task in a broad range of research fields. To tackle increasingly complex images, artificial intelligence-based approaches have emerged to overcome the shortcomings of traditional feature detection methods. Owing to the fact that most artificial intelligence research is made publicly accessible and programming the required algorithms is now possible in many popular languages, the use of such approaches is becoming widespread. However, these methods often require data labelled by the researcher to provide a training target for the algorithms to converge to the desired result. This labelling is a limiting factor in many cases and can become prohibitively time consuming. Inspired by the ability of cycle-consistent generative adversarial networks to perform style transfer, we outline a method whereby a computer-generated set of images is used to segment the true images.


Understanding and Improving Virtual Adversarial Training

arXiv.org Machine Learning

In semi-supervised learning, the virtual adversarial training (VAT) approach is one of the most attractive methods due to its intuitive simplicity and powerful performance. VAT finds a classifier that is robust to data perturbation in the adversarial direction. In this study, we provide a fundamental explanation of why VAT works well in the semi-supervised learning case and propose new techniques, simple but powerful, to improve the VAT method. In particular, we employ the idea of the Bad GAN approach, which utilizes bad samples distributed on the complement of the support of the input data, without any additional deep generative architectures. We generate high-quality bad samples using the adversarial training used in VAT and also give theoretical explanations of why the adversarial training is good at generating bad samples. An advantage of our proposed method is that it achieves competitive performance compared with other recent studies with much less computation. We demonstrate the advantages of our method through various experiments on well-known benchmark image datasets.
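To make "perturbation toward the adversarial direction" concrete, here is a minimal sketch of one VAT-style power-iteration step. It assumes a binary logistic-regression model in place of a deep network, so the gradient of the KL divergence can be written in closed form. The function names and the parameters `eps` and `xi` are illustrative assumptions, and the paper's Bad GAN extension is not covered.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_bernoulli(p, q):
    # KL divergence between two Bernoulli distributions.
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def vat_perturbation(w, b, x, eps=0.5, xi=1e-2, seed=0):
    # Start from a small random direction d, as in VAT's power iteration.
    rng = np.random.default_rng(seed)
    d = rng.normal(size=x.shape)
    d /= np.linalg.norm(d)
    p = sigmoid(w @ x + b)             # prediction on the clean input
    q = sigmoid(w @ (x + xi * d) + b)  # prediction on the probed input
    # For this model, the gradient of KL(p || q) w.r.t. the perturbation is (q - p) * w.
    g = (q - p) * w
    # Rescale the gradient direction to the perturbation budget eps.
    return eps * g / np.linalg.norm(g)
```

The VAT regulariser for an unlabeled `x` would then be `kl_bernoulli(p, sigmoid(w @ (x + r_adv) + b))`, with `r_adv` returned by `vat_perturbation`. For this linear model the perturbation is always parallel to `w`.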


Unsupervised Learning Will Bring About The Next AI Revolution

#artificialintelligence

A 6-month-old baby won't even notice if a toy truck drives off a platform and seems to fly in the air. However, if the same experiment is repeated 2 to 3 months later, the baby will immediately identify that something is wrong. This means that the baby has already learned the concept of gravity. "Nobody tells a baby that objects are supposed to fall," said the chief AI scientist at Facebook and a professor at NYU, Dr. Yann LeCun, during a webinar organized by the Association for Computing Machinery, an industry body. Because babies do not have very sophisticated motor control, LeCun hypothesizes, "a lot of what they learn about the world is through observation."