Unsupervised or Indirectly Supervised Learning
A reality check on the role of machine learning in cybersecurity
This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI. Cybersecurity, a huge industry worth over $100 billion, is regularly subject to buzzwords. Cybersecurity companies often (pretend to) use new state-of-the-art technologies to attract customers and sell their solutions. Naturally, with artificial intelligence being in one of its craziest hype cycles, we're seeing plenty of solutions that claim to use machine learning, deep learning and other AI-related technologies to automatically secure the networks and digital assets of their clients. But contrary to what many companies profess, machine learning is not a silver bullet that will automatically protect individuals and organizations against security threats, says Ilia Kolochenko, CEO of ImmuniWeb, a company that uses AI to test the security of web and mobile applications.
Beyond Clustering: The New Methods that are Pushing the Future of Unsupervised Learning
If you ask any group of data science students about the types of machine learning algorithms, they will answer without hesitation: supervised and unsupervised. However, if we ask that same group to list different types of unsupervised learning, we are likely to get an answer like clustering but not much more. While supervised methods lead the current wave of innovation in areas such as deep learning, there is very little doubt that the future of artificial intelligence (AI) will transition towards more unsupervised forms of learning. In recent years, we have seen a lot of progress on several new forms of unsupervised learning methods that extend well beyond traditional clustering or principal component analysis (PCA) techniques. Today, I would like to explore some of the most prominent new schools of thought in the unsupervised space and their role in the future of AI.
How To Build A Generative Adversarial Network In 8 Simple Steps
In a field like computer vision, which has been explored and studied for a long time, the Generative Adversarial Network (GAN) was a recent addition that instantly became a new standard for training machines. The GAN is an architecture developed by Ian Goodfellow and his colleagues in 2014 that uses multiple neural networks competing against each other to make better predictions. A GAN is formed by two networks: the Generator, which is responsible for generating new data from training data, and the Discriminator, which distinguishes a generated (fake) image from an original image of the training set. Both networks learn from their previous predictions, competing with each other for a better outcome. In this article we will break down a simple GAN made with Keras into 8 simple steps.
Local Unsupervised Learning for Image Analysis
Grinberg, Leopold, Hopfield, John, Krotov, Dmitry
Local Hebbian learning is believed to be inferior in performance to end-to-end training using a backpropagation algorithm. We question this popular belief by designing a local algorithm that can learn convolutional filters at scale on large image datasets. These filters combined with patch normalization and very steep non-linearities result in a good classification accuracy for shallow networks trained locally, as opposed to end-to-end. The filters learned by our algorithm contain both orientation selective units and unoriented color units, resembling the responses of pyramidal neurons located in the cytochrome oxidase 'interblob' and 'blob' regions in the primary visual cortex of primates. It is shown that convolutional networks with patch normalization significantly outperform standard convolutional networks on the task of recovering the original classes when shadows are superimposed on top of standard CIFAR-10 images. Patch normalization approximates the retinal adaptation to the mean light intensity, important for human vision. We also demonstrate a successful transfer of learned representations between CIFAR-10 and ImageNet 32x32 datasets. All these results taken together hint at the possibility that local unsupervised training might be a powerful tool for learning general representations (without specifying the task) directly from unlabeled data.
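As a rough illustration of why normalizing each patch can help with shadows, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes that patch normalization means rescaling each patch to zero mean and unit standard deviation, and it models a shadow as uniform dimming of the patch. The function name `normalize_patch`, the `eps` parameter, and the sample values are hypothetical.

```python
from statistics import fmean, pstdev


def normalize_patch(patch, eps=1e-8):
    """Rescale a flat list of pixel intensities to zero mean and unit std.

    Assumption: this standardization stands in for the paper's patch
    normalization, whose exact definition is not given in the abstract.
    """
    mean = fmean(patch)
    std = pstdev(patch)
    # eps guards against division by zero on perfectly flat patches.
    return [(value - mean) / (std + eps) for value in patch]


# A hypothetical 2x2 patch, flattened, and the same patch under a
# uniform shadow that halves every intensity.
bright = [0.8, 0.9, 1.0, 0.9]
shadowed = [0.5 * value for value in bright]

normalized_bright = normalize_patch(bright)
normalized_shadowed = normalize_patch(shadowed)

# After normalization the two patches are nearly identical, so a filter
# applied to them would respond in almost the same way.
max_gap = max(abs(a - b) for a, b in zip(normalized_bright, normalized_shadowed))
print(max_gap)
```

Under these assumptions, the overall light level drops out of the representation, which is one plausible reading of the comparison with retinal adaptation to mean light intensity.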
Overview of feature selection methods
Selecting the right set of features to be used for data modelling has been shown to improve the performance of supervised and unsupervised learning, to reduce computational costs such as training time or required resources, and, in the case of high-dimensional input data, to mitigate the curse of dimensionality. Computing and using feature importance scores is also an important step towards model interpretability. This post shares the overview of supervised and unsupervised methods for performing feature selection I have acquired after researching the topic for a few days. For all depicted methods I also provide references to open-source Python implementations I used in order to allow you to quickly test out the presented algorithms. However, this research domain is very abundant in terms of methods which have been proposed during the last two decades and as such this post only attempts to present my current limited view without any pretense for completeness.
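To make the idea of unsupervised feature selection concrete, here is a minimal sketch of one simple filter: dropping features whose variance does not exceed a threshold. It is not taken from the post and is only one of many possible methods. The function name `select_by_variance` and the toy data are hypothetical. scikit-learn's `VarianceThreshold` offers a comparable filter for real use.

```python
from statistics import pvariance


def select_by_variance(rows, threshold=0.0):
    """Keep only the columns whose population variance exceeds threshold.

    rows is a list of equal-length lists (samples x features).
    Returns the indices of the kept columns and the reduced rows.
    """
    columns = list(zip(*rows))  # transpose: one tuple per feature
    kept = [i for i, column in enumerate(columns) if pvariance(column) > threshold]
    reduced = [[row[i] for i in kept] for row in rows]
    return kept, reduced


# Toy data: feature 0 is constant, so it carries no information
# and should be removed by the filter.
rows = [
    [1.0, 0.0, 3.2],
    [1.0, 1.0, 2.9],
    [1.0, 0.0, 3.1],
    [1.0, 1.0, 3.0],
]

kept, reduced = select_by_variance(rows)
print(kept)
print(reduced[0])
```

A filter like this needs no labels, which is what makes it usable in the unsupervised setting. It only removes uninformative features and says nothing about redundancy between the features that remain.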
Unsupervised Learning: The Next Wave in AI Revolution Analytics Insight
Over the last decade, machine learning has made exceptional progress in areas as varied as image recognition, self-driving vehicles and playing complex games like Go. These successes have largely been achieved by training deep neural networks with one of two learning paradigms: supervised learning and reinforcement learning. Both paradigms require training signals to be designed by a human and then passed to the computer. In the case of supervised learning, these are the "targets" (for example, the right label for a picture); in the case of reinforcement learning, they are the "rewards" for successful behavior (for example, getting a high score in an Atari game). The limits of learning are therefore defined by the human trainers.
Large-Scale Sparse Subspace Clustering Using Landmarks
Subspace clustering methods based on expressing each data point as a linear combination of all other points in a dataset are popular unsupervised learning techniques. However, existing methods incur high computational complexity on large-scale datasets as they require solving an expensive optimization problem and performing spectral clustering on large affinity matrices. This paper presents an efficient approach to subspace clustering by selecting a small subset of the input data called landmarks. The resulting subspace clustering method in the reduced domain runs in linear time with respect to the size of the original data. Numerical experiments on synthetic and real data demonstrate the effectiveness of our method.
CyberPoint · Blog · Learning in the Dark: Lessons Learned in Unsupervised Learning
CyberPoint has seen great success in using supervised machine learning for malware detection. A while back, however, some colleagues and I set out to investigate whether we could make any interesting discoveries by applying unsupervised learning to CyberPoint's malware dataset. In supervised learning, one has a set of samples, each with an assigned label. In the field of malware analysis, a sample would typically be a file, and its label might be either benign or the malware family to which it belongs. The goal is: given a new sample, correctly predict its label.
Discriminative Consistent Domain Generation for Semi-supervised Learning
Chen, Jun, Zhang, Heye, Zhang, Yanping, Zhao, Shu, Mohiaddin, Raad, Wong, Tom, Firmin, David, Yang, Guang, Keegan, Jennifer
Deep learning based task systems normally rely on a large amount of manually labeled training data, which is expensive to obtain and subject to operator variation. Moreover, the manually labeled data and the unlabeled data do not always follow the same distribution. In this paper, we alleviate these problems by proposing a discriminative consistent domain generation (DCDG) approach to semi-supervised learning. The discriminative consistent domain is achieved by a double-sided domain adaptation, which aims to fuse the feature spaces of labeled data and unlabeled data. In this way, we can accommodate the distribution differences between labeled data and unlabeled data. In order to keep the discriminativeness of the generated consistent domain for task learning, we apply indirect learning to the double-sided domain adaptation. Based on the generated discriminative consistent domain, we can use the unlabeled data to learn the task model along with the labeled data via consistent image generation. We demonstrate the performance of our proposed DCDG on late gadolinium enhancement cardiac MRI (LGE-CMRI) images acquired from patients with atrial fibrillation in two clinical centers for the segmentation of the left atrium anatomy (LA) and proximal pulmonary veins (PVs). The experiments show that our semi-supervised approach achieves compelling segmentation results, which demonstrates the robustness of DCDG for semi-supervised learning using unlabeled data along with labeled data acquired from single-center or multicenter studies.
Semi-Supervised Learning by Disentangling and Self-Ensembling Over Stochastic Latent Space
Gyawali, Prashnna Kumar, Li, Zhiyuan, Ghimire, Sandesh, Wang, Linwei
The success of deep learning in medical imaging is mostly achieved at the cost of a large labeled data set. Semi-supervised learning (SSL) provides a promising solution by leveraging the structure of unlabeled data to improve learning from a small set of labeled data. Self-ensembling is a simple approach used in SSL to encourage consensus among ensemble predictions of unknown labels, improving generalization of the model by making it more insensitive to the latent space. Currently, such an ensemble is obtained by randomization such as dropout regularization and random data augmentation. In this work, we hypothesize -- from the generalization perspective -- that self-ensembling can be improved by exploiting the stochasticity of a disentangled latent space. To this end, we present a stacked SSL model that utilizes unsupervised disentangled representation learning as the stochastic embedding for self-ensembling. We evaluate the presented model for multi-label classification using chest X-ray images, demonstrating its improved performance over related SSL models as well as the interpretability of its disentangled representations.