Collaborating Authors

Zhang, Han


Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework

arXiv.org Artificial Intelligence

Since label collection is prohibitively expensive and time-consuming, unsupervised methods are preferred in applications such as fraud detection. Such applications usually require modeling the intrinsic clusters in high-dimensional data, which often exhibit heterogeneous statistical patterns: the patterns of different clusters may appear in different dimensions. Existing methods model the data clusters on selected dimensions, yet globally omitting any dimension may damage the pattern of certain clusters. To address these issues, we propose FIRD, a novel unsupervised generative framework that uses adversarial distributions to fit and disentangle the heterogeneous statistical patterns. When applied to discrete spaces, FIRD effectively distinguishes synchronized fraudsters from normal users. FIRD also outperforms SOTA anomaly detection methods on anomaly detection datasets (over 5% average AUC improvement). Extensive experimental results on various datasets verify that the proposed method better models the heterogeneous statistical patterns in high-dimensional data and benefits downstream applications.


From Spectrum Wavelet to Vertex Propagation: Graph Convolutional Networks Based on Taylor Approximation

arXiv.org Artificial Intelligence

Graph convolutional networks (GCNs) have recently been applied to semi-supervised classification problems with few labeled data and high-dimensional features. Existing GCNs mostly rely on a first-order Chebyshev approximation of the graph wavelet kernels. Such a generic propagation model may not always be well suited for a given dataset. This work revisits the fundamentals of graph wavelets and explores the utility of spectral wavelet kernels for signal propagation in the vertex domain. We first derive the conditions for representing graph wavelet kernels via vertex propagation. We then propose alternative propagation models for GCN layers based on Taylor expansions and analyze the choices of detailed propagation models. We test the proposed Taylor-based GCN (TGCN) on citation networks and 3D point clouds to demonstrate its advantages over traditional GCN methods.
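
The abstract does not spell out the propagation model, but the idea of replacing a first-order Chebyshev filter with a truncated Taylor expansion of a wavelet kernel can be sketched as below. This is a minimal illustration, assuming a heat-kernel-style wavelet exp(-tL) expanded to a chosen order; the paper's actual TGCN layers may use different kernels and coefficients.

```python
import torch
import torch.nn as nn

class TaylorPropagationLayer(nn.Module):
    """Hypothetical GCN layer: propagate features with a truncated
    Taylor expansion of a heat-kernel-style graph wavelet exp(-t L).
    Illustrative only; the paper's exact propagation model may differ."""
    def __init__(self, in_dim, out_dim, order=3, t=1.0):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
        self.order = order
        self.t = t

    def forward(self, x, lap):
        # lap: normalized graph Laplacian (n x n); x: node features (n x in_dim)
        h = self.lin(x)
        out = h.clone()            # k = 0 term of the expansion
        term = h
        coef = 1.0
        for k in range(1, self.order + 1):
            coef *= -self.t / k    # accumulates (-t)^k / k!
            term = lap @ term      # builds L^k h iteratively
            out = out + coef * term
        return torch.relu(out)
```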


A Hybrid Evolutionary Algorithm for Reliable Facility Location Problem

arXiv.org Artificial Intelligence

The reliable facility location problem (RFLP) is an important topic in operations research and plays a vital role in the decision-making and management of modern supply chains and logistics. By solving the RFLP, a decision-maker can obtain reliable location decisions under the risk of facility disruptions or failures. In this paper, we propose a novel model for the RFLP. Instead of allocating a fixed number of facilities to each customer as in existing works, we set the number of allocated facilities as an independent decision variable, which makes our model closer to real-life scenarios but harder to solve with traditional methods. To handle it, we propose EAMLS, a hybrid evolutionary algorithm that combines a memorable local search (MLS) method with an evolutionary algorithm (EA). Additionally, we propose a novel metric called the l3-value to help analyze the algorithm's convergence speed and examine the evolution process. The experimental results show the effectiveness and superior performance of EAMLS on large-scale problems, compared to a CPLEX solver and a genetic algorithm (GA).
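
The abstract names the hybrid structure (EA plus local search) without detailing the operators. The skeleton below is a generic memetic-algorithm sketch in that spirit, with all operators left as placeholder callables; the paper's MLS additionally memorizes visited solutions, which is not reproduced here.

```python
import random

def eamls_sketch(pop_size, generations, init, crossover, mutate,
                 local_search, fitness):
    """Generic memetic-algorithm skeleton: an evolutionary loop whose
    offspring are refined by a local search before reinsertion.
    All operators are hypothetical placeholders, not the paper's."""
    pop = [init() for _ in range(pop_size)]
    for _ in range(generations):
        parents = random.sample(pop, 2)
        child = mutate(crossover(*parents))
        child = local_search(child)          # the hybrid refinement step
        worst = min(range(pop_size), key=lambda i: fitness(pop[i]))
        if fitness(child) > fitness(pop[worst]):
            pop[worst] = child               # steady-state replacement
    return max(pop, key=fitness)
```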


Image Augmentations for GAN Training

arXiv.org Machine Learning

Data augmentations have been widely studied to improve the accuracy and robustness of classifiers. However, the potential of image augmentation in improving GAN models for image synthesis has not been thoroughly investigated in previous studies. In this work, we systematically study the effectiveness of various existing augmentation techniques for GAN training in a variety of settings. We provide insights and guidelines on how to augment images for both vanilla GANs and GANs with regularizations, improving the fidelity of the generated images substantially. Surprisingly, we find that vanilla GANs attain generation quality on par with recent state-of-the-art results if we use augmentations on both real and generated images. When this GAN training is combined with other augmentation-based regularization techniques, such as contrastive loss and consistency regularization, the augmentations further improve the quality of generated images. We provide new state-of-the-art results for conditional generation on CIFAR-10 with both consistency loss and contrastive loss as additional regularizations.
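The headline recipe, augmenting both real and generated images before the discriminator, can be sketched as follows. This is a minimal sketch assuming a standard non-saturating GAN loss and an arbitrary stochastic `augment` transform; it is not the paper's exact training code.

```python
import torch
import torch.nn.functional as F

def d_loss_with_augmentation(D, G, real, z, augment):
    """Sketch: apply the same stochastic augmentation to BOTH real and
    generated images before the discriminator sees them. Losses here
    are the standard non-saturating ones, chosen for illustration."""
    fake = G(z).detach()
    d_real = D(augment(real))
    d_fake = D(augment(fake))
    return F.softplus(-d_real).mean() + F.softplus(d_fake).mean()
```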


Improved Consistency Regularization for GANs

arXiv.org Machine Learning

Recent work has increased the performance of Generative Adversarial Networks (GANs) by enforcing a consistency cost on the discriminator. We improve on this technique in several ways. We first show that consistency regularization can introduce artifacts into the GAN samples and explain how to fix this issue. We then propose several modifications to the consistency regularization procedure designed to improve its performance. We carry out extensive experiments quantifying the benefit of our improvements. For unconditional image synthesis on CIFAR-10 and CelebA, our modifications yield the best known FID scores on various GAN architectures. For conditional image synthesis on CIFAR-10, we improve the state-of-the-art FID score from 11.48 to 9.21. Finally, on ImageNet-2012, we apply our technique to the original BigGAN model and improve the FID from 6.66 to 5.38, which is the best score at that model size.
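One of the modifications the abstract alludes to is regularizing the discriminator's consistency on generated images as well as real ones, since regularizing only real images can leak augmentation artifacts into samples. The sketch below shows a balanced consistency-regularization term under that assumption; the weights and exact formulation are illustrative, not the paper's tuned values.

```python
def balanced_consistency_cost(D, real, fake, augment,
                              lam_real=10.0, lam_fake=10.0):
    """Sketch of a balanced consistency term: penalize the discriminator
    for changing its output under an augmentation of real AND generated
    images. Hypothetical weights; not the paper's exact procedure."""
    cost_real = ((D(real) - D(augment(real))) ** 2).mean()
    cost_fake = ((D(fake) - D(augment(fake))) ** 2).mean()
    return lam_real * cost_real + lam_fake * cost_fake
```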


FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

arXiv.org Machine Learning

Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance. In this paper, we demonstrate the power of a simple combination of two common SSL methods: consistency regularization and pseudo-labeling. Our algorithm, FixMatch, first generates pseudo-labels using the model's predictions on weakly-augmented unlabeled images. For a given image, the pseudo-label is only retained if the model produces a high-confidence prediction. The model is then trained to predict the pseudo-label when fed a strongly-augmented version of the same image. Despite its simplicity, we show that FixMatch achieves state-of-the-art performance across a variety of standard semi-supervised learning benchmarks, including 94.93% accuracy on CIFAR-10 with 250 labels and 88.61% accuracy with just 40 labels -- 4 labels per class. Since FixMatch bears many similarities to existing SSL methods that achieve worse performance, we carry out an extensive ablation study to tease apart the experimental factors that are most important to FixMatch's success. We make our code available at https://github.com/google-research/fixmatch.
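
The unlabeled-data objective described above is simple enough to sketch directly. This follows the abstract's description (pseudo-label from a weak view, confidence threshold, cross-entropy on a strong view); details such as the specific augmentations and EMA evaluation are omitted, and the threshold value is an assumption.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, u_batch, weak_aug, strong_aug, tau=0.95):
    """Sketch of FixMatch's unlabeled loss: pseudo-label a weakly
    augmented view, keep it only above confidence tau, then train the
    model to predict it on a strongly augmented view."""
    with torch.no_grad():
        probs = F.softmax(model(weak_aug(u_batch)), dim=-1)
        conf, pseudo = probs.max(dim=-1)
        mask = (conf >= tau).float()      # retain high-confidence labels only
    logits = model(strong_aug(u_batch))
    per_example = F.cross_entropy(logits, pseudo, reduction="none")
    return (mask * per_example).mean()
```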


MANELA: A Multi-Agent Algorithm for Learning Network Embeddings

arXiv.org Artificial Intelligence

Machine learning has a long history of being applied to networks for multifarious tasks and plays an essential role in data mining. However, the discrete and sparse nature of networks often makes it difficult to apply machine learning to them directly. To circumvent this difficulty, one major school of thought is to approach networks via network embeddings. On the one hand, network embeddings have achieved huge success on aggregated network data in recent years. On the other hand, learning network embeddings on distributively stored networks remains understudied: to the best of our knowledge, all existing algorithms for learning network embeddings have hitherto been exclusively centralized and thus cannot be applied to such networks. To accommodate distributively stored networks, we propose a multi-agent model in this paper. Under this model, we develop the multi-agent network embedding learning algorithm (MANELA). We demonstrate MANELA's advantages over existing centralized network embedding learning algorithms both theoretically and experimentally.


Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models

arXiv.org Machine Learning

We introduce a new local sparse attention layer that preserves two-dimensional geometry and locality. We show that by simply replacing the dense attention layer of SAGAN with our construction, we obtain significant improvements in FID, Inception score, and visual quality. The FID score improves from 18.65 to 15.94 on ImageNet, with all other parameters kept the same. The sparse attention patterns that we propose for our new layer are designed using a novel information-theoretic criterion based on information flow graphs. We also present a novel way to invert Generative Adversarial Networks with attention. Our method extracts a saliency map from the attention layer of the discriminator, which we use to construct a new loss function for the inversion. This allows us to visualize the newly introduced attention heads and show that they indeed capture interesting aspects of the two-dimensional geometry of real images.
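
To make "two-dimensional locality" concrete, the sketch below builds a boolean mask restricting each pixel to attend within a small 2D window. This is only an illustration of locality-preserving masking; the paper's actual patterns are derived from its information-flow criterion, and the Chebyshev-distance window here is an assumption.

```python
import torch

def local_2d_attention_mask(h, w, radius=1):
    """Sketch: position (i, j) may attend only to positions within a
    Chebyshev-distance `radius` window, preserving 2D locality.
    Returns a boolean (h*w, h*w) mask for attention logits."""
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=1)  # (h*w, 2)
    diff = (coords[:, None, :] - coords[None, :, :]).abs().max(dim=-1).values
    return diff <= radius
```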


ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring

arXiv.org Machine Learning

We improve the recently proposed "MixMatch" semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring. Distribution alignment encourages the marginal distribution of predictions on unlabeled data to be close to the marginal distribution of ground-truth labels. Augmentation anchoring feeds multiple strongly augmented versions of an input into the model and encourages each output to be close to the prediction for a weakly-augmented version of the same input. To produce strong augmentations, we propose a variant of AutoAugment which learns the augmentation policy while the model is being trained. Our new algorithm, dubbed ReMixMatch, is significantly more data-efficient than prior work, requiring between 5x and 16x less data to reach the same accuracy. For example, on CIFAR-10 with 250 labeled examples we reach 93.73% accuracy.
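
Distribution alignment as described above reduces to a simple reweighting of each prediction, sketched below. Tensor shapes and the running-average estimate of the model's prediction marginal are assumptions consistent with the abstract's description.

```python
def distribution_alignment(pred, label_marginal, running_pred_marginal):
    """Sketch of distribution alignment: scale each unlabeled prediction
    by the ratio of the ground-truth label marginal to a running average
    of the model's predictions, then renormalize.
    pred: (batch, classes); both marginals: (classes,) tensors."""
    aligned = pred * (label_marginal / running_pred_marginal)
    return aligned / aligned.sum(dim=-1, keepdim=True)
```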


Small-GAN: Speeding Up GAN Training Using Core-sets

arXiv.org Machine Learning

Recent work by Brock et al. (2018) suggests that Generative Adversarial Networks (GANs) benefit disproportionately from large mini-batch sizes. Unfortunately, using large batches is slow and expensive on conventional hardware. Thus, it would be nice if we could generate batches that were effectively large though actually small. In this work, we propose a method to do this, inspired by the use of Coreset-selection in active learning. When training a GAN, we draw a large batch of samples from the prior and then compress that batch using Coreset-selection. To create effectively large batches of 'real' images, we create a cached dataset of Inception activations of each training image, randomly project them down to a smaller dimension, and then use Coreset-selection on those projected activations at training time. We conduct experiments showing that this technique substantially reduces training time and memory usage for modern GAN variants, that it reduces the fraction of dropped modes in a synthetic dataset, and that it allows GANs to reach a new state of the art in anomaly detection.
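
The compression step the abstract describes is a greedy k-center style Coreset-selection, sketched below on a generic point set. This is a minimal sketch of only the selection routine; applying it to prior samples or to randomly projected Inception activations, as the paper does, is left out, and the random first center is an assumption.

```python
import numpy as np

def greedy_coreset(points, k):
    """Sketch of greedy k-center Coreset-selection used to 'compress' a
    large batch: repeatedly pick the point farthest from the current
    selection. points: (n, d) array; returns k selected rows."""
    n = len(points)
    chosen = [np.random.randint(n)]                  # arbitrary first center
    dists = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(dists.argmax())                    # farthest remaining point
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(points - points[nxt], axis=1))
    return points[chosen]
```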