 adversarial approach


Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach

Neural Information Processing Systems

Structural equation models (SEMs) are widely used in the sciences, ranging from economics to psychology, to uncover causal relationships underlying a complex system under consideration and to estimate structural parameters of interest. We study estimation in a class of generalized SEMs where the object of interest is defined as the solution to a linear operator equation. We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using stochastic gradient descent. We consider both 2-layer and multi-layer NNs with ReLU activation functions and prove global convergence in an overparametrized regime, where the number of neurons is diverging. The results are established using techniques from online learning and local linearization of NNs, and improve on the current state of the art in several aspects. For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
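The min-max formulation in this abstract can be illustrated with a much smaller toy problem. The sketch below is not the paper's procedure: it replaces both neural-network players with single scalars, uses full-batch (deterministic) gradients instead of stochastic ones, and picks an instrumental-variable style moment condition as a stand-in for the linear operator equation. The function name `solve_minmax`, the objective, and the data are all assumptions made for illustration.

```python
def solve_minmax(x, y, z, lr=0.1, steps=2000):
    """Gradient descent-ascent on a toy saddle objective (illustrative only).

    L(theta, w) = w * m(theta) - 0.5 * w**2,
    with m(theta) = mean((y - theta * x) * z).
    Maximizing over w gives 0.5 * m(theta)**2, so the min player
    is driven toward the theta that solves m(theta) = 0.
    """
    n = len(x)
    a = sum(yi * zi for yi, zi in zip(y, z)) / n  # mean(y * z)
    b = sum(xi * zi for xi, zi in zip(x, z)) / n  # mean(x * z)
    theta, w = 0.0, 0.0
    for _ in range(steps):
        moment = a - theta * b      # m(theta)
        grad_theta = -w * b         # dL/dtheta
        grad_w = moment - w         # dL/dw
        theta -= lr * grad_theta    # min player: descent step
        w += lr * grad_w            # max player: ascent step
    return theta, w


# Hypothetical data in which y = 2 * x exactly, so m(theta) = 0 at theta = 2.
x = [1.0, 2.0, 3.0, 4.0]
z = [0.25, 0.25, 0.5, 0.5]
y = [2.0 * xi for xi in x]
theta_hat, w_hat = solve_minmax(x, y, z)
```

In this linear toy the simultaneous updates contract toward the saddle point for a small enough step size; the abstract's contribution is to establish convergence when both players are overparametrized ReLU networks, which this sketch does not attempt to reproduce.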


MetaGAN: An Adversarial Approach to Few-Shot Learning

Neural Information Processing Systems

In this paper, we propose a conceptually simple and general framework called MetaGAN for few-shot learning problems. Most state-of-the-art few-shot classification models can be integrated with MetaGAN in a principled and straightforward way. By introducing an adversarial generator conditioned on tasks, we augment vanilla few-shot classification models with the ability to discriminate between real and fake data. We argue that this GAN-based approach can help few-shot classifiers learn a sharper decision boundary, which could generalize better. We show that with our MetaGAN framework, we can extend supervised few-shot learning models to naturally cope with unsupervised data. Unlike previous work in semi-supervised few-shot learning, our algorithms can deal with semi-supervision at both the sample level and the task level. We give theoretical justifications of the strength of MetaGAN, and validate the effectiveness of MetaGAN on challenging few-shot image classification benchmarks.


Unsupervised Estimation of Nonlinear Audio Effects: Comparing Diffusion-Based and Adversarial Approaches

Moliner, Eloi, Švento, Michal, Wright, Alec, Juvela, Lauri, Rajmic, Pavel, Välimäki, Vesa

arXiv.org Artificial Intelligence

Accurately estimating nonlinear audio effects without access to paired input-output signals remains a challenging problem. This work studies unsupervised probabilistic approaches for solving this task. We introduce a method, novel for this application, based on diffusion generative models for blind system identification, enabling the estimation of unknown nonlinear effects using black- and gray-box models. This study compares this method with a previously proposed adversarial approach, analyzing the performance of both methods under different parameterizations of the effect operator and varying lengths of available effected recordings. Through experiments on guitar distortion effects, we show that the diffusion-based approach provides more stable results and is less sensitive to data availability, while the adversarial approach is superior at estimating more pronounced distortion effects. Our findings contribute to the robust unsupervised blind estimation of audio effects, demonstrating the potential of diffusion models for system identification in music technology.


Bridging the Gap: Unifying the Training and Evaluation of Neural Network Binary Classifiers

Neural Information Processing Systems

How can this training-evaluation gap be addressed? While specific techniques have been adopted to optimize certain confusion-matrix-based metrics, it is challenging or, in some cases, impossible to generalize these techniques to other metrics.


Review for NeurIPS paper: Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach

Neural Information Processing Systems

Summary and Contributions: The paper proposes an adversarial, minimax two-player game approach for optimising the parameters of a generalised structural equation model (SEM) formulated as a saddle-point problem. The generalised SEM is defined in terms of a conditional expectation operator mapping from a Hilbert space of structural functions of interest to a Hilbert space of known or estimated functions of the outcome. These spaces are subsequently chosen to be the space of possible neural networks, and a stochastic primal-dual algorithm is given for finding a solution to the saddle-point problem. Furthermore, the work proves global convergence of the algorithm. This main result is achieved, under certain specific data and weight initialisation conditions, using a regret analysis in the infinite-width limit, in which neural networks behave like linear learners.


Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

Neural Information Processing Systems

In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, known as posterior collapse. To mitigate this, state-of-the-art models 'weaken' the 'powerful decoder' by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.
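The uniform baseline that this abstract describes, random dropout applied to the decoder input, is simple enough to sketch. The snippet below shows only that baseline, not the adversarial, information-based dropout the paper proposes. The function name `word_dropout`, the `<unk>` placeholder, and the choice to replace tokens rather than delete them are assumptions for illustration.

```python
import random


def word_dropout(tokens, rate, unk_token="<unk>", rng=None):
    """Replace each decoder-input token with `unk_token` with probability `rate`.

    Every position is treated identically (uniform dropout), so the
    decoder loses context from its own input and has more reason to
    rely on the latent variable instead.
    """
    rng = rng or random.Random()
    return [unk_token if rng.random() < rate else tok for tok in tokens]


# Hypothetical decoder input; the seed only makes the run repeatable.
sentence = ["the", "cat", "sat", "on", "the", "mat"]
noisy = word_dropout(sentence, rate=0.5, rng=random.Random(0))
```

The sequence length is unchanged, so the output can be fed to the decoder in place of the original input during training. A targeted scheme would replace the constant `rate` with a per-token probability chosen by a learned adversary.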


Reviews: Adversarial Surrogate Losses for Ordinal Regression

Neural Information Processing Systems

The paper proposes an adversarial approach to ordinal regression, building upon recent works along these lines for cost-sensitive losses. The proposed method is shown to be consistent, and to have favourable empirical performance compared to existing methods. The basic idea of the paper is simple yet interesting: since ordinal regression can be viewed as a type of multiclass classification, and the latter has recently been attacked by adversarial learning approaches with some success, one can combine the two to derive adversarial ordinal regression approaches. By itself this would make the contribution a little narrow, but it is further shown that the adversarial loss in this particular problem admits a tractable form (Thm 1), which allows for efficient optimisation. Fisher-consistency of the approach also follows as a consequence of existing results for the cost-sensitive case, which is a salient feature of the approach.


Reviews: MetaGAN: An Adversarial Approach to Few-Shot Learning

Neural Information Processing Systems

This paper proposes a method of improving upon existing meta-learning approaches by augmenting the training with a GAN setup. The basic idea has been explored in the context of semi-supervised learning: add an additional class to the classifier's outputs and train the classifier/discriminator to classify generated data as this additional fake class. This paper extends the reasoning for why it might work for semi-supervised learning to why it might work for few-shot meta-learning. The clarity of this paper could be greatly improved. The authors present many different variants of few-shot learning in supervised and semi-supervised settings, and the notation is a bit tricky to follow initially.
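The "additional fake class" idea mentioned in this review can be sketched as a loss over N+1 outputs. This is a generic illustration of the semi-supervised GAN trick the reviewer refers to, not MetaGAN's actual objective. The function names, the convention that the last logit is the fake class, and the example logits are assumptions.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def discriminator_loss(real_logits, real_label, fake_logits):
    """Loss for an N-way classifier extended with one extra 'fake' output.

    Both logit lists have N + 1 entries; by assumption the last index
    is the fake class. Real examples are pushed toward their true
    class, generated examples toward the fake class.
    """
    fake_class = len(fake_logits) - 1
    return cross_entropy(real_logits, real_label) + cross_entropy(fake_logits, fake_class)


# Hypothetical 5-way task: 5 real classes plus 1 fake class gives 6 logits.
uninformed = [0.0] * 6
loss = discriminator_loss(uninformed, real_label=2, fake_logits=uninformed)
```

With all-zero logits the classifier is maximally uncertain, so each term equals log 6. Training lowers the loss by separating real classes from each other and from generated samples.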