 Generative AI


Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models

Neural Information Processing Systems

Many applications of generative models rely on the marginalization of their high-dimensional output probability distributions. Normalization functions that yield sparse probability distributions can make exact marginalization more computationally tractable. However, sparse normalization functions usually require alternative loss functions for training since the log-likelihood is undefined for sparse probability distributions. In this work, we present ev-softmax, a sparse normalization function that preserves the multimodality of probability distributions. We derive its properties, including its gradient in closed form, and introduce a continuous family of approximations to ev-softmax that have full support and can be trained with probabilistic loss functions such as negative log-likelihood and Kullback-Leibler divergence.
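The point that log-likelihood breaks down under sparse outputs can be illustrated with a minimal sketch. The example uses sparsemax, a well-known sparse normalization swapped in here purely for illustration; it is not ev-softmax, whose definition is not given in this abstract. The function names and the logits are hypothetical.

```python
import math

def softmax(logits):
    # Dense normalization: every class receives strictly positive mass.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparsemax(logits):
    # Euclidean projection of the logits onto the probability simplex.
    # Stand-in sparse normalization, NOT ev-softmax.
    z = sorted(logits, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, zk in enumerate(z, start=1):
        cumsum += zk
        if 1 + k * zk > cumsum:
            # The last k that satisfies the condition fixes the threshold.
            tau = (cumsum - 1) / k
    return [max(v - tau, 0.0) for v in logits]

def nll(probs, target):
    # Negative log-likelihood of the target class; infinite at zero mass.
    p = probs[target]
    return math.inf if p == 0.0 else -math.log(p)

logits = [1.0, 0.8, -1.0]          # hypothetical scores for three classes
dense = softmax(logits)
sparse = sparsemax(logits)

print("softmax  :", dense, "NLL of class 2:", nll(dense, 2))
print("sparsemax:", sparse, "NLL of class 2:", nll(sparse, 2))
```

Under these assumptions the sparse output assigns exactly zero mass to the last class, so its negative log-likelihood is infinite, while the softmax loss stays finite. This is the training obstacle that full-support approximations are meant to avoid.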


Multi-objects Generation with Amortized Structural Regularization

Neural Information Processing Systems

Deep generative models (DGMs) have shown promise in image generation. However, most of the existing methods learn a model by simply optimizing a divergence between the marginal distributions of the model and the data, and often fail to capture rich structures, such as attributes of objects and their relationships, in an image. Human knowledge is a crucial element to the success of DGMs to infer these structures, especially in unsupervised learning. In this paper, we propose amortized structural regularization (ASR), which adopts posterior regularization (PR) to embed human knowledge into DGMs via a set of structural constraints. We derive a lower bound of the regularized log-likelihood in PR and adopt the amortized inference technique to jointly optimize the generative model and an auxiliary recognition model for inference efficiently. Empirical results show that ASR outperforms the DGM baselines in terms of inference performance and sample quality.


Sample-Efficient Optimization in the Latent Space of Deep Generative Models via Weighted Retraining

Neural Information Processing Systems

Many important problems in science and engineering, such as drug design, involve optimizing an expensive black-box objective function over a complex, high-dimensional, and structured input space. Although machine learning techniques have shown promise in solving such problems, existing approaches substantially lack sample efficiency. We introduce an improved method for efficient black-box optimization, which performs the optimization in the low-dimensional, continuous latent manifold learned by a deep generative model. In contrast to previous approaches, we actively steer the generative model to maintain a latent manifold that is highly useful for efficiently optimizing the objective. We achieve this by periodically retraining the generative model on the data points queried along the optimization trajectory, as well as weighting those data points according to their objective function value.
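As a rough sketch of weighting data points by their objective value, the snippet below assigns each queried point a normalized weight that decreases with its rank. The rank-based formula, the parameter `k`, and the function name `rank_weights` are assumptions made for illustration; the abstract does not specify the weighting scheme.

```python
def rank_weights(scores, k=1e-3):
    """Return normalized weights, larger for higher objective values.

    Assumed scheme: weight proportional to 1 / (k * n + rank), where rank 0
    is the best point. Smaller k concentrates weight on the top points.
    """
    n = len(scores)
    # Indices sorted from best (highest score) to worst.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    raw = [0.0] * n
    for rank, i in enumerate(order):
        raw[i] = 1.0 / (k * n + rank)
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical objective values for five queried data points.
scores = [0.2, 1.5, 0.9, -0.3, 1.1]
weights = rank_weights(scores, k=0.1)
for s, w in zip(scores, weights):
    print(f"score={s:+.1f}  weight={w:.3f}")
```

In a retraining loop, weights of this kind could be passed to a weighted sampler or used to scale per-example losses, so that the generative model is fitted mostly to high-scoring points.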


VAEM: a Deep Generative Model for Heterogeneous Mixed Type Data

Neural Information Processing Systems

Deep generative models often perform poorly in real-world applications due to the heterogeneity of natural data sets. Heterogeneity arises from data containing different types of features (categorical, ordinal, continuous, etc.) and features of the same type having different marginal distributions. We propose an extension of variational autoencoders (VAEs) called VAEM to handle such heterogeneous data. VAEM is a deep generative model that is trained in a two-stage manner, such that the first stage provides a more uniform representation of the data to the second stage, thereby sidestepping the problems caused by heterogeneous data. We provide extensions of VAEM to handle partially observed data, and demonstrate its performance in data generation, missing data prediction and sequential feature selection tasks.


Bridging Explicit and Implicit Deep Generative Models via Neural Stein Estimators

Neural Information Processing Systems

There are two types of deep generative models: explicit and implicit. The former defines an explicit density form that allows likelihood inference, while the latter targets a flexible transformation from random noise to generated samples. Although the two classes of generative models have shown great power in many applications, each, when used alone, suffers from its own limitations and drawbacks. To take full advantage of both models and enable mutual compensation, we propose a novel joint training framework that bridges an explicit (unnormalized) density estimator and an implicit sample generator via Stein discrepancy. We show that our method 1) induces novel mutual regularization via kernel Sobolev norm penalization and Moreau-Yosida regularization, and 2) stabilizes the training dynamics.


TikTok owner ByteDance unveils its first earbuds in China

Engadget

ByteDance is pushing further into hardware with its first set of earbuds. The TikTok owner's $170 Ola Friend buds will only be available in China for now. They link to the company's generative AI assistant, Doubao, which can be activated with a voice command. According to the South China Morning Post, wearers will be able to use the open-ear buds as a tour guide while traveling and to practice English (among other things) after they're paired with the Doubao mobile app. The publication notes that Doubao is currently the most popular GAI app in China, with more than 47 million monthly active users in September.


ODE2VAE: Deep generative second order ODEs with Bayesian neural networks

Neural Information Processing Systems

Leveraging the advances in deep generative models, ODE2VAE can simultaneously learn the embedding of high dimensional trajectories and infer arbitrarily complex continuous-time latent dynamics. Our model explicitly decomposes the latent space into momentum and position components and solves a second order ODE system, which is in contrast to recurrent neural network (RNN) based time series models and recently proposed black-box ODE techniques. In order to account for uncertainty, we propose probabilistic latent ODE dynamics parameterized by deep Bayesian neural networks. We demonstrate our approach on motion capture, image rotation, and bouncing balls datasets. We achieve state-of-the-art performance in long term motion prediction and imputation tasks.
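The position/momentum decomposition can be sketched as a pair of coupled first-order equations, ds/dt = v and dv/dt = f(s, v). In the minimal example below, the acceleration `f` is a hand-written stand-in (a unit harmonic oscillator) for the Bayesian neural network described in the abstract, and the explicit Euler integrator, step size, and function names are assumptions chosen for brevity.

```python
import math

def integrate_second_order(s0, v0, accel, dt, steps):
    """Integrate ds/dt = v, dv/dt = accel(s, v) with explicit Euler steps.

    s is the position component and v the momentum (velocity) component
    of a one-dimensional latent state.
    """
    s, v = s0, v0
    for _ in range(steps):
        a = accel(s, v)
        # Update both components from the state at the start of the step.
        s, v = s + dt * v, v + dt * a
    return s, v

def oscillator(s, v):
    # Hypothetical dynamics: a unit harmonic oscillator, d^2 s / dt^2 = -s.
    return -s

dt = 1e-3
steps = int(round(math.pi / dt))   # integrate for roughly half a period
s_end, v_end = integrate_second_order(1.0, 0.0, oscillator, dt, steps)
print("final position:", s_end, "final velocity:", v_end)
```

Starting from position 1 and zero velocity, the exact solution is s(t) = cos(t), so after about half a period the position should be close to -1. Explicit Euler slowly inflates the oscillation energy, which is why a small step size is used here; an adaptive solver would be a more robust choice in practice.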


The AI Revolution Is Coming for Your Non-Union Job

TIME - Tech

During this election cycle, we've heard a lot from the presidential candidates about the struggles of America's workers and their families. Kamala Harris and Donald Trump each want to claim the mantle as the country's pro-worker candidate. Accordingly, union leaders took the stage not only at the Democratic National Convention, as usual, but at the Republican convention too. At the VP debate, J.D. Vance and Tim Walz offered competing views on how best to support workers. Surprisingly, one economic issue the candidates have yet to address is one in which millions of voters have a great deal at stake: the looming impact of new generative artificial intelligence (GenAI) technologies on work and livelihoods. The candidates' silence belies a stark reality: the next president will take office in a world already changed by GenAI--and heading for much greater disruption.


IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI

Neural Information Processing Systems

Diffusion-based image generation models, such as Stable Diffusion or DALL·E 2, are able to learn from given images and generate high-quality samples following the guidance from prompts. For instance, they can be used to create artistic images that mimic the style of an artist based on his/her original artworks or to maliciously edit the original images for fake content. However, such ability also brings serious ethical issues without proper authorization from the owner of the original images. In response, several attempts have been made to protect the original images from such unauthorized data usage by adding imperceptible perturbations, which are designed to mislead the diffusion model and make it unable to properly generate new samples. In this work, we introduce a perturbation purification platform, named IMPRESS, to evaluate the effectiveness of imperceptible perturbations as a protective measure. IMPRESS is based on the key observation that imperceptible perturbations could lead to a perceptible inconsistency between the original image and the diffusion-reconstructed image. This inconsistency can be used to devise a new optimization strategy for purifying the image, which may weaken the protection of the original image from unauthorized data usage (e.g., style mimicking, malicious editing). The proposed IMPRESS platform offers a comprehensive evaluation of several contemporary protection methods, and can be used as an evaluation platform for future protection methods.


Further Analysis of Outlier Detection with Deep Generative Models

Neural Information Processing Systems

The recent, counter-intuitive discovery that deep generative models (DGMs) can frequently assign a higher likelihood to outliers has implications for both outlier detection applications and our overall understanding of generative modeling. In this work, we present a possible explanation for this phenomenon, starting from the observation that a model's typical set and high-density region may not coincide. From this vantage point we propose a novel outlier test, the empirical success of which suggests that the failure of existing likelihood-based outlier tests does not necessarily imply that the corresponding generative model is uncalibrated. We also conduct additional experiments to help disentangle the impact of low-level texture versus high-level semantics in differentiating outliers. In aggregate, these results suggest that modifications to the standard evaluation practices and benchmarks commonly applied in the literature are needed.
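The gap between a typical set and a high-density region can be seen in a standard textbook case, sketched below. A high-dimensional standard Gaussian is used as a stand-in for a deep generative model; the dimension, sample count, seed, and function names are arbitrary choices for illustration and do not come from the paper.

```python
import math
import random

def log_density(x):
    # Log density of a standard Gaussian N(0, I) in len(x) dimensions.
    d = len(x)
    return -0.5 * d * math.log(2 * math.pi) - 0.5 * sum(v * v for v in x)

random.seed(0)
d = 1000            # dimension (arbitrary)
n_samples = 200     # number of draws (arbitrary)

samples = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_samples)]
norms = [math.sqrt(sum(v * v for v in x)) for x in samples]

mode_logp = log_density([0.0] * d)                  # density peak at the origin
best_sample_logp = max(log_density(x) for x in samples)

print("sqrt(d):", math.sqrt(d))
print("sample norms range:", min(norms), "to", max(norms))
print("log density at the mode:", mode_logp)
print("highest log density among samples:", best_sample_logp)
```

Every draw lands in a thin shell at distance about sqrt(d) from the origin, yet the origin has a higher density than any of the draws. A point near the mode would therefore pass a test that thresholds likelihood alone even though the model essentially never generates such a point, which is one way a likelihood-based outlier test can fail.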