There are multiple and even interacting dimensions along which shape representation schemes may be compared and contrasted. In this paper, we focus on the following question. Are the building blocks in a compositional model localized in space (e.g. as in part based representations) or are they holistic simplifications (e.g. as in spectral representations)? Existing shape representation schemes prefer one or the other. We propose a new shape representation paradigm that encompasses both choices.
Adversarially trained generative models (GANs) have recently achieved compelling image synthesis results. But despite early successes in using GANs for unsupervised representation learning, they have since been superseded by approaches based on self-supervision. In this work we show that progress in image generation quality translates to substantially improved representation learning performance. Our approach, BigBiGAN, builds upon the state-of-the-art BigGAN model, extending it to representation learning by adding an encoder and modifying the discriminator. We extensively evaluate the representation learning and generation capabilities of these BigBiGAN models, demonstrating that these generation-based models achieve the state of the art in unsupervised representation learning on ImageNet, as well as compelling results in unconditional image generation.
Compact representation of objects is a common concept in computer science. Automated planning can be viewed as a case of this concept: a planning instance is a compact implicit representation of a graph and the problem is to find a path (a plan) in this graph. While the graphs themselves are represented compactly as planning instances, the paths are usually represented explicitly as sequences of actions. Some cases are known where the plans always have compact representations, for example, using macros. We show that these results do not extend to the general case, by proving a number of bounds for compact representations of plans under various criteria, like efficient sequential or random access of actions. In addition, we show that our results have consequences for what can be gained from reformulating planning into some other problem. In contrast, we also prove a number of positive results, demonstrating restricted cases where plans do have useful compact representations, as well as proving that macro plans have favourable access properties. Finally, we discuss our results in relation to other relevant contexts.
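To make the idea of a compact plan with random access concrete, here is a minimal sketch, not the construction from the paper. It assumes a macro plan is an acyclic set of named macros, each expanding to a sequence of primitive actions or other macros. The names `macros`, `expanded_lengths` and `action_at` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the macro format and all names are assumptions.

def expanded_lengths(macros):
    """Return the length of each macro's full expansion (macros must be acyclic)."""
    lengths = {}

    def length(symbol):
        if symbol not in macros:      # a primitive action has length 1
            return 1
        if symbol not in lengths:     # memoise so each macro is measured once
            lengths[symbol] = sum(length(s) for s in macros[symbol])
        return lengths[symbol]

    for name in macros:
        length(name)
    return lengths


def action_at(macros, lengths, symbol, i):
    """Return the i-th (0-based) primitive action of `symbol` without expanding it."""
    if i < 0:
        raise IndexError("index out of range")
    while symbol in macros:
        for child in macros[symbol]:
            n = lengths.get(child, 1)
            if i < n:                 # the target lies inside this child
                symbol = child
                break
            i -= n                    # skip past this child entirely
        else:
            raise IndexError("index out of range")
    if i != 0:
        raise IndexError("index out of range")
    return symbol


# 21 macro definitions describe a plan with 2**21 primitive actions.
macros = {"M0": ["a", "b"]}
for k in range(1, 21):
    macros[f"M{k}"] = [f"M{k - 1}", f"M{k - 1}"]

lengths = expanded_lengths(macros)
first = action_at(macros, lengths, "M20", 0)
later = action_at(macros, lengths, "M20", 1_000_001)
```

In this toy instance the lookup descends one macro level per step, so retrieving a single action touches about 21 definitions rather than the two million actions of the explicit plan.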
We would like to learn a representation of the data that reflects the semantics behind a specific grouping of the data, where within a group the samples share a common factor of variation. For example, consider a set of face images grouped by identity. We wish to anchor the semantics of the grouping into a disentangled representation that we can exploit. However, existing deep probabilistic models often assume that the samples are independent and identically distributed, and thereby disregard the grouping information. We present the Multi-Level Variational Autoencoder (ML-VAE), a new deep probabilistic model for learning a disentangled representation of grouped data.
Learning useful representations without supervision remains a key challenge in machine learning. In this paper, we propose a simple yet powerful generative model that learns such discrete representations. Our model, the Vector Quantised-Variational AutoEncoder (VQ-VAE), differs from VAEs in two key ways: the encoder network outputs discrete, rather than continuous, codes; and the prior is learnt rather than static. In order to learn a discrete latent representation, we incorporate ideas from vector quantisation (VQ). Using the VQ method allows the model to circumvent issues of "posterior collapse" (where the latents are ignored when they are paired with a powerful autoregressive decoder) that are typically observed in the VAE framework.
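The discretisation step can be illustrated with a small sketch. The abstract does not spell out the quantisation rule, so this assumes the standard vector quantisation choice: each continuous encoder output is replaced by the index of its nearest codebook entry under squared Euclidean distance. The codebook values, the toy encoder outputs and the function name `quantise` are made up for illustration and are not taken from the paper.

```python
# Illustrative sketch only: nearest-neighbour lookup is an assumed VQ rule,
# and the codebook and inputs below are toy values.

def quantise(vector, codebook):
    """Return (index, code) of the codebook entry nearest to `vector`."""
    best_index = min(
        range(len(codebook)),
        key=lambda k: sum((v - c) ** 2 for v, c in zip(vector, codebook[k])),
    )
    return best_index, codebook[best_index]


# A codebook with three 2-dimensional entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]

# Continuous vectors standing in for encoder outputs.
encoder_outputs = [[0.9, 1.2], [-0.1, 0.2], [-0.8, 1.7]]

# The discrete code for each input is just a codebook index.
indices = [quantise(z, codebook)[0] for z in encoder_outputs]
```

Under this assumption the latent representation of an input is a list of integer indices, which is what allows a separate prior over discrete codes to be learnt.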