Emergence of Object Segmentation in Perturbed Generative Models

Adam Bielski, Paolo Favaro

Neural Information Processing Systems 

We introduce a framework to learn object segmentation from a collection of images without any manual annotation. We build on the observation that the location of object segments can be perturbed locally relative to a given background without affecting the realism of a scene. First, we train a generative model of a layered scene. The layered representation consists of a background image, a foreground image, and a foreground mask. A composite image is then obtained by overlaying the masked foreground image onto the background. The generative model is trained in an adversarial fashion against a discriminator, which forces it to produce realistic composite images. To make the generator learn a representation in which the foreground layer corresponds to an object, we perturb its output by randomly shifting both the foreground image and the mask relative to the background. Because the generator is unaware of the shift before computing its output, it must produce layered representations that remain realistic under any such random perturbation. Second, we learn to segment an image by building an autoencoder in which the encoder is trained and the pretrained generator serves as a fixed decoder.
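The compositing and perturbation step can be illustrated with a minimal sketch. The code below is not the authors' implementation; the tensor shapes, the shift range `max_shift`, and the use of a circular shift via `torch.roll` are assumptions made for illustration. The key point is that the foreground and its mask receive the same random offset, chosen after the generator has produced its output, before alpha-compositing onto the background.

```python
import torch


def composite_with_random_shift(background, foreground, mask, max_shift=8):
    """Overlay a masked foreground onto a background after a random 2D shift.

    background, foreground: (B, 3, H, W) image tensors.
    mask: (B, 1, H, W) tensor with values in [0, 1].
    max_shift: maximum shift in pixels along each axis (illustrative value).
    """
    batch_size = background.size(0)
    shifted_fg = torch.empty_like(foreground)
    shifted_mask = torch.empty_like(mask)

    for i in range(batch_size):
        # Sample an independent random offset per sample; the generator
        # cannot anticipate it, so the foreground must be a movable object.
        dx = int(torch.randint(-max_shift, max_shift + 1, (1,)))
        dy = int(torch.randint(-max_shift, max_shift + 1, (1,)))

        # Shift foreground and mask together (circular shift as a stand-in
        # for whatever boundary handling the actual method uses).
        shifted_fg[i] = torch.roll(foreground[i], shifts=(dy, dx), dims=(1, 2))
        shifted_mask[i] = torch.roll(mask[i], shifts=(dy, dx), dims=(1, 2))

    # Alpha-composite: masked foreground over background.
    return shifted_mask * shifted_fg + (1.0 - shifted_mask) * background
```

The composite produced this way is what the discriminator sees, so the generator is rewarded only for layered decompositions that stay realistic regardless of where the foreground lands.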