BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
Thu Nguyen-Phuoc, Christian Richardt, Long Mai, Yong-Liang Yang, Niloy Mitra
The computer graphics pipeline has achieved impressive results in generating high-quality images, while offering users a great level of freedom and controllability over the generated images. This has many applications in creating and editing content for the creative industries, such as films, games, scientific visualisation, and more recently, in generating training data for computer vision tasks.
We present BlockGAN, an image generative model that learns object-aware 3D scene representations directly from unlabelled 2D images. Current work on scene representation learning either ignores scene background or treats the whole scene as one object. Meanwhile, work that considers scene compositionality treats scene objects only as image patches or 2D layers with alpha maps. Inspired by the computer graphics pipeline, we design BlockGAN to learn to first generate 3D features of background and foreground objects, then combine them into 3D features for the whole scene, and finally render them into realistic images. This allows BlockGAN to reason over occlusion and interaction between objects' appearance, such as shadow and lighting, and provides control over each object's 3D pose and identity, while maintaining image realism. BlockGAN is trained end-to-end, using only unlabelled single images, without the need for 3D geometry, pose labels, object masks, or multiple views of the same scene. Our experiments show that using explicit 3D features to represent objects allows BlockGAN to learn disentangled representations both in terms of objects (foreground and background) and their properties (pose and identity).
Review for NeurIPS paper: BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
The 3D features are then composed by taking their element-wise maximum and then concatenating the result to form a "unified 3D scene representation". For rendering, this unified scene representation is fed to another neural network (the renderer), which also takes the camera parameters as input and produces a rendering of the scene. The model is trained with a GAN-based objective. The authors evaluate their model on multiple datasets, such as CLEVR, Real-Car, and synthetic cars and chairs. The main experiments they conduct are as follows: 1) to show that the model has learned disentangled representations, they change attributes of the foreground objects or the background and show renderings that reflect those changes; 2) to evaluate visual fidelity, they show that the model achieves nearly the same or lower KID estimates compared to other methods on all datasets; 3) they train a model on scenes with one object but show that this model can generate scenes with up to five objects of the same category with different attributes; 4) they show that they can manipulate some of the attributes (e.g.
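The composition step the review describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature-grid shape, the `compose_scene` helper, and the use of NumPy are all assumptions made for clarity.

```python
import numpy as np

# Hypothetical dimensions for illustration: each object is represented
# by a 3D feature grid of shape (channels, depth, height, width).
C, D, H, W = 8, 4, 16, 16

rng = np.random.default_rng(0)
background = rng.standard_normal((C, D, H, W))
foreground = rng.standard_normal((C, D, H, W))

def compose_scene(object_features):
    """Combine per-object 3D feature grids into one scene grid by
    taking their element-wise maximum, as described in the review."""
    return np.maximum.reduce(object_features)

scene = compose_scene([background, foreground])
assert scene.shape == (C, D, H, W)
# At every voxel the scene keeps the stronger of the two features,
# which lets a dominant (e.g. occluding) object's features win out.
assert np.all(scene >= background) and np.all(scene >= foreground)
```

Because the maximum is taken per element, adding another foreground object is just one more tensor in the list, which is consistent with the paper's ability to render more objects at test time than were seen during training.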