Sensing and Signal Processing

POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images David Hurych

Neural Information Processing Systems

We describe an approach to predict an open-vocabulary 3D semantic voxel occupancy map from input 2D images with the objective of enabling 3D grounding, segmentation and retrieval of free-form language queries. This is a challenging problem because of the 2D-3D ambiguity and the open-vocabulary nature of the target tasks, where obtaining annotated training data in 3D is difficult. The contributions of this work are three-fold.



SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification Benjamin Feuer

Neural Information Processing Systems

Data curation is the problem of how to collect and organize samples into a dataset that supports efficient learning. Despite the centrality of the task, little work has been devoted to a large-scale, systematic comparison of curation methods.


Adaptive Visual Scene Understanding: Incremental Scene Graph Generation College of Computing and Data Science, Nanyang Technological University (NTU), Singapore

Neural Information Processing Systems

Scene graph generation (SGG) analyzes images to extract meaningful information about objects and their relationships. In the dynamic visual world, it is crucial for AI systems to continuously detect new objects and establish their relationships with existing ones. Recently, numerous studies have focused on continual learning within the domains of object detection and image recognition. However, little research addresses the more challenging continual learning problem in SGG. This increased difficulty arises from the intricate interactions and dynamic relationships among objects and their associated contexts. Thus, in continual learning, SGG models are often required to expand, modify, retain, and reason about scene graphs as part of adaptive visual scene understanding.