Joint Modeling of Visual Objects and Relations for Scene Graph Generation

Oct-10-2024, 03:21:00 GMT–Neural Information Processing Systems

In-depth scene understanding usually requires recognizing all the objects in an image and their relations, encoded as a scene graph. Most existing approaches to scene graph generation first recognize each object independently and then predict each relation independently. Though these approaches are very efficient, they ignore the dependencies among objects and among their relations. In this paper, we propose a principled approach that jointly predicts the entire scene graph by fully capturing these dependencies. Specifically, we establish a unified conditional random field (CRF) to model the joint distribution of all the objects and their relations in a scene graph.
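To make the contrast between independent and joint prediction concrete, here is a minimal toy sketch of a CRF over a two-object scene graph. It is not the paper's model: the label sets, the score tables `unary` and `triple`, and the helper names are all hypothetical, and the partition function is computed by brute-force enumeration, which is only feasible at this toy scale.

```python
import itertools
import math

# Hypothetical label spaces for a toy image with two detected regions.
OBJECT_LABELS = ["person", "horse"]
RELATION_LABELS = ["riding", "next to"]

# Hypothetical unary scores: how well each label fits each region on its own.
unary = [
    {"person": 1.0, "horse": 0.2},  # region 0
    {"person": 0.6, "horse": 0.5},  # region 1 (ambiguous when viewed alone)
]

# Hypothetical compatibility scores for (subject, relation, object) triples.
# Triples not listed get a score of 0.
triple = {("person", "riding", "horse"): 2.0}


def joint_score(labels, relation):
    """Unnormalized log-score of one full scene graph configuration."""
    score = sum(unary[i][label] for i, label in enumerate(labels))
    score += triple.get((labels[0], relation, labels[1]), 0.0)
    return score


def joint_distribution():
    """Normalize exp(score) over every configuration (brute force)."""
    configs = [
        (labels, relation)
        for labels in itertools.product(OBJECT_LABELS, repeat=2)
        for relation in RELATION_LABELS
    ]
    scores = [joint_score(labels, relation) for labels, relation in configs]
    log_z = math.log(sum(math.exp(s) for s in scores))
    return {cfg: math.exp(s - log_z) for cfg, s in zip(configs, scores)}


dist = joint_distribution()

# Joint prediction: the single most probable scene graph.
map_config = max(dist, key=dist.get)

# Independent prediction: pick each object's best label in isolation.
independent_labels = tuple(max(u, key=u.get) for u in unary)

print("joint:", map_config)
print("independent objects:", independent_labels)
```

In this toy setup the relation potential pulls the ambiguous second region toward "horse", while labeling each region in isolation picks "person" for both. That is the kind of dependency that independent prediction cannot capture.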

relation, scene graph generation, visual object and relation

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.43)