AITopics | hico-det

Collaborating Authors

hico-det

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Flexible Visual Relationship Segmentation

Neural Information Processing SystemsMar-22-2026, 09:36:05 GMT

Visual relationship understanding has been studied separately in human-object interaction(HOI) detection, scene graph generation(SGG), and referring relationships(RR) tasks. Given the complexity and interconnectedness of these tasks, it is crucial to have a flexible framework that can effectively address these tasks in a cohesive manner.In this work, we propose FleVRS, a single model that seamlessly integrates the above three aspects in standard and promptable visual relationship segmentation, and further possesses the capability for open-vocabulary segmentation to adapt to novel scenarios. FleVRS leverages the synergy between text and image modalities, to ground various types of relationships from images and use textual features from vision-language models to visual conceptual understanding.Empirical validation across various datasets demonstrates that our framework outperforms existing models in standard, promptable, and open-vocabulary tasks, e.g., +1.9 $mAP$ on HICO-DET, +11.4 $Acc$ on VRD, +4.7 $mAP$ on unseen HICO-DET.Our FleVRS represents a significant step towards a more intuitive, comprehensive, and scalable understanding of visual relationships.

artificial intelligence, name change, proceedings, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

f37347375d8b54e3203e5d24aeb6c58c-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 20:31:15 GMT

annotation, hico-det, rlip-parsed, (13 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

UniHOI: Unified Human-Object Interaction Understanding via Unified Token Space

Yang, Panqi, Jing, Haodong, Zheng, Nanning, Ma, Yongqiang

arXiv.org Artificial IntelligenceNov-20-2025

In the field of human-object interaction (HOI), detection and generation are two dual tasks that have traditionally been addressed separately, hindering the development of comprehensive interaction understanding. To address this, we propose UniHOI, which jointly models HOI detection and generation via a unified token space, thereby effectively promoting knowledge sharing and enhancing generalization. Specifically, we introduce a symmetric interaction-aware attention module and a unified semi-supervised learning paradigm, enabling effective bidirectional mapping between images and interaction semantics even under limited annotations. Extensive experiments demonstrate that UniHOI achieves state-of-the-art performance in both HOI detection and generation. Specifically, UniHOI improves accuracy by 4.9% on long-tailed HOI detection and boosts interaction metrics by 42.0% on open-vocabulary generation tasks.

artificial intelligence, hoi detection, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2511.15046

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Add feedback

f37347375d8b54e3203e5d24aeb6c58c-Paper-Conference.pdf

Neural Information Processing SystemsSep-27-2025, 15:48:48 GMT

artificial intelligence, computer vision, machine learning, (14 more...)

Neural Information Processing Systems

Country: Asia (0.14)

Industry: Energy > Oil & Gas (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Towards Flexible Visual Relationship Segmentation

Neural Information Processing SystemsMay-27-2025, 15:28:53 GMT

Visual relationship understanding has been studied separately in human-object interaction(HOI) detection, scene graph generation(SGG), and referring relationships(RR) tasks. Given the complexity and interconnectedness of these tasks, it is crucial to have a flexible framework that can effectively address these tasks in a cohesive manner.In this work, we propose FleVRS, a single model that seamlessly integrates the above three aspects in standard and promptable visual relationship segmentation, and further possesses the capability for open-vocabulary segmentation to adapt to novel scenarios. FleVRS leverages the synergy between text and image modalities, to ground various types of relationships from images and use textual features from vision-language models to visual conceptual understanding.Empirical validation across various datasets demonstrates that our framework outperforms existing models in standard, promptable, and open-vocabulary tasks, e.g., 1.9 mAP on HICO-DET, 11.4 Acc on VRD, 4.7 mAP on unseen HICO-DET.Our FleVRS represents a significant step towards a more intuitive, comprehensive, and scalable understanding of visual relationships.

artificial intelligence, flexible visual relationship segmentation, hico-det

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

RLIPv2: Fast Scaling of Relational Language-Image Pre-training

Yuan, Hangjie, Zhang, Shiwei, Wang, Xiang, Albanie, Samuel, Pan, Yining, Feng, Tao, Jiang, Jianwen, Ni, Dong, Zhang, Yingya, Zhao, Deli

arXiv.org Artificial IntelligenceAug-18-2023

Relational Language-Image Pre-training (RLIP) aims to align vision representations with relational texts, thereby advancing the capability of relational reasoning in computer vision tasks. However, hindered by the slow convergence of RLIPv1 architecture and the limited availability of existing scene graph data, scaling RLIPv1 is challenging. In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data. To enable fast scaling, RLIPv2 introduces Asymmetric Language-Image Fusion (ALIF), a mechanism that facilitates earlier and deeper gated cross-modal fusion with sparsified language encoding layers. ALIF leads to comparable or better performance than RLIPv1 in a fraction of the time for pre-training and fine-tuning. To obtain scene graph data at scale, we extend object detection datasets with free-form relation labels by introducing a captioner (e.g., BLIP) and a designed Relation Tagger. The Relation Tagger assigns BLIP-generated relation texts to region pairs, thus enabling larger-scale relational pre-training. Through extensive experiments conducted on Human-Object Interaction Detection and Scene Graph Generation, RLIPv2 shows state-of-the-art performance on three benchmarks under fully-finetuning, few-shot and zero-shot settings. Notably, the largest RLIPv2 achieves 23.29mAP on HICO-DET without any fine-tuning, yields 32.22mAP with just 1% data and yields 45.09mAP with 100% data. Code and models are publicly available at https://github.com/JacobYuan7/RLIPv2.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2308.09351

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Singapore (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.34)

Add feedback

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

Li, Yong-Lu, Liu, Xinpeng, Wu, Xiaoqian, Huang, Xijie, Xu, Liang, Lu, Cewu

arXiv.org Artificial IntelligenceJan-25-2021

Abstract--Human-Object Interaction (HOI) detection is an important problem to understand how humans interact with objects. In this paper, we explore interactiveness knowledge which indicates whether a human and an object interact with each other or not. We found that interactiveness knowledge can be learned across HOI datasets and bridge the gap between diverse HOI category settings. Our core idea is to exploit an interactiveness network to learn the general interactiveness knowledge from multiple HOI datasets and perform Non-Interaction Suppression (NIS) before HOI classification in inference. On account of the generalization ability of interactiveness, interactiveness network is a transferable knowledge learner and can be cooperated with any HOI detection models to achieve desirable results. We utilize the human instance and body part features together to learn the interactiveness in hierarchical paradigm, i.e., instance-level and body part-level interactivenesses. Thereafter, a consistency task is proposed to guide the learning and extract deeper interactive visual clues. We extensively evaluate the proposed method on HICO-DET, V-COCO, and a newly constructed PaStaNet-HOI dataset. With the learned interactiveness, our method outperforms state-of-the-art HOI detection methods, verifying its efficacy and flexibility.

detection, interactiveness, interactiveness knowledge, (14 more...)

arXiv.org Artificial Intelligence

2101.10292

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback