AITopics | toist

TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation Supplementary Material

Neural Information Processing Systems, Feb-9-2026, 17:35:38 GMT

As mentioned in Section 3 (formulation) of the main paper, in an input image, it is possible that no objects or multiple objects afford a specific task. As a reminder, we use the whole verb-pronoun (or verb-noun) description as the token span. With probability 0.5, an image is cropped to a random size, where each side is between 384 and 1333 pixels. Both the student and teacher TOIST models are initialized with the model pre-trained by [4]. In an image, the most suitable objects (one or more) for solving the task are selected and their bounding boxes are taken as ground truth labels for detection.
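The cropping augmentation mentioned in the excerpt above could be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_crop_box`, the uniform sampling of side lengths and offsets, and the clamping of the crop to the image size are all hypothetical choices, not details confirmed by the source.

```python
import random


def sample_crop_box(width, height, rng, p=0.5, min_side=384, max_side=1333):
    """Return (left, top, crop_w, crop_h), or None when no crop is applied.

    Assumption: each side is drawn uniformly from [min_side, max_side] and
    clamped so the crop always fits inside the image.
    """
    # With probability 1 - p the image is left uncropped.
    if rng.random() >= p:
        return None
    # Sample each side independently, never exceeding the image itself.
    crop_w = rng.randint(min(min_side, width), min(max_side, width))
    crop_h = rng.randint(min(min_side, height), min(max_side, height))
    # Pick a top-left corner so the box stays inside the image bounds.
    left = rng.randint(0, width - crop_w)
    top = rng.randint(0, height - crop_h)
    return left, top, crop_w, crop_h


if __name__ == "__main__":
    print(sample_crop_box(1600, 1200, random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which is convenient when comparing training runs.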

artificial intelligence, detection, specified classes in each task, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.54)
Information Technology > Artificial Intelligence > Vision (0.51)


TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

Neural Information Processing Systems, Feb-9-2026, 17:35:33 GMT

Towards a finer localization that better serves downstream applications like robot interaction, we extend the problem into task oriented instance segmentation.

machine learning, natural language, toist, (20 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)


TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

Neural Information Processing Systems, Dec-24-2025, 10:36:42 GMT

Current referring expression comprehension algorithms can effectively detect or segment objects indicated by nouns, but how to understand verb reference is still under-explored. As such, we study the challenging problem of task oriented detection, which aims to find objects that best afford an action indicated by verbs like sit comfortably on. Towards a finer localization that better serves downstream applications like robot interaction, we extend the problem into task oriented instance segmentation. A unique requirement of this task is to select preferred candidates among possible alternatives. Thus we resort to the transformer architecture which naturally models pair-wise query relationships with attention, leading to the TOIST method. In order to leverage pre-trained noun referring expression comprehension models and the fact that we can access privileged noun ground truth during training, a novel noun-pronoun distillation framework is proposed. Noun prototypes are generated in an unsupervised manner and contextual pronoun features are trained to select prototypes. As such, the network remains noun-agnostic during inference. We evaluate TOIST on the large-scale task oriented dataset COCO-Tasks and achieve +10.9% higher $\rm{mAP^{box}}$ than the best-reported results.

name change, noun-pronoun distillation, segmentation transformer, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)


70270a1bc28ecb2a2aefad566c5e556b-Supplemental-Conference.pdf

Neural Information Processing Systems, Aug-15-2025, 18:12:18 GMT

detection, precision-recall curve, test data, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)


TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

Neural Information Processing Systems, Aug-15-2025, 18:12:12 GMT

Noun prototypes are generated in an unsupervised manner and contextual pronoun features are trained to select prototypes. As such, the network remains noun-agnostic during inference.

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.68)
Overview (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(4 more...)


TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

Neural Information Processing Systems, Oct-11-2024, 13:34:23 GMT

Current referring expression comprehension algorithms can effectively detect or segment objects indicated by nouns, but how to understand verb reference is still under-explored. As such, we study the challenging problem of task oriented detection, which aims to find objects that best afford an action indicated by verbs like sit comfortably on. Towards a finer localization that better serves downstream applications like robot interaction, we extend the problem into task oriented instance segmentation. A unique requirement of this task is to select preferred candidates among possible alternatives. Thus we resort to the transformer architecture which naturally models pair-wise query relationships with attention, leading to the TOIST method.

noun-pronoun distillation, segmentation transformer, toist, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)


TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

Li, Pengfei, Tian, Beiwen, Shi, Yongliang, Chen, Xiaoxue, Zhao, Hao, Zhou, Guyue, Zhang, Ya-Qin

arXiv.org Artificial Intelligence, Oct-19-2022

Current referring expression comprehension algorithms can effectively detect or segment objects indicated by nouns, but how to understand verb reference is still under-explored. As such, we study the challenging problem of task oriented detection, which aims to find objects that best afford an action indicated by verbs like sit comfortably on. Towards a finer localization that better serves downstream applications like robot interaction, we extend the problem into task oriented instance segmentation. A unique requirement of this task is to select preferred candidates among possible alternatives. Thus we resort to the transformer architecture which naturally models pair-wise query relationships with attention, leading to the TOIST method. In order to leverage pre-trained noun referring expression comprehension models and the fact that we can access privileged noun ground truth during training, a novel noun-pronoun distillation framework is proposed. Noun prototypes are generated in an unsupervised manner and contextual pronoun features are trained to select prototypes. As such, the network remains noun-agnostic during inference. We evaluate TOIST on the large-scale task oriented dataset COCO-Tasks and achieve +10.9% higher $\rm{mAP^{box}}$ than the best-reported results. The proposed noun-pronoun distillation can boost $\rm{mAP^{box}}$ and $\rm{mAP^{mask}}$ by +2.8% and +3.8%. Codes and models are publicly available at https://github.com/AIR-DISCOVER/TOIST.
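The abstract says that contextual pronoun features are trained to select noun prototypes, but it does not say how the selection is made. A minimal sketch of one plausible reading is given below, assuming the prototypes are fixed vectors (for example, cluster centres of noun features), that selection means picking the nearest prototype, and that training pulls the pronoun feature toward it with a squared L2 loss. The function names and these design choices are illustrative assumptions, not the released TOIST implementation.

```python
import math


def select_prototype(pronoun_feature, prototypes):
    """Return (index, prototype) of the prototype nearest to the feature.

    Assumption: "selecting" a prototype means nearest neighbour under
    Euclidean distance.
    """
    distances = [math.dist(pronoun_feature, p) for p in prototypes]
    idx = min(range(len(prototypes)), key=distances.__getitem__)
    return idx, prototypes[idx]


def distillation_loss(pronoun_feature, prototype):
    """Squared L2 distance between the pronoun feature and the prototype."""
    return sum((a - b) ** 2 for a, b in zip(pronoun_feature, prototype))


if __name__ == "__main__":
    # Toy 2-D prototypes standing in for clustered noun features.
    prototypes = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
    feature = [0.9, 1.2]
    idx, proto = select_prototype(feature, prototypes)
    print(idx, distillation_loss(feature, proto))
```

Because the prototypes are only needed to compute the training loss in this sketch, inference can run on the pronoun input alone, which is consistent with the abstract's statement that the network remains noun-agnostic at inference time.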

distillation, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2210.10775

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Education (0.46)
Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(4 more...)

