Zero-Shot Visual Slot Filling as Question Answering

Nov-24-2020–arXiv.org Artificial Intelligence

This paper presents a new approach to visual zero-shot slot filling. The approach extends previous work by reformulating the slot filling task as Question Answering. Slot tags are converted to rich natural language questions that capture the semantics of the visual information and lexical text on the GUI screen. These questions are paired with the user's utterance, and slots are extracted from the utterance using a state-of-the-art ALBERT-based Question Answering system trained on the Stanford Question Answering Dataset (SQuAD2). An approach to further refine the model with multi-task training is also presented. The multi-task approach facilitates the incorporation of a large number of successive refinements and transfer learning across similar tasks. A new Visual Slot dataset and a visual extension of the popular ATIS dataset are introduced to support research and experimentation on visual slot filling. Results show F1 scores between 0.52 and 0.60 on the Visual Slot and ATIS datasets with no training data (zero-shot).
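The reformulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `SlotSpec` fields, the question template, and the function names are all hypothetical, chosen only to show how a slot tag plus GUI text might become a question paired with the user's utterance as context.

```python
from dataclasses import dataclass


@dataclass
class SlotSpec:
    """Hypothetical description of one slot on a GUI screen."""
    tag: str     # slot tag, e.g. "to_city"
    label: str   # lexical text shown next to the field (assumed)
    widget: str  # kind of GUI element, e.g. "text box" (assumed)


def slot_to_question(slot: SlotSpec) -> str:
    # Assumed template: the abstract does not give the actual question wording.
    return f"What is the {slot.label.lower()} for the {slot.widget}?"


def build_qa_inputs(utterance: str, slots: list[SlotSpec]) -> list[dict]:
    # One (question, context) pair per slot; the utterance is the context
    # from which an extractive QA model would select the answer span.
    return [
        {"slot": s.tag, "question": slot_to_question(s), "context": utterance}
        for s in slots
    ]


if __name__ == "__main__":
    slots = [
        SlotSpec(tag="from_city", label="Departure city", widget="text box"),
        SlotSpec(tag="to_city", label="Arrival city", widget="text box"),
    ]
    for pair in build_qa_inputs("book a flight from Boston to Denver", slots):
        print(pair)
```

Each resulting pair could then be passed to an extractive Question Answering model. With a model trained on SQuAD2, which includes unanswerable questions, a "no answer" prediction would plausibly indicate that the slot is absent from the utterance.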

Genre:
- Research Report > New Finding (0.49)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Question Answering (1.00)
    - Large Language Model (0.96)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
