AITopics | Question Answering

Collaborating Authors

Question Answering

"Questions are asked and answered every day. Question answering (QA) technology aims to deliver the same facility online. It goes further than the more familiar search based on keywords (as in Google, Yahoo, and other search engines), in attempting to recognize what a question expresses and to respond with an actual answer. This simplifies things for users in two ways. First, questions do not often translate into a simple list of keywords. ...Second, QA takes responsibility for providing answers, rather than a searchable list of links to potentially relevant documents (web pages), highlighted by snippets of text that show how the query matched the documents."
– from Bonnie Webber & Nick Webb. Question Answering. In The Handbook of Computational Linguistics and Natural Language Processing. Alexander Clark, Chris Fox, Shalom Lappin (Eds.). Wiley, 2010.

News Overviews Instructional Materials AI-Alerts Classics

Diverse Visuo-Lingustic Question Answering (DVLQA) Challenge

Sampat, Shailaja, Yang, Yezhou, Baral, Chitta

arXiv.org Artificial IntelligenceMay-1-2020

Existing question answering datasets mostly contain homogeneous contexts, based on either textual or visual information alone. On the other hand, digitalization has evolved the nature of reading which often includes integrating information across multiple heterogeneous sources. To bridge the gap between two, we compile a Diverse Visuo-Lingustic Question Answering (DVLQA) challenge corpus, where the task is to derive joint inference about the given image-text modality in a question answering setting. Each dataset item consists of an image and a reading passage, where questions are designed to combine both visual and textual information, i.e. ignoring either of them would make the question unanswerable. We first explore the combination of best existing deep learning architectures for visual question answering and machine comprehension to solve DVLQA subsets and show that they are unable to reason well on the joint task. We then develop a modular method which demonstrates slightly better baseline performance and offers more transparency for interpretation of intermediate outputs. However, this is still far behind the human performance, therefore we believe DVLQA will be a challenging benchmark for question answering involving reasoning over visuo-linguistic context. The dataset, code and public leaderboard will be made available at https://github.com/shailaja183/DVLQA.

dataset, information, reasoning, (16 more...)

arXiv.org Artificial Intelligence

2005.0033

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Arizona (0.04)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Robust Question Answering Through Sub-part Alignment

Chen, Jifan, Durrett, Greg

arXiv.org Artificial IntelligenceMay-1-2020

Current textual question answering models achieve strong performance on in-domain test sets, but often do so by fitting surface-level patterns in the data, so they fail to generalize to out-of-distribution settings. To make a more robust and understandable QA system, we model question answering as an alignment problem. We decompose both the question and context into smaller units based on off-the-shelf semantic representations (here, semantic roles), and align the question to a subgraph of the context in order to find the answer. We formulate our model as a structured SVM, with alignment scores computed via BERT, and we can train end-to-end despite using beam search for approximate inference. Our explicit use of alignments allows us to explore a set of constraints with which we can prohibit certain types of bad model behavior arising in cross-domain settings. Furthermore, by investigating differences in scores across different potential answers, we can seek to understand what particular aspects of the input lead the model to choose the answer without relying on post-hoc explanation techniques. We train our model on SQuAD v1.1 and test it on several adversarial and out-of-domain datasets. The results show that our model is more robust cross-domain than the standard BERT QA model, and constraints derived from alignment scores allow us to effectively trade off coverage and accuracy.

alignment, constraint, node, (14 more...)

arXiv.org Artificial Intelligence

2004.14648

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report (0.69)

Industry: Leisure & Entertainment > Sports > Football (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

Challenge Closed-book Science Exam: A Meta-learning Based Question Answering System

Zheng, Xinyue, Wang, Peng, Wang, Qigang, Shi, Zhongchao

arXiv.org Artificial IntelligenceApr-26-2020

Prior work in standardized science exams requires support from large text corpus, such as targeted science corpus from Wikipedia or SimpleWikipedia. However, retrieving knowledge from the large corpus is time-consuming and questions embedded in complex semantic representation may interfere with retrieval. Inspired by the dual process theory in cognitive science, we propose a MetaQA framework, where system 1 is an intuitive meta-classifier and system 2 is a reasoning module. Specifically, our method based on meta-learning method and large language model BERT, which can efficiently solve science problems by learning from related example questions without relying on external knowledge bases. We evaluate our method on AI2 Reasoning Challenge (ARC), and the experimental results show that meta-classifier yields considerable classification performance on emerging question types. The information provided by meta-classifier significantly improves the accuracy of reasoning module from 46.6% to 64.2%, which has a competitive advantage over retrieval-based QA methods.

arxiv preprint arxiv, example question, information, (13 more...)

arXiv.org Artificial Intelligence

2004.12303

Country:

North America > United States (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Assessment & Standards > Student Performance (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

Event-QA: A Dataset for Event-Centric Question Answering over Knowledge Graphs

Costa, Tarcísio Souza, Gottschalk, Simon, Demidova, Elena

arXiv.org Artificial IntelligenceApr-24-2020

Semantic Question Answering (QA) is the key technology to facilitate intuitive user access to semantic information stored in knowledge graphs. Whereas most of the existing QA systems and datasets focus on entity-centric questions, very little is known about the performance of these systems in the context of events. As new event-centric knowledge graphs emerge, datasets for such questions gain importance. In this paper we present the Event-QA dataset for answering event-centric questions over knowledge graphs. Event-QA contains 1000 semantic queries and the corresponding English, German and Portuguese verbalisations for EventKG - a recently proposed event-centric knowledge graph with over 970 thousand events.

graph, knowledge graph, query, (17 more...)

arXiv.org Artificial Intelligence

2004.11861

Country:

South America > Uruguay (0.05)
South America > Argentina (0.04)
Europe > United Kingdom (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Communications > Web > Semantic Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

Rapidly Bootstrapping a Question Answering Dataset for COVID-19

Tang, Raphael, Nogueira, Rodrigo, Zhang, Edwin, Gupta, Nikhil, Cam, Phuong, Cho, Kyunghyun, Lin, Jimmy

arXiv.org Artificial IntelligenceApr-23-2020

We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19, built by hand from knowledge gathered from Kaggle's COVID-19 Open Research Dataset Challenge. To our knowledge, this is the first publicly available resource of its type, and intended as a stopgap measure for guiding research until more substantial evaluation resources become available. While this dataset, comprising 124 question-article pairs as of the present version 0.1 release, does not have sufficient examples for supervised machine learning, we believe that it can be helpful for evaluating the zero-shot or transfer capabilities of existing models on topics specifically related to COVID-19. This paper describes our methodology for constructing the dataset and presents the effectiveness of a number of baselines, including term-based techniques and various transformer-based models. The dataset is available at http://covidqa.ai/

dataset, effectiveness, natural language question, (14 more...)

arXiv.org Artificial Intelligence

2004.11339

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(8 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

Add feedback

HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data

Chen, Wenhu, Zha, Hanwen, Chen, Zhiyu, Xiong, Wenhan, Wang, Hong, Wang, William

arXiv.org Artificial IntelligenceApr-15-2020

Existing question answering datasets focus on dealing with homogeneous information, based either only on text or KB/Table information alone. However, as human knowledge is distributed over heterogeneous forms, using homogeneous information might lead to severe coverage problems. To fill in the gap, we present \dataset, a new large-scale question-answering dataset that requires reasoning on heterogeneous information. Each question is aligned with a structured Wikipedia table and multiple free-form corpora linked with the entities in the table. The questions are designed to aggregate both tabular information and text information, i.e. lack of either form would render the question unanswerable. We test with three different models: 1) table-only model. 2) text-only model. 3) a hybrid model \model which combines both table and textual information to build a reasoning path towards the answer. The experimental results show that the first two baselines obtain compromised scores below 20\%, while \model significantly boosts EM score to over 50\%, which proves the necessity to aggregate both structure and unstructured information in \dataset. However, \model's score is still far behind human performance, hence we believe \dataset to an ideal and challenging benchmark to study question answering under heterogeneous information. The dataset and code are available at \url{https://github.com/wenhuchen/HybridQA}.

dataset, information, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2004.07347

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
Asia > Myanmar > Ayeyarwady Region > Pathein (0.04)
Asia > China > Beijing > Beijing (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Sports > Olympic Games (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

Explaining Question Answering Models through Text Generation

Latcinnik, Veronica, Berant, Jonathan

arXiv.org Artificial IntelligenceApr-12-2020

Large pre-trained language models (LMs) have been shown to perform surprisingly well when fine-tuned on tasks that require commonsense and world knowledge. However, in end-to-end architectures, it is difficult to explain what is the knowledge in the LM that allows it to make a correct prediction. In this work, we propose a model for multi-choice question answering, where a LM-based generator generates a textual hypothesis that is later used by a classifier to answer the question. The hypothesis provides a window into the information used by the fine-tuned LM that can be inspected by humans. A key challenge in this setup is how to constrain the model to generate hypotheses that are meaningful to humans. We tackle this by (a) joint training with a simple similarity classifier that encourages meaningful hypotheses, and (b) by adding loss functions that encourage natural text without repetitions. We show on several tasks that our model reaches performance that is comparable to end-to-end architectures, while producing hypotheses that elucidate the knowledge used by the LM for answering the question.

classifier, hypothesis, knowledge, (16 more...)

arXiv.org Artificial Intelligence

2004.05569

Country:

North America > United States > New Hampshire (0.04)
North America > United States > New Mexico (0.04)
North America > United States > Michigan (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.61)

Add feedback

Natural Perturbation for Robust Question Answering

Khashabi, Daniel, Khot, Tushar, Sabharwal, Ashish

arXiv.org Artificial IntelligenceApr-9-2020

While recent models have achieved human-level scores on many NLP datasets, we observe that they are considerably sensitive to small changes in input. As an alternative to the standard approach of addressing this issue by constructing training sets of completely new examples, we propose doing so via minimal perturbation of examples. Specifically, our approach involves first collecting a set of seed examples and then applying human-driven natural perturbations (as opposed to rule-based machine perturbations), which often change the gold label as well. Local perturbations have the advantage of being relatively easier (and hence cheaper) to create than writing out completely new examples. To evaluate the impact of this phenomenon, we consider a recent question-answering dataset (BoolQ) and study the benefit of our approach as a function of the perturbation cost ratio, the relative cost of perturbing an existing question vs. creating a new one from scratch. We find that when natural perturbations are moderately cheaper to create, it is more effective to train models using them: such models exhibit higher robustness and better generalization, while retaining performance on the original BoolQ dataset.

boolq, dataset, perturbation, (15 more...)

arXiv.org Artificial Intelligence

2004.04849

Country:

Oceania > Australia (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.61)

Add feedback

Variational Question-Answer Pair Generation for Machine Reading Comprehension

Shinoda, Kazutoshi, Aizawa, Akiko

arXiv.org Artificial IntelligenceApr-7-2020

We present a deep generative model of question-answer (QA) pairs for machine reading comprehension. We introduce two independent latent random variables into our model in order to diversify answers and questions separately. We also study the effect of explicitly controlling the KL term in the variational lower bound in order to avoid the "posterior collapse" issue, where the model ignores latent variables and generates QA pairs that are almost the same. Our experiments on SQuAD v1.1 showed that variational methods can aid QA pair modeling capacity, and that the controlled KL term can significantly improve diversity while generating high-quality questions and answers comparable to those of the existing systems.

computational linguistic, machine learning, question answering, (18 more...)

arXiv.org Artificial Intelligence

2004.03238

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Hong Kong (0.05)
(18 more...)

Genre: Research Report (0.40)

Industry:

Media > Music (0.94)
Leisure & Entertainment (0.94)
Education > Assessment & Standards > Student Performance (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.94)

Add feedback

Guessing What's Plausible But Remembering What's True: Accurate Neural Reasoning for Question-Answering

Sun, Haitian, Arnold, Andrew O., Bedrax-Weiss, Tania, Pereira, Fernando, Cohen, William W.

arXiv.org Machine LearningApr-7-2020

Neural approaches to natural language processing (NLP) often fail at the logical reasoning needed for deeper language understanding. In particular, neural approaches to reasoning that rely on embedded \emph{generalizations} of a knowledge base (KB) implicitly model which facts that are \emph{plausible}, but may not model which facts are \emph{true}, according to the KB. While generalizing the facts in a KB is useful for KB completion, the inability to distinguish between plausible inferences and logically entailed conclusions can be problematic in settings like as KB question answering (KBQA). We propose here a novel KB embedding scheme that supports generalization, but also allows accurate logical reasoning with a KB. Our approach introduces two new mechanisms for KB reasoning: neural retrieval over a set of embedded triples, and "memorization" of highly specific information with a compact sketch structure. Experimentally, this leads to substantial improvements over the state-of-the-art on two KBQA benchmarks.

opération, representation, sketch, (16 more...)

arXiv.org Machine Learning

2004.03658

Country:

North America > United States (0.14)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Industry:

Media > Film (0.68)
Leisure & Entertainment (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.61)

Add feedback