Goto

Collaborating Authors

 Question Answering


AmbigQA: Answering Ambiguous Open-domain Questions

arXiv.org Artificial Intelligence

Ambiguity is inherent to open-domain question answering; especially when exploring new topics, it can be difficult to ask questions that have a single, unambiguous answer. In this paper, we introduce AmbigQA, a new open-domain question answering task which involves finding every plausible answer, and then rewriting the question for each one to resolve the ambiguity. To study this task, we construct AmbigNQ, a dataset covering 14,042 questions from NQ-open, an existing open-domain QA benchmark. We find that over half of the questions in NQ-open are ambiguous, with diverse sources of ambiguity such as event and entity references. We also present strong baseline models for AmbigQA which we show benefit from weakly supervised learning that incorporates NQ-open, strongly suggesting our new task and data will support significant future research effort. Our data and baselines are available at https://nlp.cs.washington.edu/ambigqa.


What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams

arXiv.org Artificial Intelligence

Open domain question answering (OpenQA) tasks have been recently attracting more and more attention from the natural language processing (NLP) community. In this work, we present the first free-form multiple-choice OpenQA dataset for solving medical problems, MedQA, collected from the professional medical board exams. It covers three languages: English, simplified Chinese, and traditional Chinese, and contains 12,723, 34,251, and 14,123 questions for the three languages, respectively. We implement both rule-based and popular neural methods by sequentially combining a document retriever and a machine comprehension model. Through experiments, we find that even the current best method can only achieve 36.7\%, 42.0\%, and 70.1\% of test accuracy on the English, traditional Chinese, and simplified Chinese questions, respectively. We expect MedQA to present great challenges to existing OpenQA systems and hope that it can serve as a platform to promote much stronger OpenQA models from the NLP community in the future.


ForecastQA: A Question Answering Challenge for Event Forecasting

arXiv.org Machine Learning

Event forecasting is a challenging, yet consequential task, as humans seek to constantly plan for the future. Existing automated forecasting approaches rely mostly on structured data, such as time-series or event-based knowledge graphs, to help predict future events. In this work, we formulate the forecasting problem as a restricted-domain, multiple-choice, question-answering (QA) task that simulates the forecasting scenario. To showcase the usefulness of this task formulation, we introduce a dataset ForecastQA, a question-answering dataset consisting of 10,392 event forecasting questions, which have been collected and verified via crowdsourcing efforts. We also present our experiments on ForecastQA using BERT-based models and find that our best model achieves 61.0\% accuracy on the dataset, which is still far behind human performance by about 18%. We hope ForecastQA will support future research efforts in bridging this gap.


How To Migrate Your Chatbot From IBM Watson Assistant To Rasa

#artificialintelligence

IBM Watson Assistant (WA), at its core has a basic intent and entity structure. Intents are as minimalist as can be. During the intent creation process, there are two features which aid in the defining of intents. Bot of these features translate into better defined intents, and translates nicely into the JSON export file. Hence the leverage these functions lend to the intent creation process is not lost.


IBM's Watson Assistant can now field election questions

#artificialintelligence

Ahead of the U.S. presidential election on November 3, IBM today announced it's working with states to put information into the hands of potential voters. Using the AI and natural language processing capabilities of Watson Assistant, IBM says it's helping field voter queries online and via phone by advising people on polling place locations, voting hours, procedures for requesting mail-in ballots, and deadlines. Research from the Pew Center indicates that nearly half of all U.S. voters expect to have difficulties casting a ballot due to the coronavirus pandemic. In a recent NPR/PBS NewsHour/Marist Poll, 41% of those surveyed said they believed the U.S. is not very prepared or not at all prepared to keep November's election safe and secure. IBM's election-focused Watson Assistant offering taps Watson Discovery to surface information about voting logistics from federal, state, and county websites; local news reports; and government documents.


Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering

arXiv.org Artificial Intelligence

Commonsense question answering (QA) requires background knowledge which is not explicitly stated in a given context. Prior works use commonsense knowledge graphs (KGs) to obtain this knowledge for reasoning. However, relying entirely on these KGs may not suffice, considering their limited coverage and the contextual dependence of their knowledge. In this paper, we augment a general commonsense QA framework with a knowledgeable path generator. By extrapolating over existing paths in a KG with a state-of-the-art language model, our generator learns to connect a pair of entities in text with a dynamic, and potentially novel, multi-hop relational path. Such paths can provide structured evidence for solving commonsense questions without fine-tuning the path generator. Experiments on two datasets show the superiority of our method over previous works which fully rely on knowledge from KGs (with up to 6% improvement in accuracy), across various amounts of training data. Further evaluation suggests that the generated paths are typically interpretable, novel, and relevant to the task.


Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering

arXiv.org Artificial Intelligence

The aim of all Question Answering (QA) systems is to be able to generalize to unseen questions. Current supervised methods are reliant on expensive data annotation. Moreover, such annotations can introduce unintended annotator bias which makes systems focus more on the bias than the actual task. In this work, we propose Knowledge Triplet Learning (KTL), a self-supervised task over knowledge graphs. We propose heuristics to create synthetic graphs for commonsense and scientific knowledge. We propose methods of how to use KTL to perform zero-shot QA and our experiments show considerable improvements over large pre-trained transformer models.


REXUP: I REason, I EXtract, I UPdate with Structured Compositional Reasoning for Visual Question Answering

arXiv.org Artificial Intelligence

Visual Question Answering (VQA) is a challenging multimodal task that requires not only the semantic understanding of images and questions, but also the sound perception of a step-by-step reasoning process that would lead to the correct answer. So far, most successful attempts in VQA have been focused on only one aspect; either the interaction of visual pixel features of images and word features of questions, or the reasoning process of answering the question of an image with simple objects. In this paper, we propose a deep reasoning VQA model (REXUP-REason, EXtract, and UPdate) with explicit visual structureaware textual information, and it works well in capturing step-by-step reasoning process and detecting complex object-relationships in photorealistic images. REXUP consists of two branches, image object-oriented and scene graph-oriented, which jointly works with the super-diagonal fusion compositional attention networks. We evaluate REXUP on the benchmark GQA dataset and conduct extensive ablation studies to explore the reasons behind REXUPs effectiveness. Our best model significantly outperforms the previous state-of-the-art, which delivers 92.7% on the validation set, and 73.1% on the test-dev set.


Receptivity of an AI Cognitive Assistant by the Radiology Community: A Report on Data Collected at RSNA

arXiv.org Artificial Intelligence

Due to advances in machine learning and artificial intelligence (AI), a new role is emerging for machines as intelligent assistants to radiologists in their clinical workflows. But what systematic clinical thought processes are these machines using? Are they similar enough to those of radiologists to be trusted as assistants? A live demonstration of such a technology was conducted at the 2016 Scientific Assembly and Annual Meeting of the Radiological Society of North America (RSNA). The demonstration was presented in the form of a question-answering system that took a radiology multiple choice question and a medical image as inputs. The AI system then demonstrated a cognitive workflow, involving text analysis, image analysis, and reasoning, to process the question and generate the most probable answer. A post demonstration survey was made available to the participants who experienced the demo and tested the question answering system. Of the reported 54,037 meeting registrants, 2,927 visited the demonstration booth, 1,991 experienced the demo, and 1,025 completed a post-demonstration survey. In this paper, the methodology of the survey is shown and a summary of its results are presented. The results of the survey show a very high level of receptiveness to cognitive computing technology and artificial intelligence among radiologists.


QED: A Framework and Dataset for Explanations in Question Answering

arXiv.org Artificial Intelligence

A question answering system that in addition to providing an answer provides an explanation of the reasoning that leads to that answer has potential advantages in terms of debuggability, extensibility and trust. To this end, we propose QED, a linguistically informed, extensible framework for explanations in question answering. A QED explanation specifies the relationship between a question and answer according to formal semantic notions such as referential equality, sentencehood, and entailment. We describe and publicly release an expert-annotated dataset of QED explanations built upon a subset of the Google Natural Questions dataset, and report baseline models on two tasks -- post-hoc explanation generation given an answer, and joint question answering and explanation generation. In the joint setting, a promising result suggests that training on a relatively small amount of QED data can improve question answering. In addition to describing the formal, language-theoretic motivations for the QED approach, we describe a large user study showing that the presence of QED explanations significantly improves the ability of untrained raters to spot errors made by a strong neural QA baseline.