BiDAF


TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack

Cao, Yu, Li, Dianqi, Fang, Meng, Zhou, Tianyi, Gao, Jun, Zhan, Yibing, Tao, Dacheng

arXiv.org Artificial Intelligence

We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models that produces fluent and grammatical adversarial contexts while maintaining gold answers. Despite phenomenal progress on general adversarial attacks, few works have investigated the vulnerability of, and attacks specific to, QA models. In this work, we first explore the biases in existing models and discover that they mainly rely on keyword matching between the question and context, and ignore the relevant contextual relations for answer prediction. Based on these two biases, TASA attacks the target model in two ways: (1) lowering the model's confidence in the gold answer with a perturbed answer sentence; (2) misguiding the model towards a wrong answer with a distracting answer sentence. Equipped with designed beam search and filtering methods, TASA generates more effective attacks than existing textual attack methods while sustaining the quality of contexts, as shown in extensive experiments on five QA datasets and in human evaluations.


Google Trains Reinforcement Learning Agents to Ask the Right Questions

#artificialintelligence

That paradigm assumes that the target knowledge is already embedded in the dataset and doesn't require any further clarification, but that rarely resembles how humans learn. When presented with a new subject, we are constantly forced to ask questions and seek clarifications about it.


Training Reinforcement Learning Agents to Ask the Right Questions

#artificialintelligence

That paradigm assumes that the target knowledge is already embedded in the dataset and doesn't require any further clarification, but that rarely resembles how humans learn. When presented with a new subject, we are constantly forced to ask questions and seek clarifications about it. What if we could build the same skill into artificial intelligence (AI) models? The ability to formulate questions is a fundamental element of the human cognition process. The cornerstone of human dialogue is our ability to express questions in a myriad of ways in order to obtain a specific answer.


Modeling and Output Layers in BiDAF -- an Illustrated Guide with Minions

#artificialintelligence

The output of the aforementioned attention step is a giant matrix called G. G is an 8d-by-T matrix that encodes the query-aware representations of the context words. G is the input to the modeling layer, which will be the focus of this article. Ok, so I know we've been through a lot of steps in the past three articles. It is extremely easy to get lost in the myriad of symbols and equations, especially considering that the choices of symbols in the BiDAF paper aren't that "user friendly." I mean, do you still remember what each of H, U, Ĥ and Ũ represents?
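For readers keeping track of shapes: in the BiDAF paper, G is built by concatenating the contextual embedding H with the attended vectors and their element-wise products, which is where the 8d rows come from. A minimal shape sketch (d = 100 and T = 30 are illustrative values, and random arrays stand in for the real encodings):

```python
import numpy as np

d, T = 100, 30  # hidden size per direction and context length (illustrative)

# Stand-ins for the outputs of the contextual embedding / attention layers:
H = np.random.randn(2 * d, T)        # context encoding
U_tilde = np.random.randn(2 * d, T)  # attended query vectors (Ũ)
H_tilde = np.random.randn(2 * d, T)  # attended context vectors

# G stacks H, Ũ, and their element-wise products row-wise: 4 × 2d = 8d rows
G = np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=0)
print(G.shape)  # (800, 30), i.e. 8d-by-T
```

Each column of G is one context word's query-aware representation, which the modeling layer then reads.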


Ensemble approach for natural language question answering problem

Aniol, Anna, Pietron, Marcin

arXiv.org Artificial Intelligence

Machine comprehension, answering a question based on a given context paragraph, is a typical Natural Language Understanding task. It requires modeling the complex dependencies between the question and the context paragraph. There are many neural network models attempting to solve the problem of question answering. The best models have been selected, studied and compared with each other. All the selected models are based on the neural attention mechanism. Additionally, studies on the SQuAD dataset were performed. Subsets of queries were extracted, and each model was analyzed for how it deals with each specific group of queries. Based on this, an ensemble of the three models was created and tested on the SQuAD dataset. It outperforms the best single model, Mnemonic Reader.
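The abstract does not spell out how the ensemble combines its members; one common scheme is a simple majority vote over the answer strings the individual models predict. A minimal, hypothetical sketch (the function name and tie-breaking rule are my own, not the paper's):

```python
from collections import Counter

def ensemble_answer(predictions):
    """Majority vote over answer strings predicted by several QA models.

    Ties are broken in favor of whichever answer appears first in
    `predictions` (Counter.most_common preserves insertion order
    for elements with equal counts).
    """
    return Counter(predictions).most_common(1)[0][0]

preds = ["Denver Broncos", "Denver Broncos", "Carolina Panthers"]
print(ensemble_answer(preds))  # Denver Broncos
```

Real QA ensembles often weight votes by each model's answer confidence instead of counting them equally, but the voting skeleton is the same.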


RefNet: A Reference-aware Network for Background Based Conversation

Meng, Chuan, Ren, Pengjie, Chen, Zhumin, Monz, Christof, Ma, Jun, de Rijke, Maarten

arXiv.org Artificial Intelligence

Existing conversational systems tend to generate generic responses. Recently, Background Based Conversations (BBCs) have been introduced to address this issue. Here, the generated responses are grounded in some background information. While the proposed methods for BBCs are able to generate more informative responses, they either cannot generate natural responses or have difficulty locating the right background information. In this paper, we propose a Reference-aware Network (RefNet) to address both issues. Unlike existing methods that generate responses token by token, RefNet incorporates a novel reference decoder that provides an alternative way to learn to directly cite a semantic unit (e.g., a span containing complete semantic information) from the background. Experimental results show that RefNet significantly outperforms state-of-the-art methods in terms of both automatic and human evaluations, indicating that RefNet can generate more appropriate and human-like responses.


Improving Background Based Conversation with Context-aware Knowledge Pre-selection

Zhang, Yangjun, Ren, Pengjie, de Rijke, Maarten

arXiv.org Artificial Intelligence

Background Based Conversations (BBCs) have been developed to make dialogue systems generate more informative and natural responses by leveraging background knowledge. Existing methods for BBCs can be grouped into two categories: extraction-based methods and generation-based methods. The former extract spans from background material as responses that are not necessarily natural. The latter generate responses that are natural but not necessarily effective in leveraging background knowledge. In this paper, we focus on generation-based methods and propose a model, namely Context-aware Knowledge Pre-selection (CaKe), which introduces a pre-selection process that uses dynamic bi-directional attention to improve knowledge selection by using the utterance history context as prior information to select the most relevant background material. Experimental results show that our model is superior to current state-of-the-art baselines, indicating that it benefits from the pre-selection process, thus improving informativeness and fluency.


Understanding Dataset Design Choices for Multi-hop Reasoning

Chen, Jifan, Durrett, Greg

arXiv.org Artificial Intelligence

Learning multi-hop reasoning has been a key challenge for reading comprehension models, leading to the design of datasets that explicitly focus on it. Ideally, a model should not be able to perform well on a multi-hop question answering task without doing multi-hop reasoning. In this paper, we investigate two recently proposed datasets, WikiHop and HotpotQA. First, we explore sentence-factored models for these tasks; by design, these models cannot do multi-hop reasoning, but they are still able to solve a large number of examples in both datasets. Furthermore, we find spurious correlations in the unmasked version of WikiHop, which make it easy to achieve high performance considering only the questions and answers. Finally, we investigate one key difference between these datasets, namely span-based vs. multiple-choice formulations of the QA task. Multiple-choice versions of both datasets can be easily gamed, and two models we examine only marginally exceed a baseline in this setting. Overall, while these datasets are useful testbeds, high-performing models may not be learning as much multi-hop reasoning as previously thought.


QuAC : Question Answering in Context

Choi, Eunsol, He, He, Iyyer, Mohit, Yatskar, Mark, Yih, Wen-tau, Choi, Yejin, Liang, Percy, Zettlemoyer, Luke

arXiv.org Artificial Intelligence

We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total). The dialogs involve two crowd workers: (1) a student who poses a sequence of freeform questions to learn as much as possible about a hidden Wikipedia text, and (2) a teacher who answers the questions by providing short excerpts from the text. QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as we show in a detailed qualitative evaluation. We also report results for a number of reference models, including a recent state-of-the-art reading comprehension architecture extended to model dialog context. Our best model underperforms humans by 20 F1, suggesting that there is significant room for future work on this data. Dataset, baseline, and leaderboard are available at http://quac.ai.
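The 20 F1 gap refers to the token-overlap F1 commonly used for extractive QA. A minimal sketch of that metric, omitting details such as answer normalization and the handling of multiple gold references in the official evaluation:

```python
from collections import Counter

def token_f1(prediction, gold):
    """Token-overlap F1 between a predicted and a gold answer string."""
    pred_toks = prediction.lower().split()
    gold_toks = gold.lower().split()
    # Count tokens shared between prediction and gold (with multiplicity)
    num_same = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(token_f1("the Eiffel Tower", "Eiffel Tower"))  # 0.8
```

A 20-point gap on this scale means the model's answers overlap the human-chosen excerpts noticeably less than other humans' answers do.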


What Happened? Leveraging VerbNet to Predict the Effects of Actions in Procedural Text

Clark, Peter, Dalvi, Bhavana, Tandon, Niket

arXiv.org Artificial Intelligence

Our goal is to answer questions about paragraphs describing processes (e.g., photosynthesis). Texts of this genre are challenging because the effects of actions are often implicit (unstated), requiring background knowledge and inference to reason about the changing world states. To supply this knowledge, we leverage VerbNet to build a rulebase (called the Semantic Lexicon) of the preconditions and effects of actions, and use it along with commonsense knowledge of persistence to answer questions about change. Our evaluation shows that our system, ProComp, significantly outperforms two strong reading comprehension (RC) baselines. Our contributions are two-fold: the Semantic Lexicon rulebase itself, and a demonstration of how a simulation-based approach to machine reading can outperform RC methods that rely on surface cues alone. Since this work was performed, we have developed neural systems that outperform ProComp, described elsewhere (Dalvi et al., NAACL'18). However, the Semantic Lexicon remains a novel and potentially useful resource, and its integration with neural systems remains a currently unexplored opportunity for further improvements in machine reading about processes.