Chang, Maria
Answering Science Exam Questions Using Query Rewriting with Background Knowledge
Musa, Ryan, Wang, Xiaoyan, Fokoue, Achille, Mattei, Nicholas, Chang, Maria, Kapanipathi, Pavan, Makni, Bassem, Talamadupula, Kartik, Witbrock, Michael
Open-domain question answering (QA) is an important problem in AI and NLP that is emerging as a bellwether for progress on the generalizability of AI methods and techniques. Much of the progress in open-domain QA systems has been realized through advances in information retrieval methods and corpus construction. In this paper, we focus on the recently introduced ARC Challenge dataset, which contains 2,590 multiple-choice questions authored for grade-school science exams. These questions were selected to be the most challenging for current QA systems, and current state-of-the-art performance is only slightly better than random chance. We present a system that rewrites a given question into queries that are used to retrieve supporting text from a large corpus of science-related text. Our rewriter is able to incorporate background knowledge from ConceptNet and, in tandem with a generic textual entailment system trained on SciTail that identifies support in the retrieved results, outperforms several strong baselines on the end-to-end QA task despite only being trained to identify essential terms in the original source question. We use a generalizable decision methodology over the retrieved evidence and answer candidates to select the best answer. By combining query rewriting, background knowledge, and textual entailment, our system outperforms several strong baselines on the ARC dataset.
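As a rough illustration only (not code from the paper), the sketch below shows the pipeline the abstract describes: rewrite the question around its essential terms, expand the queries with ConceptNet-derived terms, retrieve supporting sentences, and score each answer option with an entailment model. All helper callables (select_essential_terms, expand_with_conceptnet, retrieve, entails) are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple

def answer_question(
    question: str,
    options: Dict[str, str],
    select_essential_terms: Callable[[str], List[str]],      # hypothetical essential-term selector
    expand_with_conceptnet: Callable[[List[str]], List[str]],  # hypothetical ConceptNet expansion
    retrieve: Callable[[str], List[str]],                     # hypothetical IR over a science corpus
    entails: Callable[[str, str], float],                     # hypothetical entailment scorer (e.g. SciTail-trained)
    top_k: int = 10,
) -> Tuple[str, Dict[str, float]]:
    """Score each answer option by the strongest entailment support
    found in text retrieved for the rewritten queries."""
    # Rewrite the question: keep essential terms, then add background-knowledge terms.
    terms = select_essential_terms(question)
    queries = [" ".join(terms)]
    queries += [" ".join(terms + [extra]) for extra in expand_with_conceptnet(terms)]

    scores: Dict[str, float] = {}
    for label, option in options.items():
        hypothesis = f"{question} {option}"
        best_support = 0.0
        for query in queries:
            for sentence in retrieve(f"{query} {option}")[:top_k]:
                # Premise = retrieved sentence, hypothesis = question + candidate answer.
                best_support = max(best_support, entails(sentence, hypothesis))
        scores[label] = best_support

    # Simple decision rule: pick the option with the strongest support.
    return max(scores, key=scores.get), scores
```

In practice, retrieve could wrap a search index over the ARC corpus and entails a trained entailment model; the max-over-sentences rule here is only a stand-in for the paper's more general decision methodology over retrieved evidence.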
A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset
Boratko, Michael, Padigela, Harshit, Mikkilineni, Divyendra, Yuvraj, Pritish, Das, Rajarshi, McCallum, Andrew, Chang, Maria, Fokoue-Nkoutche, Achille, Kapanipathi, Pavan, Mattei, Nicholas, Musa, Ryan, Talamadupula, Kartik, Witbrock, Michael
The recent work of Clark et al. introduces the AI2 Reasoning Challenge (ARC) and the associated ARC dataset, which partitions open-domain, complex science questions into an Easy Set and a Challenge Set. That paper includes an analysis of 100 questions with respect to the types of knowledge and reasoning required to answer them; however, it does not include clear definitions of these types, nor does it offer information about the quality of the labels. We propose a comprehensive set of definitions of the knowledge and reasoning types necessary for answering the questions in the ARC dataset. Using ten annotators and a sophisticated annotation interface, we analyze the distribution of labels across the Challenge Set and report statistics related to them. Additionally, we demonstrate that although naive information retrieval methods often return sentences that are irrelevant to answering the query, sufficient supporting text is frequently present in the ARC corpus. Evaluating with human-selected relevant sentences improves the performance of a neural machine comprehension model by 42 points.
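A minimal sketch of the evaluation setup implied by the last sentence, assuming a hypothetical answer_with_context model interface and illustrative field names (ir_sentences, human_sentences) for each annotated question; it simply compares the same model's accuracy under naive IR context versus human-selected context.

```python
from typing import Callable, Dict, List

def compare_context_sources(
    questions: List[dict],  # each: question, options, answer, ir_sentences, human_sentences (illustrative keys)
    answer_with_context: Callable[[str, Dict[str, str], List[str]], str],  # hypothetical comprehension model
) -> Dict[str, float]:
    """Accuracy of one reading-comprehension model under two context conditions:
    naively retrieved sentences vs. human-selected supporting sentences."""
    correct = {"ir": 0, "human": 0}
    for q in questions:
        for condition, key in (("ir", "ir_sentences"), ("human", "human_sentences")):
            prediction = answer_with_context(q["question"], q["options"], q[key])
            if prediction == q["answer"]:
                correct[condition] += 1
    n = max(len(questions), 1)
    return {condition: hits / n for condition, hits in correct.items()}
```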