Question Answering
Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases
Gu, Yu, Kase, Sue, Vanni, Michelle, Sadler, Brian, Liang, Percy, Yan, Xifeng, Su, Yu
Existing studies on question answering on knowledge bases (KBQA) mainly operate with the standard i.i.d. assumption, i.e., the training distribution over questions is the same as the test distribution. However, i.i.d. may be neither reasonably achievable nor desirable on large-scale KBs because 1) the true user distribution is hard to capture and 2) randomly sampling training examples from the enormous space would be highly data-inefficient. Instead, we suggest that KBQA models should have three levels of built-in generalization: i.i.d., compositional, and zero-shot. To facilitate the development of KBQA models with stronger generalization, we construct and release a new large-scale, high-quality dataset with 64,331 questions, GrailQA, and provide evaluation settings for all three levels of generalization. In addition, we propose a novel BERT-based KBQA model. The combination of our dataset and model enables us to thoroughly examine and demonstrate, for the first time, the key role of pre-trained contextual embeddings like BERT in the generalization of KBQA.
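The abstract names the three levels without defining them. As a rough illustration only, the sketch below assumes one common reading: a test question is zero-shot if it uses a schema item (relation or class) never seen in training, compositional if every item was seen but never in this combination, and i.i.d. otherwise. Representing a query as a flat set of schema items is a simplification made for this example, and the function and variable names are hypothetical.

```python
from typing import FrozenSet, Set


def generalization_level(
    test_items: FrozenSet[str],
    train_item_vocab: Set[str],
    train_compositions: Set[FrozenSet[str]],
) -> str:
    """Classify a test query by how far it departs from the training data.

    Assumption: a query is approximated by the set of schema items it uses.
    """
    # Any schema item that never appeared in training makes the query zero-shot.
    if not test_items <= train_item_vocab:
        return "zero-shot"
    # All items are known and this exact combination was seen in training.
    if test_items in train_compositions:
        return "i.i.d."
    # All items are known, but the combination is new.
    return "compositional"


# Toy training data (hypothetical schema item names).
train_compositions = {
    frozenset({"film.director", "film.film"}),
    frozenset({"music.artist", "music.album"}),
}
train_item_vocab = set().union(*train_compositions)

seen = generalization_level(
    frozenset({"film.director", "film.film"}), train_item_vocab, train_compositions
)
recombined = generalization_level(
    frozenset({"film.director", "music.album"}), train_item_vocab, train_compositions
)
unseen = generalization_level(
    frozenset({"sports.team", "film.film"}), train_item_vocab, train_compositions
)
```

Under these assumptions, `seen` is "i.i.d.", `recombined` is "compositional", and `unseen` is "zero-shot".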
IBM creates knowledgeable NLP system and adds AI governance capabilities to Watson
IBM has unveiled a slew of announcements designed to help businesses scale their use of AI. The company also announced the rollout of new capabilities for its Watson platform. IBM researchers have built a hybrid question-answering system called Neuro-Symbolic-QA (NSQA) that for the first time uses neurosymbolic AI to allow an AI system to offer "and"/"or" to its recommendations. This will ultimately position the system to perform better in real-world situations, IBM said. This enhanced reasoning capability comes as a result of an entirely new foundational AI method created by IBM researchers called Logical Neural Networks (LNN), IBM said. LNNs are a modification of today's neural networks so that they become equivalent to a set of logic statements, but they also retain the original learning capability of a neural network, the company explained in a blog post. QA is designed to meet the significant challenges in language-based AI, in particular the fact that the training of NLP ...
IBM announces new AI language, explainability, and automation services
During IBM's virtual AI Summit this week, the company announced updates across its Watson family of products in the areas of language, explainability, and workplace automation. A new feature called Reading Comprehension surfaces answers from databases of enterprise documents in response to natural language questions, assigning a confidence score to each response. A novel module in Watson Assistant called FAQ Extraction automatically generates question-and-answer documents. And AI Factsheets automatically captures key facts on a machine learning model's performance and generates reports to "foster transparency and ensure compliance." According to IBM, Reading Comprehension, which was built atop a top-performing question-answering system from IBM Research, is intended to help identify more precise answers in response to queries referring to business documents.
Getting AI to Reason: Using Neuro-Symbolic AI for Knowledge-Based Question Answering
Language is what makes us human. Asking questions is how we learn. Building on the foundations of deep learning and symbolic AI, we have developed technology that can answer complex questions with minimal domain-specific training. Initial results are very encouraging: the system outperforms current state-of-the-art techniques on two prominent datasets with no need for specialized end-to-end training. As this technology matures, it will be possible to use it for better customer support, business intelligence, medical informatics, advanced discovery, and much more.
Airbus AI Introduces Natural Language QA System for Flight Crews
Airbus AI researchers have developed a system that uses natural language understanding to improve question answering (QA) performance when flight crews search for aircraft operating information. The aerospace industry relies on technical documents such as Aircraft Operating Manuals (AOM), Aircraft Operating Instructions and particularly Flight Crew Operating Manuals (FCOM) to guide flight crews on aircraft operations under normal, abnormal, and emergency conditions. FCOMs are issued by aircraft manufacturers and cover system descriptions, procedures, techniques, and performance data. They are the references used to develop standard operating procedures to improve safety and efficiency. Most government aviation administrations have authorized the use of tablet computers by commercial carrier pilots and flight crews to access FCOM information. The Airbus AI researchers note, however, that existing electronic flight bag (EFB) systems used for this purpose are in practice little more than PDF viewers with keyword search functionality.
Question Answering over Knowledge Bases by Leveraging Semantic Parsing and Neuro-Symbolic Reasoning
Kapanipathi, Pavan, Abdelaziz, Ibrahim, Ravishankar, Srinivas, Roukos, Salim, Gray, Alexander, Astudillo, Ramon, Chang, Maria, Cornelio, Cristina, Dana, Saswati, Fokoue, Achille, Garg, Dinesh, Gliozzo, Alfio, Gurajada, Sairam, Karanam, Hima, Khan, Naweed, Khandelwal, Dinesh, Lee, Young-Suk, Li, Yunyao, Luus, Francois, Makondo, Ndivhuwo, Mihindukulasooriya, Nandana, Naseem, Tahira, Neelam, Sumit, Popa, Lucian, Reddy, Revanth, Riegel, Ryan, Rossiello, Gaetano, Sharma, Udit, Bhargav, G P Shrivatsa, Yu, Mo
Knowledge base question answering (KBQA) is an important task in Natural Language Processing. Existing approaches face significant challenges including complex question understanding, necessity for reasoning, and lack of large training datasets. In this work, we propose a semantic parsing and reasoning-based Neuro-Symbolic Question Answering (NSQA) system that leverages (1) Abstract Meaning Representation (AMR) parses for task-independent question understanding; (2) a novel path-based approach to transform AMR parses into candidate logical queries that are aligned to the KB; (3) a neuro-symbolic reasoner called Logical Neural Network (LNN) that executes logical queries and reasons over KB facts to provide an answer; (4) a system of systems approach, which integrates multiple, reusable modules that are trained specifically for their individual tasks (e.g., semantic parsing, entity linking, and relationship linking) and do not require end-to-end training data. NSQA achieves state-of-the-art performance on QALD-9 and LC-QuAD 1.0. NSQA's novelty lies in its modular neuro-symbolic architecture and its task-general approach to interpreting natural language questions.
End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training
Reddy, Revanth Gangi, Iyer, Bhavani, Sultan, Md Arafat, Zhang, Rong, Sil, Avi, Castelli, Vittorio, Florian, Radu, Roukos, Salim
End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages. Recent work has successfully trained neural IR systems using only supervised question answering (QA) examples from open-domain datasets. However, despite impressive performance on Wikipedia, neural IR lags behind traditional term matching approaches such as BM25 in more specific and specialized target domains such as COVID-19. Furthermore, given little or no labeled data, effective adaptation of QA systems can also be challenging in such target domains. In this work, we explore the application of synthetically generated QA examples to improve performance on closed-domain retrieval and MRC. We combine our neural IR and MRC systems and show significant improvements in end-to-end QA on the CORD-19 collection over a state-of-the-art open-domain QA baseline.
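For readers unfamiliar with the term-matching baseline mentioned above, the following is a minimal sketch of BM25 scoring followed by top-k retrieval, which is the first stage of a retrieve-then-read pipeline. It is not the authors' system: the tokenized toy documents, the parameter defaults (k1 = 1.5, b = 0.75), and the function names are assumptions chosen for illustration, and the IDF uses the common "+1 inside the log" variant so that scores stay non-negative.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str], docs: List[List[str]], k1: float = 1.5, b: float = 0.75
) -> List[float]:
    """Score every tokenized document against a tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            )
            score += idf * norm
        scores.append(score)
    return scores


def retrieve(query: List[str], docs: List[List[str]], k: int = 2) -> List[int]:
    """Return the indices of the k highest-scoring documents."""
    scores = bm25_scores(query, docs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]


# Toy collection (hypothetical, already tokenized and lowercased).
docs = [
    ["covid", "transmission", "aerosol"],
    ["stock", "market", "news"],
    ["covid", "vaccine", "trial"],
]
top = retrieve(["covid", "transmission"], docs, k=2)
```

In an end-to-end system, the passages selected by `retrieve` would then be handed to a reading comprehension model that extracts the answer span; that second stage is omitted here.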
Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction
Gao, Yifan, Zhu, Henghui, Ng, Patrick, Santos, Cicero Nogueira dos, Wang, Zhiguo, Nan, Feng, Zhang, Dejiao, Nallapati, Ramesh, Arnold, Andrew O., Xiang, Bing
In open-domain question answering, questions are highly likely to be ambiguous because users may not know the scope of relevant topics when formulating them. Therefore, a system needs to find every possible interpretation of the question, and propose a set of disambiguated question-answer pairs. In this paper, we present a model that aggregates and combines evidence from multiple passages to generate question-answer pairs. Particularly, our model reads a large number of passages to find as many interpretations as possible. In addition, we propose a novel round-trip prediction approach to generate additional interpretations that our model fails to find in the first pass, and then verify and filter out the incorrect question-answer pairs to arrive at the final disambiguated output. On the recently introduced AmbigQA open-domain question answering dataset, our model, named Refuel, achieves a new state-of-the-art, outperforming the previous best model by a large margin. We also conduct comprehensive analyses to validate the effectiveness of our proposed round-trip prediction.
AGenT Zero: Zero-shot Automatic Multiple-Choice Question Generation for Skill Assessments
Li, Eric, Su, Jingyi, Sheng, Hao, Wai, Lawrence
Multiple-choice questions (MCQs) offer the most promising avenue for skill evaluation in the era of virtual education and job recruiting, where traditional performance-based alternatives such as projects and essays have become less viable, and grading resources are constrained. The automated generation of MCQs would allow assessment creation at scale. Recent advances in natural language processing have given rise to many complex question generation methods. However, the few methods that produce deployable results in specific domains require a large amount of domain-specific training data that can be very costly to acquire. Our work provides an initial foray into MCQ generation under high data-acquisition cost scenarios by strategically emphasizing paraphrasing the question context (compared to the task). In addition to maintaining semantic similarity between the question-answer pairs, our pipeline, which we call AGenT Zero, consists of only pre-trained models and requires no fine-tuning, minimizing data acquisition costs for question generation. AGenT Zero successfully outperforms other pre-trained methods in fluency and semantic similarity. Additionally, with some small changes, our assessment pipeline can be generalized to a broader question and answer space, including short answer or fill in the blank questions.
Zero-Shot Visual Slot Filling as Question Answering
This paper presents a new approach to visual zero-shot slot filling. The approach extends previous approaches by reformulating the slot filling task as Question Answering. Slot tags are converted to rich natural language questions that capture the semantics of visual information and lexical text on the GUI screen. These questions are paired with the user's utterance, and slots are extracted from the utterance using a state-of-the-art ALBERT-based Question Answering system trained on the Stanford Question Answering Dataset (SQuAD2). An approach to further refine the model with multi-task training is presented. The multi-task approach facilitates the incorporation of a large number of successive refinements and transfer learning across similar tasks. A new Visual Slot dataset and a visual extension of the popular ATIS dataset are introduced to support research and experimentation on visual slot filling. Results show F1 scores between 0.52 and 0.60 on the Visual Slot and ATIS datasets with no training data (zero-shot).
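The reformulation described above can be sketched roughly as follows. This is only an illustration of the general idea, not the paper's templates or model: the question template, the `slot_to_question` and `fill_slots` helpers, and the `QAModel` callable type are all hypothetical. Any extractive QA system that maps a (question, context) pair to an answer span, or to nothing when the question is unanswerable, could be plugged in.

```python
from typing import Callable, Dict, Optional

# Assumed interface: (question, context) -> answer span, or None if unanswerable.
QAModel = Callable[[str, str], Optional[str]]


def slot_to_question(slot_tag: str, screen_label: str = "") -> str:
    """Turn a slot tag, plus optional on-screen label text, into a question."""
    words = slot_tag.replace("_", " ").replace(".", " ").strip()
    if screen_label:
        return f'What is the {words} for the field labeled "{screen_label}"?'
    return f"What is the {words}?"


def fill_slots(
    utterance: str, slots: Dict[str, str], qa_model: QAModel
) -> Dict[str, str]:
    """Ask one question per slot, using the user's utterance as the context."""
    filled = {}
    for tag, label in slots.items():
        answer = qa_model(slot_to_question(tag, label), utterance)
        if answer:  # Unanswerable questions leave the slot empty.
            filled[tag] = answer
    return filled
```

Because no slot-specific training is involved, adding a new slot only requires a new tag and label, which is what makes the setup zero-shot.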