"Questions are asked and answered every day. Question answering (QA) technology aims to deliver the same facility online. It goes further than the more familiar search based on keywords (as in Google, Yahoo, and other search engines), in attempting to recognize what a question expresses and to respond with an actual answer. This simplifies things for users in two ways. First, questions do not often translate into a simple list of keywords. ...Second, QA takes responsibility for providing answers, rather than a searchable list of links to potentially relevant documents (web pages), highlighted by snippets of text that show how the query matched the documents."
– from Bonnie Webber & Nick Webb. Question Answering. In The Handbook of Computational Linguistics and Natural Language Processing. Alexander Clark, Chris Fox, Shalom Lappin (Eds.). Wiley, 2010.
Viewing scenes and making sense of them is something people do effortlessly every day. Whether it's sussing out objects' colors or gauging their distances apart, it doesn't take much conscious effort to recognize items' attributes and apply knowledge to answer questions about them. That's patently untrue of most AI systems, which tend to reason rather poorly. But emerging techniques in visual recognition, language understanding, and symbolic program execution promise to imbue them with the ability to generalize to new examples, much like humans. Scientists at the MIT-IBM Watson AI Lab, a joint 10-year, $240 million partnership to propel scientific breakthroughs in machine learning, are perfecting an approach they say might overcome longstanding barriers in AI model design.
Researchers from the University of Maryland have figured out how to reliably create such questions through a human-computer collaboration, developing a dataset of more than 1,200 questions that, while easy for people to answer, stump the best computer question-answering systems today. A system that learns to master these questions will have a better understanding of language than any system currently in existence. The work is described in an article published in 2019 in the journal Transactions of the Association for Computational Linguistics. "Most question-answering computer systems don't explain why they answer the way they do, but our work helps us see what computers actually understand," said Jordan Boyd-Graber, associate professor of computer science at UMD and senior author of the paper. "In addition, we have produced a dataset to test on computers that will reveal if a computer language system is actually reading and doing the same sorts of processing that humans are able to do."
The BERT model has been successfully applied to open-domain QA tasks. However, previous work trains BERT by viewing passages corresponding to the same question as independent training instances, which may yield incomparable answer scores across different passages. To tackle this issue, we propose a multi-passage BERT model that globally normalizes answer scores across all passages of the same question; this change enables our QA model to find better answers by utilizing more passages. In addition, we find that splitting articles into passages of 100 words with a sliding window improves performance by 4%. By leveraging a passage ranker to select high-quality passages, multi-passage BERT gains an additional 2%. Experiments on four standard benchmarks show that our multi-passage BERT outperforms all state-of-the-art models on every benchmark.
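The two mechanics in the abstract above — sliding-window passage splitting and global normalization of answer scores across passages — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the 50-word stride, and the use of raw span logits are assumptions.

```python
import numpy as np

def sliding_passages(words, length=100, stride=50):
    """Split an article (a list of words) into overlapping passages of
    `length` words via a sliding window. The 100-word length follows the
    abstract; the 50-word stride is an assumed value."""
    if len(words) <= length:
        return [words]
    return [words[i:i + length] for i in range(0, len(words) - stride, stride)]

def global_softmax(passage_logits):
    """Normalize candidate-answer logits jointly across ALL passages of one
    question, rather than within each passage independently, so the
    resulting probabilities are directly comparable between passages."""
    flat = np.concatenate(passage_logits)
    exp = np.exp(flat - flat.max())  # max-shift for numerical stability
    probs = exp / exp.sum()
    # split the flat distribution back into one array per passage
    cuts = np.cumsum([len(p) for p in passage_logits])[:-1]
    return np.split(probs, cuts)
```

The point of the global variant: with per-passage normalization, every passage's scores sum to 1 on their own, so a confidently wrong span in an irrelevant passage can outscore the true answer; with a single joint distribution, all candidate spans for the question compete directly.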
Embodied Question Answering (EQA) is a recently proposed task in which an agent is placed in a rich 3D environment and must act based solely on its egocentric input to answer a given question. The desired outcome is that the agent learns to combine capabilities such as scene understanding, navigation, and language understanding in order to perform complex reasoning in the visual world. However, initial advancements combining standard vision-and-language methods with imitation and reinforcement learning algorithms have shown that EQA might be too complex and challenging for these techniques. To investigate the feasibility of EQA-type tasks, we build the VideoNavQA dataset, which contains pairs of questions and videos generated in the House3D environment. The goal of this dataset is to assess question-answering performance from near-ideal navigation paths, while considering a much more complete variety of questions than current instantiations of the EQA task. We investigate several models, adapted from popular VQA methods, on this new benchmark, establishing an initial understanding of how well VQA-style methods can perform within this novel EQA paradigm.
To help advance question answering (QA) and create smarter assistants, Facebook AI is sharing the first large-scale data set, code, and baseline models for long-form QA, which requires machines to provide long, complex answers -- something that existing algorithms have not been challenged to do before. Current systems are focused on trivia-type questions, like whether jellyfish have a brain. Our data set goes further by requiring machines to elaborate with in-depth answers to open-ended questions, such as "How do jellyfish function without a brain?" Furthermore, our data set provides researchers with hundreds of thousands of examples to advance AI models that can synthesize information from multiple sources and provide explanations to complex questions across a wide range of topics. For truly intelligent assistants that can help us with myriad daily tasks, AI should be able to answer a wide variety of questions from people beyond straightforward, factual queries such as "Which artist sings this song?"
Open book question answering is a type of natural language based QA (NLQA) in which questions are expected to be answered with respect to a given set of open book facts and common knowledge about a topic. Recently a challenge involving such QA, OpenBookQA, has been proposed. Unlike most other NLQA tasks that focus on linguistic understanding, OpenBookQA requires deeper reasoning that combines linguistic understanding with reasoning over common knowledge. In this paper we address QA on the OpenBookQA dataset and combine state-of-the-art language models with abductive information retrieval (IR), information-gain-based re-ranking, passage selection, and weighted scoring to achieve 72.0% accuracy, an 11.6% improvement over the current state of the art.
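The pipeline in the abstract above ends by combining several signals per candidate answer. A minimal, hypothetical illustration of such a weighted-scoring step — the two-component combination, the names, and the weight value are assumptions for illustration, not the paper's actual formula:

```python
def weighted_score(lm_score, ir_score, alpha=0.7):
    """Hypothetical convex combination of a language-model answer score
    and an IR relevance score; `alpha` is an illustrative weight, not a
    tuned value from the paper."""
    return alpha * lm_score + (1 - alpha) * ir_score

def pick_answer(candidates, alpha=0.7):
    """candidates: list of (answer_text, lm_score, ir_score) triples for
    one question; return the answer with the highest weighted score."""
    return max(candidates, key=lambda c: weighted_score(c[1], c[2], alpha))[0]
```

The design intuition is the same as in the multi-passage setting: neither the language model's confidence nor the retriever's relevance score is reliable alone, so candidates are ranked by a blend of both.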
To simplify the path toward enterprise AI, organizations are turning to IBM Watson Studio and IBM Watson Machine Learning. Together they form a leading data science and machine learning platform built from the ground up for an AI-powered business, helping enterprises simplify the path from experimentation to deployment, speed data exploration, model development, and training, and scale data science operations across the lifecycle.
Before Siri and Alexa, there was Watson. Appearing as a contestant on "Jeopardy!" made IBM's Watson a household name. But since its debut -- and win -- in 2011, the computer has morphed into something else entirely: An artificial intelligence tool for business. The company opened up Watson in the cloud wars, making the technology available on competitors' clouds last month. Behind the Watson branding are career technologists making the tool work for business customers.
The increasing rate of information pollution on the Web requires novel solutions. Question Answering (QA) interfaces offer simplified, user-friendly access to information on the Web. However, like other AI applications, they are black boxes that do not expose the details of the learning or reasoning steps behind an answer. An Explainable Question Answering (XQA) system can alleviate information pollution by providing transparency into the underlying computational model and exposing an interface that enables the end user to access and validate the provenance, validity, context, circulation, interpretation, and feedback of information. This position paper sheds light on the core concepts, expectations, and challenges surrounding the following questions: (i) What is an XQA system? (ii) Why do we need XQA? (iii) When do we need XQA? (iv) How should explanations be represented? (v) How should XQA systems be evaluated?