Question Answering
Can Open Domain Question Answering Systems Answer Visual Knowledge Questions?
Zhang, Jiawen, Mishra, Abhijit, S, Avinesh P. V., Patwardhan, Siddharth, Agarwal, Sachin
The task of Outside Knowledge Visual Question Answering (OKVQA) requires an automatic system to answer natural language questions about pictures and images using external knowledge. We observe that many visual questions, which contain deictic referential phrases referring to entities in the image, can be rewritten as "non-grounded" questions and can be answered by existing text-based question answering systems. This allows for the reuse of existing text-based Open Domain Question Answering (QA) Systems for visual question answering. In this work, we propose a potentially data-efficient approach that reuses existing systems for (a) image analysis, (b) question rewriting, and (c) text-based question answering to answer such visual questions. Given an image and a question pertaining to that image (a visual question), we first extract the entities present in the image using pre-trained object and scene classifiers. Using these detected entities, the visual questions can be rewritten so as to be answerable by open domain QA systems. We explore two rewriting strategies: (1) an unsupervised method using BERT for masking and rewriting, and (2) a weakly supervised approach that combines adaptive rewriting and reinforcement learning techniques to use the implicit feedback from the QA system. We test our strategies on the publicly available OKVQA dataset and obtain a competitive performance with state-of-the-art models while using only 10% of the training data.
Calvanese
In order to meet usability requirements, most logic-based applications provide explanation facilities for reasoning services. This holds also for DLs, where research has focused on the explanation of both TBox reasoning and, more recently, query answering. Besides explaining the presence of a tuple in a query answer, it is important to explain also why a given tuple is missing. We address the latter problem for (conjunctive) query answering over DL-Lite ontologies, by adopting abductive reasoning, that is, we look for additions to the ABox that force a given tuple to be in the result. As reasoning taskswe consider existence and recognition of an explanation, and relevance and necessity of a certain assertion for an explanation. We characterize the computational complexity of these problems for subset minimal and cardinality minimal explanations.
Li
With Community Question Answering (CQA) evolving into a quite popular method for information seeking and providing, it also becomes a target for spammers to disseminate promotion campaigns. Although there are a number of quality estimation efforts on the CQA platform, most of these works focus on identifying and reducing low-quality answers, which are mostly generated by impatient or inexperienced answerers. However, a large number of promotion answers appear to provide high-quality information to cheat CQA users in future interactions. Therefore, most existing quality estimation works in CQA may fail to detect these specially designed answers or question-answer pairs. In contrast to these works, we focus on the promotion channels of spammers, which include (shortened) URLs, telephone numbers and social media accounts. Spammers rely on these channels to connect to users to achieve promotion goals so they are irreplaceable for spamming activities. We propose a propagation algorithm to diffuse promotion intents on an "answerer-channel" bipartite graph and detect possible spamming activities. A supervised learning framework is also proposed to identify whether a QA pair is spam based on propagated promotion intents. Experimental results based on more than 6 million entries from a popular Chinese CQA portal show that our approach outperforms a number of existing quality estimation methods for detecting promotion campaigns on both the answer level and QA pair level.
Singla
How should we gather information in a network, where each node's visibility is limited to its local neighborhood? This problem arises in numerous real-world applications, such as surveying and task routing in social networks, team formation in collaborative networks and experimental design with dependency constraints. Often the informativeness of a set of nodes can be quantified via a submodular utility function. Existing approaches for submodular optimization, however, require that the set of all nodes that can be selected is known ahead of time, which is often unrealistic. In contrast, we propose a novel model where we start our exploration from an initial node, and new nodes become visible and available for selection only once one of their neighbors has been chosen. We then present a general algorithm \elgreedy for this problem, and provide theoretical bounds on its performance dependent on structural properties of the underlying network. We evaluate our methodology on various simulated problem instances as well as on data collected from social question answering system deployed within a large enterprise.
Sun
A Knowledge Graph (KG), popularly used in both industry and academia, is an effective representation of knowledge. It consists of a collection of knowledge elements, each of which in turn is extracted from the web or other sources. Information extractors that use natural language processing techniques or other complex algorithms are usually noisy. That is, the vast number of knowledge elements extracted from the web may not only be associated with different confidence values but may also be inconsistent with each other. Many applications such as question answering systems that are built on top of large-scale KGs are required to reason efficiently about these confidence values and inconsistencies.
Burghardt
Crowdsourcing can identify high-quality solutions to problems; however, individual decisions are constrained by cognitive biases. We investigate some of these biases in an experimental model of a question-answering system. We observe a strong position bias in favor of answers appearing earlier in a list of choices. This effect is enhanced by three cognitive factors: the attention an answer receives, its perceived popularity, and cognitive load, measured by the number of choices a user has to process. While separately weak, these effects synergistically amplify position bias and decouple user choices of best answers from their intrinsic quality. We end our paper by discussing the novel ways we can apply these findings to substantially improve how high-quality answers are found in question-answering systems.
JaQuAD: Japanese Question Answering Dataset for Machine Reading Comprehension
So, ByungHoon, Byun, Kyuhong, Kang, Kyungwon, Cho, Seongjin
Question Answering (QA) is a task in which a machine understands a given document and a question to find an answer. Despite impressive progress in the NLP area, QA is still a challenging problem, especially for non-English languages due to the lack of annotated datasets. In this paper, we present the Japanese Question Answering Dataset, JaQuAD, which is annotated by humans. JaQuAD consists of 39,696 extractive question-answer pairs on Japanese Wikipedia articles. We finetuned a baseline model which achieves 78.92% for F1 score and 63.38% for EM on test set. The dataset and our experiments are available at https://github.com/SkelterLabsInc/JaQuAD.
How IBM's Watson Went From the Future of Health Care to Sold Off for Parts
Most likely, you're familiar with Watson from the IBM computer system's appearance on Jeopardy! in 2011, when it beat former champions Ken Jennings and Brad Rudder. Watson Health was supposed to change health care in a lot of important ways, by providing insight to oncologists about care for cancer patients, delivering insight to pharmaceutical companies about drug development, helping to match patients with clinical trials, and more. It sounded revolutionary, but it never really worked. Recently, Watson Health was, essentially, sold for parts: Francisco Partners, a private equity firm, bought some of Watson's data and analytics products for what Bloomberg News said was more than $1 billion. On Friday's episode of What Next: TBD, I spoke with Casey Ross, technology correspondent for Stat News, who has been covering Watson Health for years, about how Watson went from being the future of health care to being sold for scraps.
The Downfall of One of the World's Biggest Brains
Ten years ago, IBM made a gamble. Through a monumental advertising and PR campaign, it promised that its AI technology–Watson–would transform the health care industry as we know it. A decade and billions of dollars later, Watson Health is being sold for parts. What went wrong with IBM's "moonshot?" And what does Watson's failure tell us about the promise of AI for health care?