 Question Answering


Generative Relation Linking for Question Answering over Knowledge Bases

arXiv.org Artificial Intelligence

Relation linking is essential to enable question answering over knowledge bases. Although there are various efforts to improve relation linking performance, the current state-of-the-art methods do not achieve optimal results, which negatively impacts overall end-to-end question answering performance. In this work, we propose a novel approach for relation linking, framing it as a generative problem, which facilitates the use of pre-trained sequence-to-sequence models. We extend such sequence-to-sequence models with the idea of infusing structured data from the target knowledge base, primarily to enable these models to handle the nuances of the knowledge base. Moreover, we train the model to generate a structured output consisting of a list of argument-relation pairs, enabling a knowledge validation step. We compared our method against existing relation linking systems on four different datasets derived from DBpedia and Wikidata. Our method reports large improvements over the state of the art while using a much simpler model that can be easily adapted to different knowledge bases.
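As a rough illustration of the structured output and the knowledge validation step mentioned in the abstract, the sketch below parses a generated string into argument-relation pairs and keeps only the pairs found in a toy knowledge base. The output format, the `|` and `;` separators, the function names, and the toy pairs are all assumptions made for illustration; the abstract does not specify the paper's actual format.

```python
from typing import List, Set, Tuple

Pair = Tuple[str, str]

# Hypothetical toy knowledge base: (argument, relation) pairs assumed to be valid.
TOY_KB: Set[Pair] = {
    ("Barack Obama", "dbo:birthPlace"),
    ("Barack Obama", "dbo:spouse"),
}


def parse_pairs(generated: str) -> List[Pair]:
    """Parse 'arg | relation ; arg | relation' into pairs (assumed format)."""
    pairs: List[Pair] = []
    for chunk in generated.split(";"):
        if "|" not in chunk:
            continue  # skip malformed fragments
        arg, rel = chunk.split("|", 1)
        pairs.append((arg.strip(), rel.strip()))
    return pairs


def validate(pairs: List[Pair], kb: Set[Pair]) -> List[Pair]:
    """Keep only the pairs that exist in the knowledge base."""
    return [pair for pair in pairs if pair in kb]


generated = "Barack Obama | dbo:birthPlace ; Barack Obama | dbo:author"
valid_pairs = validate(parse_pairs(generated), TOY_KB)
```

Here the second generated pair is dropped because the toy knowledge base does not contain it, which is the kind of filtering a validation step could perform.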


Evaluation Metrics: Assessing the quality of NLG outputs

#artificialintelligence

In machine learning, as in most other fields, even unrelated ones, we need some form of evaluation. Think of a student taking an exam, a car in a crash test, a web server under a load test, or the performance evaluation of an AI model. Evaluation methods differ among these fields, and the evaluation criteria are designed accordingly. Evaluation is needed mainly to assess the quality of a model's outputs and to compare outputs across different models or setups. Natural Language Generation (NLG), a field in Natural Language Processing (NLP), is an applied subfield of artificial intelligence where the goal is to produce a textual output. It has many subtasks, such as machine translation (MT), question answering (QA), summarization, and question generation (QG). Here, the discussion is about the performance of models whose outputs are text.


Mounting Video Metadata on Transformer-based Language Model for Open-ended Video Question Answering

arXiv.org Artificial Intelligence

Video question answering has recently received a lot of attention from multimodal video researchers. Most video question answering datasets are in multiple-choice form. However, a model for the multiple-choice task does not infer the answer; rather, it compares the answer candidates to pick the correct one. Furthermore, this format makes it difficult to extend the model to other tasks. In this paper, we challenge the existing multiple-choice video question answering by changing it to open-ended video question answering. To tackle open-ended question answering, we use the pretrained GPT2 model. The model is fine-tuned with video inputs and subtitles. An ablation study is performed by converting the existing DramaQA dataset to an open-ended question answering format, and it shows that performance can be improved using video metadata.


Artificial Intelligence Has Revolutionized Our Life Over The Past Decades:

#artificialintelligence

Artificial Intelligence refers to the ability of any machine or computer to mimic human capabilities such as recognizing objects, making decisions, and solving problems. The past decade has witnessed the great rise of Artificial Intelligence. The technology has made an impact in almost every field. The two major reasons for the rapid growth of AI in this decade are data and compute. IBM Watson, a natural language question-answering computer, competes on Jeopardy and defeats two former champions. Watson is a significant leap in a machine's ability to understand context in human language.


Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management

arXiv.org Artificial Intelligence

Recent advances in Natural Language Processing (NLP), and specifically automated Question Answering (QA) systems, have demonstrated both impressive linguistic fluency and a pernicious tendency to reflect social biases. In this study, we introduce Q-Pain, a dataset for assessing bias in medical QA in the context of pain management, one of the most challenging forms of clinical decision-making. Along with the dataset, we propose a new, rigorous framework, including a sample experimental design, to measure the potential biases present when making treatment decisions. We demonstrate its use by assessing two reference Question-Answering systems, GPT-2 and GPT-3, and find statistically significant differences in treatment between intersectional race-gender subgroups, thus reaffirming the risks posed by AI in medical settings, and the need for datasets like ours to ensure safety before medical AI applications are deployed.


MuSiQue: Multi-hop Questions via Single-hop Question Composition

arXiv.org Artificial Intelligence

To build challenging multi-hop question answering datasets, we propose a bottom-up semi-automatic process of constructing multi-hop questions via composition of single-hop questions. Constructing multi-hop questions as compositions of single-hop questions allows us to exercise greater control over the quality of the resulting multi-hop questions. This process allows building a dataset with (i) connected reasoning where each step needs the answer from a previous step; (ii) minimal train-test leakage by eliminating even partial overlap of reasoning steps; (iii) variable number of hops and composition structures; and (iv) contrasting unanswerable questions by modifying the context. We use this process to construct a new multi-hop QA dataset: MuSiQue-Ans with ~25K 2-4 hop questions using seed questions from 5 existing single-hop datasets. Our experiments demonstrate that MuSiQue is challenging for state-of-the-art QA models (e.g., human-machine gap of ~30 F1 pts), significantly harder than existing datasets (2x human-machine gap), and substantially less cheatable (e.g., a single-hop model is worse by 30 F1 pts). We also build an even more challenging dataset, MuSiQue-Full, consisting of answerable and unanswerable contrast question pairs, where model performance drops further by 13+ F1 pts. For data and code, see https://github.com/stonybrooknlp/musique.
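The idea of connected reasoning, where each step needs the answer from a previous step, can be sketched in a few lines. This is only a toy illustration under stated assumptions: the `SingleHop` class, the `compose` function, the `#1` placeholder, and the string-matching rule are hypothetical and are not taken from the MuSiQue code, whose actual pipeline is semi-automatic and considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SingleHop:
    """A single-hop question with its answer (hypothetical structure)."""
    question: str
    answer: str


def compose(first: SingleHop, second: SingleHop) -> Optional[Tuple[List[str], str]]:
    """Chain two single-hop questions into a 2-hop decomposition.

    The pair is composable here only if the first answer is mentioned in the
    second question; that mention is replaced by the placeholder '#1' so the
    second step depends on solving the first.
    """
    if first.answer not in second.question:
        return None  # the two questions are not connected
    steps = [first.question, second.question.replace(first.answer, "#1")]
    return steps, second.answer


hop1 = SingleHop("Who wrote Hamlet?", "William Shakespeare")
hop2 = SingleHop("Where was William Shakespeare born?", "Stratford-upon-Avon")
composed = compose(hop1, hop2)
```

Reversing the order of the two questions returns `None`, since the answer to the second question does not appear in the first, so no dependency between the steps can be formed.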


Add Voice Search to Improve e-Commerce Engagement

#artificialintelligence

In our high-speed, multi-tasking culture, fundamental shifts are happening in the way we interact with search technology. Today, mobile devices are the source of 60% of all online searches. As voice-to-text technology has improved on smartphones and other devices, so has adoption of voice-based commands. In their research recap "Prepare for the Voice Revolution," PWC reports that the majority of survey respondents said searching online with voice assistants--like Apple's Siri and Amazon's Alexa--is easier, more convenient, and faster than speaking to a human or texting on a phone. Although younger mobile-first consumers are driving adoption, PWC states, they aren't using the tech as frequently as their 55 counterparts.


An Online Question Answering System based on Sub-graph Searching

arXiv.org Artificial Intelligence

Knowledge graphs (KGs) have been widely used for question answering (QA) applications, especially entity-based QA. However, searching for answers in an entire large-scale knowledge graph is very time-consuming, and it is hard to meet the speed requirements of real online QA systems. In this paper, we design a sub-graph searching mechanism to solve this problem by creating a sub-graph index, so that each answer generation step is restricted to the sub-graph level. We apply this mechanism in a real online QA chat system, where it brings an obvious improvement in question coverage by answering entity-based questions well, and it runs at a very high speed, which ensures a good user experience for online QA.
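A minimal sketch of the general idea, assuming a sub-graph is simply the set of triples whose subject is a given entity: the index is built once, and each lookup then touches only that entity's triples instead of scanning the whole graph. The function names, the one-hop sub-graph definition, and the toy triples are assumptions for illustration and do not describe the system's actual index.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def build_subgraph_index(triples: List[Triple]) -> Dict[str, List[Triple]]:
    """Group triples by subject so each entity maps to its own small sub-graph."""
    index: Dict[str, List[Triple]] = defaultdict(list)
    for subj, rel, obj in triples:
        index[subj].append((subj, rel, obj))
    return dict(index)


def answer(index: Dict[str, List[Triple]], entity: str, relation: str) -> List[str]:
    """Search only the entity's sub-graph rather than the full knowledge graph."""
    return [obj for _, rel, obj in index.get(entity, []) if rel == relation]


# Toy knowledge graph (hypothetical data).
toy_triples = [
    ("France", "capital", "Paris"),
    ("France", "continent", "Europe"),
    ("Japan", "capital", "Tokyo"),
]
toy_index = build_subgraph_index(toy_triples)
capital_of_france = answer(toy_index, "France", "capital")
```

The design trade-off is the usual one for indexing: extra memory and a one-time build cost in exchange for lookups whose cost depends on the size of one sub-graph rather than the whole graph.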


QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension

arXiv.org Artificial Intelligence

Alongside huge volumes of research on deep learning models in NLP in recent years, there has also been much work on the benchmark datasets needed to track modeling progress. Question answering and reading comprehension have been particularly prolific in this regard, with over 80 new datasets appearing in the past two years. This study is the largest survey of the field to date. We provide an overview of the various formats and domains of the current resources, highlighting the current lacunae for future work. We further discuss the current classifications of "reasoning types" in question answering and propose a new taxonomy. We also discuss the implications of over-focusing on English, and survey the current monolingual resources for other languages and multilingual resources. The study is aimed both at practitioners looking for pointers to the wealth of existing data and at researchers working on new resources.


Now You Can Use Any Language With IBM Watson Assistant

#artificialintelligence

When venturing into the field of chatbots and Conversational AI, the process usually starts with a search for available frameworks. Invariably this leads you to one of the big cloud chatbot service providers. Most probably you will end up using IBM Watson Assistant, Microsoft LUIS/Bot Framework, Google Dialogflow, etc. There are advantages: these environments offer easy entry in terms of cost and a low-code or no-code approach. However, one big impediment you often run into with these environments is the lack of diversity when it comes to language options. This changed on 17 June 2021, when IBM introduced the Universal language model.