"Questions are asked and answered every day. Question answering (QA) technology aims to deliver the same facility online. It goes further than the more familiar search based on keywords (as in Google, Yahoo, and other search engines), in attempting to recognize what a question expresses and to respond with an actual answer. This simplifies things for users in two ways. First, questions do not often translate into a simple list of keywords. ...Second, QA takes responsibility for providing answers, rather than a searchable list of links to potentially relevant documents (web pages), highlighted by snippets of text that show how the query matched the documents."
– from Bonnie Webber & Nick Webb. Question Answering. In The Handbook of Computational Linguistics and Natural Language Processing. Alexander Clark, Chris Fox, Shalom Lappin (Eds.). Wiley, 2010.
Question answering (QA) systems provide a way of querying information available in various formats, including, but not limited to, unstructured and structured data, using natural language. QA is a core component of conversational artificial intelligence (AI), which has led to the emergence of a dedicated research topic, Conversational Question Answering (CQA), in which a system must understand a given context and then engage in multi-turn QA to satisfy the user's information needs. While most existing research has focused on single-turn QA, multi-turn QA has recently gained attention and prominence owing to the availability of large-scale multi-turn QA datasets and the development of pre-trained language models. With many new models and research papers added to the literature each year, there is a pressing need to organize and present the related work in a unified manner to streamline future research. This survey is therefore an effort to present a comprehensive review of state-of-the-art research trends in CQA, based primarily on papers published from 2016 to 2021.
Automatic math problem solving has recently attracted increasing attention as a long-standing AI benchmark. In this paper, we focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge. Existing methods, however, are highly dependent on handcrafted rules and have only been evaluated on small-scale datasets. We therefore propose GeoQA, a Geometric Question Answering dataset containing 5,010 geometric problems with corresponding annotated programs that illustrate the solving process for each problem. GeoQA is 25 times larger than GeoS, the other publicly available dataset, and its program annotations provide a practical testbed for future research on explicit and explainable numerical reasoning.
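To illustrate what an "annotated program" for explainable numerical reasoning can look like, here is a minimal, stdlib-only sketch of a step-by-step program interpreter. The program format (tuples of an operation name plus operand indices) is hypothetical and is not GeoQA's actual annotation schema; it only shows how a solution process can be made explicit and executable.

```python
import math

# Each operation maps a name to a numeric function.
OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
    "sqrt": lambda a: math.sqrt(a),
}

def execute(constants, program):
    """Run a solving program step by step. Each step reads the given
    constants and all intermediate results by index, so every number
    in the final answer can be traced back to an explicit operation."""
    memory = list(constants)
    for op, *args in program:
        memory.append(OPS[op](*(memory[i] for i in args)))
    return memory[-1]

# Example: hypotenuse of a right triangle with legs 3 and 4.
# Indices 0 and 1 are the constants; later indices are step results.
program = [("mul", 0, 0), ("mul", 1, 1), ("add", 2, 3), ("sqrt", 4)]
print(execute([3, 4], program))  # -> 5.0
```

Because every intermediate value is stored, the full derivation (9, 16, 25, 5.0) can be shown to a student, which is the kind of transparency program annotations make possible.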
Video Question Answering (VideoQA) is a challenging video understanding task, since it requires a deep understanding of both the question and the video. Previous studies mainly focus on extracting sophisticated visual and language embeddings and fusing them with delicate hand-crafted networks. However, the relevance of different frames, objects, and modalities to the question varies over time, which most existing methods ignore. This lack of understanding of the dynamic relationships and interactions among objects poses a great challenge for the VideoQA task. To address this problem, we propose a novel Relation-aware Hierarchical Attention (RHA) framework to learn both the static and dynamic relations of the objects in videos. In particular, videos and questions are first embedded by pre-trained models to obtain visual and textual features. A graph-based relation encoder is then used to extract static relationships between visual objects. To capture the dynamic changes of multimodal objects across video frames, we consider temporal, spatial, and semantic relations, and fuse the multimodal features with a hierarchical attention mechanism to predict the answer. We conduct extensive experiments on a large-scale VideoQA dataset, and the results demonstrate that RHA outperforms state-of-the-art methods.
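The core idea that a question should weight video frames differently can be sketched with a single temporal-attention step. This stdlib-only toy (not the RHA architecture itself, and the vectors are invented) shows how question-frame relevance scores become softmax weights that pool the frame features:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(question_vec, frame_vecs):
    """Temporal attention: score each frame against the question,
    normalize the scores, and return the weighted sum of frame features."""
    weights = softmax([dot(question_vec, f) for f in frame_vecs])
    dim = len(frame_vecs[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frame_vecs))
              for d in range(dim)]
    return weights, pooled

# Toy example: the first frame is more relevant to the question.
weights, pooled = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(weights)  # first frame gets the larger weight
```

A hierarchical model applies this same operation at several levels (objects within a frame, frames within a video, then across modalities), which is what "hierarchical attention" refers to in the abstract above.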
During the British summer, conversations about sport become almost ubiquitous. This year, however, one participant in those conversations was very different: IBM Watson, IBM's cognitive computing system. The All England Lawn Tennis Club knew that 2016 would feature unusually fierce competition for attention, with the Tour de France and Euro 2016 taking place alongside Wimbledon. More than ever before, social media was going to be a vital tool in directing that conversation, and directing attention to SW19. Wimbledon's "Cognitive Command Centre", powered by Watson's intelligence running on a hybrid, IBM-managed cloud, scanned social media for emerging news and trends.
Conversational agent (CA) systems have been applied in the healthcare domain, but despite the widespread use of dietary supplements (DS), no such system exists to answer consumer questions about DS use. In this study, we develop the first CA system for DS use. Methods: Our CA system, built on the MindMeld framework, consists of three components: question understanding, a DS knowledge base, and answer generation. We collected and annotated 1,509 questions to develop the natural language understanding module (e.g., a question type classifier and a named entity recognizer), which was then integrated into the MindMeld framework. The CA then queries the DS knowledge base (i.e., iDISK) and generates answers using rule-based slot-filling techniques. We evaluated the algorithms of each component and the CA system as a whole. Results: A CNN is the best question classifier, with an F1 score of 0.81, and a CRF is the best named entity recognizer, with an F1 score of 0.87. The system achieves an overall accuracy of 81% and an average score of 1.82, with a succ@3 score of 76.2% and a succ@2 score of approximately 66%. Conclusion: This study develops the first CA system for DS use, using the MindMeld framework and the iDISK domain knowledge base.
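The pipeline described above (classify the question type, recognize the supplement entity, query a knowledge base, fill an answer template) can be sketched in a few lines. Everything below is a hypothetical toy: the mini knowledge base stands in for iDISK, and the keyword rules stand in for the CNN classifier and CRF recognizer the study actually trained.

```python
# Hypothetical mini knowledge base standing in for iDISK.
KB = {
    "ginkgo": {"usage": "often taken for memory support",
               "interaction": "may interact with blood thinners"},
    "melatonin": {"usage": "commonly used as a sleep aid",
                  "interaction": "may interact with sedatives"},
}

# Answer templates keyed by question type (rule-based slot filling).
TEMPLATES = {
    "usage": "{supplement} is {fact}.",
    "interaction": "Caution: {supplement} {fact}.",
}

def classify_question(question):
    """Toy keyword rule standing in for the CNN question-type classifier."""
    return "interaction" if "interact" in question.lower() else "usage"

def extract_supplement(question):
    """Toy dictionary lookup standing in for the CRF named entity recognizer."""
    for name in KB:
        if name in question.lower():
            return name
    return None

def answer(question):
    qtype = classify_question(question)
    supplement = extract_supplement(question)
    if supplement is None:
        return "Sorry, I don't know that supplement."
    fact = KB[supplement][qtype]
    return TEMPLATES[qtype].format(supplement=supplement.capitalize(), fact=fact)

print(answer("What is ginkgo used for?"))
# -> Ginkgo is often taken for memory support.
```

The design point is the separation of concerns: swapping the toy rules for trained models changes only `classify_question` and `extract_supplement`, while the knowledge base and slot-filling templates stay the same.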
A commonly observed problem with state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper, we present an approach to address factual consistency in summarization. We first propose an efficient automatic evaluation metric to measure factual consistency; next, we propose a novel learning algorithm that maximizes the proposed metric during model training. Through extensive experiments, we confirm that our method is effective in improving factual consistency and even the overall quality of the summaries, as judged by both automatic metrics and human evaluation.
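To make the notion of an automatic factual-consistency metric concrete, here is a deliberately crude, stdlib-only baseline: the fraction of a summary's content words that also appear in the source. This is only a lexical proxy for illustration; it is not the metric proposed in the paper, and real metrics rely on entailment or QA models rather than word overlap.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "was",
             "on", "for", "with", "that", "by", "it", "at", "as"}

def consistency_score(summary, document):
    """Fraction of the summary's content words that also appear in the
    source document. Words absent from the source are treated as
    potential hallucinations."""
    def content_words(text):
        return {w for w in re.findall(r"[a-z]+", text.lower())
                if w not in STOPWORDS}
    s = content_words(summary)
    if not s:
        return 0.0
    return len(s & content_words(document)) / len(s)

doc = "The company reported a profit of 5 million dollars in 2020."
print(consistency_score("The company reported a profit.", doc))    # 1.0
print(consistency_score("The company reported a big loss.", doc))  # 0.5
```

A differentiable or reward-style version of such a score is what makes "maximize the metric during training" possible; word overlap itself would reward extractive copying, which is exactly why learned metrics are preferred.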
This course focuses on using state-of-the-art natural language processing (NLP) techniques to solve the problem of question generation in edtech. Open any middle school textbook and, at the end of every chapter, you will find assessment questions such as MCQs, true/false questions, fill-in-the-blanks, and match-the-following exercises. In this course, we will see how to take any text content and generate these assessment questions using NLP techniques. This will be a very practical use case of NLP, putting basic techniques like word vectors (word2vec, GloVe, etc.) as well as recent advancements like BERT, OpenAI GPT-2, and T5 transformers to real-world use. We will use NLP libraries such as spaCy, NLTK, AllenNLP, and HuggingFace Transformers.
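The simplest of the question types above, fill-in-the-blank, can be generated with nothing but the standard library: pick the most frequent content words as answers and blank them out of the sentence where they occur. This is a minimal sketch of the idea, not the course's actual pipeline (which uses the NLP libraries and transformer models listed above); the stopword list and sample text are illustrative.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are",
             "for", "on", "with", "that", "this", "as", "by", "it"}

def generate_fill_in_blanks(text, max_questions=3):
    """Pick the most frequent content words as answers and blank each
    one out of the first sentence that contains it."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]
    freq = Counter(w for w in words if w not in STOPWORDS and len(w) > 3)
    keywords = [w for w, _ in freq.most_common(max_questions)]
    sentences = re.split(r"(?<=[.!?])\s+", text)
    questions = []
    for kw in keywords:
        for sent in sentences:
            if re.search(rf"\b{kw}\b", sent, re.IGNORECASE):
                blanked = re.sub(rf"\b{kw}\b", "_____", sent,
                                 flags=re.IGNORECASE)
                questions.append((blanked, kw))
                break
    return questions

sample = ("Photosynthesis converts light energy into chemical energy. "
          "Plants perform photosynthesis in their chloroplasts. "
          "Chloroplasts contain the pigment chlorophyll.")
for question, answer in generate_fill_in_blanks(sample):
    print(question, "->", answer)
```

Word frequency is the weakest possible keyword extractor; swapping in noun chunks or named entities from spaCy, or an answer-aware T5 model, upgrades the same blank-and-answer structure to the quality the course targets.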
Searching only for documents is outdated. Users who have already adopted a question-answering (QA) approach on their personal devices, e.g., those powered by Alexa, Google Assistant, or Siri, are also coming to appreciate the advantages of using a "search engine" the same way in a business context. Doing so allows them not only to search for documents, but also to obtain precise answers to specific questions. QA systems respond to questions posed in natural language. This technology is already widely adopted and is now rapidly gaining importance in the business environment, where the most obvious added value of a conversational AI platform is an improved customer experience.
Visual question answering (VQA) is a task that combines techniques from both computer vision and natural language processing. It requires models to answer a text-based question using the information contained in a visual input. In recent years, the research field of VQA has expanded: work examining reasoning ability and VQA on scientific diagrams has received growing attention, and more multimodal feature fusion mechanisms have been proposed. This paper reviews and analyzes existing datasets, metrics, and models proposed for the VQA task.
The world is going through a cybersecurity pandemic. No day passes without a hack or data theft being carried out, discovered, or begrudgingly announced. High-profile victims abound – from the PlayStation Network, hacked in 2011, to Dropbox's 2012 breach, to the 500-million-user data theft Yahoo! suffered in 2014, two years before going public about the hack. Those carrying out the attacks have honed their craft to create ever more sophisticated hacking tools. According to a recent study by security consultancy Juniper Research, cybercrime is expected to balloon into a $2.1 trillion (£1.7 trillion) industry by 2019.