 Question Answering


2020 No-Code AI & Machine Learning Using IBM Watson AutoAI

#artificialintelligence

In this course I am going to introduce you to Watson Studio AutoAI by IBM. Artificial Intelligence (AI) and Machine Learning (ML) are two very hot topics nowadays. Experts claim that AI and ML are going to revolutionize the world. This course is designed for those who want to take a shortcut to these technologies. AutoAI and AutoML are new tools that provide methods and processes to make Artificial Intelligence and Machine Learning available to non-experts.


What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets

arXiv.org Machine Learning

Question answering biases in video QA datasets can mislead multimodal models into overfitting to QA artifacts and jeopardize the models' ability to generalize. Understanding how strong these QA biases are and where they come from helps the community measure progress more accurately and provides researchers with insights to debug their models. In this paper, we analyze QA biases in popular video question answering datasets and discover that pretrained language models can answer 37-48% of questions correctly without using any multimodal context information, far exceeding the 20% random guess baseline for 5-choose-1 multiple-choice questions. Our ablation study shows that biases can come from annotators and from the type of question. Specifically, questions written by annotators seen during training are better predicted by the model, and abstract reasoning questions incur more bias than factual, direct questions. We also show empirically that using annotator-non-overlapping train-test splits can reduce QA biases for video QA datasets.
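The annotator-non-overlapping split mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the `annotator_id` field, the function names, and the 20% test fraction are assumptions made for illustration. The helper `random_guess_accuracy` only restates the 1-in-5 baseline from the abstract.

```python
import random
from collections import defaultdict


def annotator_disjoint_split(examples, test_fraction=0.2, seed=0):
    """Split examples so that no annotator contributes to both train and test.

    `examples` is a list of dicts carrying a hypothetical "annotator_id" key.
    """
    # Group examples by the annotator who wrote them.
    by_annotator = defaultdict(list)
    for ex in examples:
        by_annotator[ex["annotator_id"]].append(ex)

    # Shuffle annotators (not examples) so whole annotators move together.
    annotators = sorted(by_annotator)
    random.Random(seed).shuffle(annotators)

    n_test = max(1, round(len(annotators) * test_fraction))
    test_ids = set(annotators[:n_test])

    train = [ex for a in annotators if a not in test_ids for ex in by_annotator[a]]
    test = [ex for a in annotators if a in test_ids for ex in by_annotator[a]]
    return train, test


def random_guess_accuracy(num_choices):
    """Expected accuracy of uniform guessing on an N-way multiple-choice question."""
    return 1.0 / num_choices
```

Splitting at the annotator level rather than the example level is what prevents a model from exploiting an individual annotator's writing habits at test time.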


Learn How To Optimize Your Content for Voice Search

#artificialintelligence

Voice search is on the rise. We have all known this for ages, but the voice search market may actually be growing faster than we expected. Throughout the last few years, smart speakers have been taking the market by storm. With the emergence of Amazon's Alexa (powered by Bing search), Google Home, and Apple's HomePod (both powered by Google), voice search is naturally seeing unprecedented growth. And the growth is likely to surge in the next few years. Companies have already started to fine-tune their marketing and SEO strategies to accommodate this new technology.


Visual Question Answering as a Multi-Task Problem

arXiv.org Artificial Intelligence

Visual Question Answering (VQA) is a highly complex problem set, relying on many sub-problems to produce reasonable answers. In this paper, we present the hypothesis that Visual Question Answering should be viewed as a multi-task problem, and provide evidence to support this hypothesis. We demonstrate this by reformatting two commonly used Visual Question Answering datasets, COCO-QA and DAQUAR, into a multi-task format and training two baseline networks on these reformatted datasets, with one network designed specifically to eliminate other possible causes for performance changes as a result of the reformatting. Though the networks demonstrated in this paper do not achieve strongly competitive results, we find that the multi-task approach to Visual Question Answering results in increases in performance of 5-9% against the single-task formatting, and that the networks reach convergence much faster than in the single-task case. Finally, we discuss possible reasons for the observed difference in performance, and perform additional experiments which rule out causes not associated with the learning of the dataset as a multi-task problem.


Scene Graph Reasoning for Visual Question Answering

arXiv.org Machine Learning

Visual question answering is concerned with answering free-form questions about an image. Since it requires a deep linguistic understanding of the question and the ability to associate it with various objects that are present in the image, it is an ambitious task and requires techniques from both computer vision and natural language processing. We propose a novel method that approaches the task by performing context-driven, sequential reasoning based on the objects and their semantic and spatial relationships present in the scene. As a first step, we derive a scene graph which describes the objects in the image, as well as their attributes and their mutual relationships. A reinforcement agent then learns to autonomously navigate over the extracted scene graph to generate paths, which are then the basis for deriving answers. We conduct a first experimental study on the challenging GQA dataset with manually curated scene graphs, where our method almost reaches the level of human performance.
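To make the scene-graph idea concrete, here is a minimal sketch of a graph of objects and relations, with a path traced through it. It is an illustration only: the triple format, the function names, and the hand-written relation sequence are assumptions. In the paper, a learned reinforcement agent chooses each step; this sketch replaces that agent with a fixed list of relations.

```python
from collections import defaultdict


def build_scene_graph(triples):
    """Build an adjacency list from (subject, relation, object) triples."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph


def follow_path(graph, start, relations):
    """Walk from `start` along the given relation names.

    Returns the list of visited nodes, or None if some relation
    has no outgoing edge at the current node.
    """
    node = start
    path = [node]
    for rel in relations:
        targets = [obj for r, obj in graph.get(node, []) if r == rel]
        if not targets:
            return None
        node = targets[0]  # take the first match; a real agent would score options
        path.append(node)
    return path
```

For a question such as "What is the thing the man holds resting on?", the final node of the path would serve as the answer candidate.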


Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering

arXiv.org Artificial Intelligence

Answering questions that involve multi-step reasoning requires decomposing them and using the answers of intermediate steps to reach the final answer. However, state-of-the-art models in grounded question answering often do not explicitly perform decomposition, leading to difficulties in generalization to out-of-distribution examples. In this work, we propose a model that computes a representation and denotation for all question spans in a bottom-up, compositional manner using a CKY-style parser. Our model effectively induces latent trees, driven by end-to-end (the answer) supervision only. We show that this inductive bias towards tree structures dramatically improves systematic generalization to out-of-distribution examples compared to strong baselines on an arithmetic expressions benchmark as well as on CLOSURE, a dataset that focuses on systematic generalization of models for grounded question answering. On this challenging dataset, our model reaches an accuracy of 92.8%, significantly higher than prior models that almost perfectly solve the task on a random, in-distribution split.
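The bottom-up, CKY-style computation over all question spans can be sketched as a chart filled in order of increasing span length. This is a toy version under stated assumptions: representations are plain floats, and `compose` and `score` are hypothetical stand-ins for the learned neural modules the paper describes. Each span's value is a softmax-weighted mixture over its possible split points, which is one common way to keep the tree latent.

```python
import math


def cky_chart(leaves, compose, score):
    """Compute a value for every span (i, j) of the input, bottom-up.

    leaves  : list of floats, one per token (toy stand-in for embeddings)
    compose : function combining a left and a right span value
    score   : function assigning an unnormalized score to a composed value
    """
    n = len(leaves)
    # Spans of length 1 are the tokens themselves.
    chart = {(i, i + 1): leaves[i] for i in range(n)}

    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # One candidate per split point k with i < k < j.
            candidates = [compose(chart[(i, k)], chart[(k, j)]) for k in range(i + 1, j)]
            scores = [score(c) for c in candidates]
            # Numerically stable softmax over split points.
            m = max(scores)
            weights = [math.exp(s - m) for s in scores]
            z = sum(weights)
            chart[(i, j)] = sum(w * c for w, c in zip(weights, candidates)) / z
    return chart
```

The chart holds n(n+1)/2 entries for n tokens, and the entry for the full span `(0, n)` plays the role of the representation of the whole question.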


Correction of Faulty Background Knowledge based on Condition Aware and Revise Transformer for Question Answering

arXiv.org Artificial Intelligence

The study of question answering has received increasing attention in recent years. This work focuses on providing an answer that is compatible with both user intent and conditioning information corresponding to the question, such as delivery status and stock information in e-commerce. However, these conditions may be wrong or incomplete in real-world applications. Although existing question answering systems have considered external information, such as categorical attributes and triples in a knowledge base, they all assume that the external information is correct and complete. To alleviate the effect of defective condition values, this paper proposes the condition aware and revise Transformer (CAR-Transformer). CAR-Transformer (1) revises each condition value based on the whole conversation and the original condition values, and (2) encodes the revised conditions and uses the condition embeddings to select an answer. Experimental results on a real-world customer service dataset demonstrate that the CAR-Transformer can still select an appropriate reply when the conditions corresponding to the question contain wrong or missing values, and that it substantially outperforms baseline models on automatic and human evaluations. The proposed CAR-Transformer can be extended to other NLP tasks which need to consider conditioning information.


Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network

arXiv.org Artificial Intelligence

In this paper, we present a two-stage model for multi-hop question answering. The first stage is a hierarchical graph network, which is used to reason over multi-hop questions and is capable of capturing different levels of granularity using the natural structure of documents (i.e., paragraphs, questions, sentences and entities). The reasoning process is converted into a node classification task (i.e., over paragraph nodes and sentence nodes). The second stage is a language model fine-tuning task. In short, stage one uses a graph neural network to select and concatenate supporting sentences into one paragraph, and stage two finds the answer span within the language model fine-tuning paradigm.


GLTR from MIT-IBM Watson AI Lab and HarvardNLP

#artificialintelligence

Obviously, GLTR is not perfect. Its main limitation is its limited scale. It won't be able to automatically detect large-scale abuse, only individual cases. Moreover, it requires at least an advanced knowledge of the language to know whether an uncommon word does make sense at a position. Our assumption is also limited in that it assumes a simple sampling scheme.


IBM Teams Up With Ad Council for AI-Powered Program

#artificialintelligence

Randi Stipes, CMO at IBM Watson Advertising, explained that "Call for Creative" is IBM's commitment "to help the advertising industry reemerge stronger from Covid-19." Through this initiative, the tech company ultimately wants to demonstrate how artificial intelligence can drive positive change when used in a purposeful way, geared toward helping the ad industry get back on its feet after the detrimental effects of Covid-19. IBM had debuted the award-winning Advertising Accelerator tools with Watson earlier this year and gave access to the Ad Council, which it is partnering with for this project. The Accelerator harnesses AI to "continuously learn and predict the optimal combination of creative elements to help brands deploy more effective digital campaigns based on key signals like consumer reaction, weather and time of day," a statement from the company said. Brands that leveraged Accelerator experienced a 25% increase in performance throughout a campaign along with a 10% lift in site visits after one week, the statement continued.