"Questions are asked and answered every day. Question answering (QA) technology aims to deliver the same facility online. It goes further than the more familiar search based on keywords (as in Google, Yahoo, and other search engines), in attempting to recognize what a question expresses and to respond with an actual answer. This simplifies things for users in two ways. First, questions do not often translate into a simple list of keywords. ...Second, QA takes responsibility for providing answers, rather than a searchable list of links to potentially relevant documents (web pages), highlighted by snippets of text that show how the query matched the documents."
– from Bonnie Webber & Nick Webb. Question Answering. In The Handbook of Computational Linguistics and Natural Language Processing. Alexander Clark, Chris Fox, Shalom Lappin (Eds.). Wiley, 2010.
At the IBM Watson Experience Center, digital and physical worlds meet in a futuristic-looking lounge overlooking San Francisco's Financial District. "Regardless of the industry you're in, there's likely an application for AI … even as a chef," said IBM's data and AI engagement lead Euniq Nebo, as he stood before a 32-foot digital screen displaying human-size images of various professionals. A chef on the screen stepped forward and came to life. Nebo spoke of questions facing a restaurant chef, such as which cutting-edge tools to invest in, or whether to incorporate local produce into a cuisine. But IBM is betting its AI can "extract the insights" from data to help its clients stay ahead of the curve, Nebo said.
In these two articles, I'll show you how to build an assistant that will control a mock smart home thermostat. These articles are intended to get you started with building assistants by creating something relevant in the real world. If you reach the end of the article and want to take a deeper dive or get help with a different chat framework such as Twilio, Drift, Lex, or something else, please leave me a comment. Below is the list of tools and services that we'll cover:
I work in product management, and strictly speaking, a product manager doesn't need to possess tech skills to do their work. After all, that's what engineers are for.
It's impossible to talk about artificial intelligence without mentioning IBM's Watson. A pioneer in cognitive computing, the American computer giant has found multiple health applications for Watson. Pascal Sempé, senior sales consultant for Watson Health Solutions in France, explained how Watson functions and what's at stake. ME e-mag: Could Watson ever replace doctors? Pascal Sempé: Watson is a tool that helps the doctor, certainly not one that tells the doctor what to do.
Visual Question Answering is a challenging problem requiring a combination of concepts from Computer Vision and Natural Language Processing. Most existing approaches use a two-stream strategy, computing image and question features that are subsequently merged using a variety of techniques. Nonetheless, very few rely on higher-level image representations, which can capture semantic and spatial relationships. In this paper, we propose a novel graph-based approach for Visual Question Answering. Our method combines a graph learner module, which learns a question-specific graph representation of the input image, with the recent concept of graph convolutions, aiming to learn image representations that capture question-specific interactions.
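The two ingredients named in this abstract, a question-conditioned graph over image regions and a graph convolution over that graph, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names `learn_graph` and `graph_conv`, the element-wise conditioning, the top-k neighbour rule, and the mean aggregation. The paper's modules have learned parameters that this sketch omits.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def learn_graph(region_feats, question_vec, k=1):
    """Build a question-specific adjacency matrix (hypothetical, parameter-free).

    Each region is conditioned on the question by an element-wise product,
    then linked to its k most similar regions under that conditioning.
    """
    cond = [[r * q for r, q in zip(feat, question_vec)] for feat in region_feats]
    n = len(cond)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        scores = [(dot(cond[i], cond[j]), j) for j in range(n) if j != i]
        # Highest similarity first; ties broken by the lower region index.
        scores.sort(key=lambda s: (-s[0], s[1]))
        for _, j in scores[:k]:
            adjacency[i][j] = 1.0
    return adjacency


def graph_conv(region_feats, adjacency):
    """One simplified graph convolution: average a node with its neighbours."""
    out = []
    for i, feat in enumerate(region_feats):
        neighbours = [region_feats[j] for j, a in enumerate(adjacency[i]) if a > 0]
        group = [feat] + neighbours
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


regions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # made-up region features
question = [1.0, 0.0]                            # made-up question embedding
adj = learn_graph(regions, question, k=1)
updated = graph_conv(regions, adj)
```

With these made-up inputs, the first two regions link to each other because the question emphasises the first feature dimension, so their updated features are pulled together.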
Query expansion is a long-studied approach for improving retrieval effectiveness by enhancing the user's original query with additional related terms. Current algorithms for automatic query expansion have been shown to consistently improve retrieval accuracy on average, but are highly unstable and have bad worst-case performance for individual queries. We introduce a novel risk framework that formulates query model estimation as a constrained metric labeling problem on a graph of term relations. The model combines assignment costs based on a baseline feedback algorithm, edge weights based on term similarity, and simple constraints to enforce aspect balance, aspect coverage, and term centrality. Results across multiple standard test collections show consistent and dramatic reductions in the number and magnitude of expansion failures, while retaining the strong positive gains of the baseline algorithm.
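For readers new to query expansion, the sketch below shows the kind of baseline feedback step the abstract builds on: terms that occur often in the top-ranked documents are appended to the query. It is not the constrained metric labeling framework described above. The function name `expand_query`, its parameters, and the toy documents are all assumptions for illustration.

```python
from collections import Counter


def expand_query(query_terms, feedback_docs, num_new_terms=2, stopwords=frozenset()):
    """Append the most frequent feedback terms to the query (naive baseline).

    query_terms   : list of lowercase terms in the original query
    feedback_docs : texts of top-ranked documents from a first retrieval pass
    """
    counts = Counter()
    for doc in feedback_docs:
        for term in doc.lower().split():
            if term not in query_terms and term not in stopwords:
                counts[term] += 1
    # Most frequent first; alphabetical order breaks ties deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ranked[:num_new_terms]]


docs = ["jaguar car engine", "jaguar car dealer", "jaguar cat habitat"]
expanded = expand_query(["jaguar"], docs, num_new_terms=2)
```

The toy query also hints at the instability the abstract mentions: the added terms mix two senses of "jaguar", which could hurt a user who meant only one of them.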
Visual Question Answering (VQA) deep-learning systems tend to capture superficial statistical correlations in the training data because of strong language priors, and they fail to generalize to test data with a significantly different question-answer (QA) distribution. To address this issue, we introduce a self-critical training objective that ensures that visual explanations of correct answers match the most influential image regions more than those of other competitive answer candidates. The influential regions are determined either from human visual/textual explanations or automatically from just the significant words in the question and answer. We evaluate our approach on the VQA generalization task using the VQA-CP dataset, achieving a new state of the art, i.e., 49.5% using textual explanations and 48.5% using automatically [...] Papers published at the Neural Information Processing Systems Conference.
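One way to picture the self-critical idea in the abstract above is as a penalty that fires whenever a competing answer depends on the influential region more strongly than the correct answer does. The hinge form below, the function name `self_critical_penalty`, and the influence scores are assumptions for illustration; the abstract does not give the exact objective.

```python
def self_critical_penalty(region_influence, correct_answer, influential_region):
    """Hinge-style penalty (hypothetical form, not the paper's exact loss).

    region_influence   : dict mapping answer -> list of per-region influence scores
    correct_answer     : key of the ground-truth answer
    influential_region : index of the region marked as most influential
    """
    correct_score = region_influence[correct_answer][influential_region]
    penalty = 0.0
    for answer, scores in region_influence.items():
        if answer == correct_answer:
            continue
        # Penalise only competitors that rely on the region more than the correct answer.
        penalty += max(0.0, scores[influential_region] - correct_score)
    return penalty


influence = {
    "red": [0.2, 0.7],    # made-up scores for the correct answer
    "blue": [0.1, 0.9],   # competitor that leans harder on region 1
    "green": [0.3, 0.4],  # competitor that leans less on region 1
}
penalty = self_critical_penalty(influence, "red", influential_region=1)
```

Here only "blue" contributes to the penalty, because it is the only competitor whose score on the influential region exceeds that of the correct answer.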
Visual Question Answering (VQA) is a notoriously challenging problem because it involves various heterogeneous tasks defined by questions within a unified framework. Learning specialized models for individual types of tasks is intuitively attractive but surprisingly difficult; it is not straightforward to outperform a naive independent ensemble approach. We present a principled algorithm to learn specialized models with knowledge distillation under a multiple choice learning (MCL) framework, where training examples are assigned dynamically to a subset of models for updating network parameters. The assigned models are trained to predict ground-truth answers, while the non-assigned models are trained to imitate their own base models before specialization. Our approach alleviates the limitation of data deficiency in existing MCL frameworks, and allows each model to learn its own specialized expertise without forgetting general knowledge.
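The assignment rule described in this abstract can be sketched for a single training example as follows. Each model's task loss is computed, the k models with the lowest loss are assigned the example and keep that loss, and every other model is instead pulled towards its own base model's prediction. The function names, the choice of cross-entropy and KL divergence, the KL direction, and the weight `beta` are assumptions made for illustration rather than details taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the ground-truth label."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same answers."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mcl_kd_loss(model_probs, base_probs, label, k=1, beta=1.0):
    """Per-example loss under the assumed MCL-with-distillation rule.

    model_probs : current answer distribution of each specialised model
    base_probs  : answer distribution of each model's base (pre-specialisation) copy
    """
    task_losses = [cross_entropy(p, label) for p in model_probs]
    order = sorted(range(len(model_probs)), key=lambda m: task_losses[m])
    assigned = set(order[:k])
    total = 0.0
    for m, probs in enumerate(model_probs):
        if m in assigned:
            total += task_losses[m]  # assigned: fit the ground truth
        else:
            total += beta * kl_divergence(base_probs[m], probs)  # others: imitate base
    return total, assigned


model_probs = [[0.9, 0.1], [0.4, 0.6]]  # made-up predictions of two models
base_probs = [[0.5, 0.5], [0.4, 0.6]]   # made-up predictions of their base models
loss, assigned = mcl_kd_loss(model_probs, base_probs, label=0, k=1)
```

In this toy case model 0 is assigned the example, and model 1 contributes nothing because it still matches its base model exactly.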
We propose a method for automatically answering questions about images by bringing together recent advances from natural language processing and computer vision. We combine discrete reasoning with uncertain predictions by a multi-world approach that represents uncertainty about the perceived world in a Bayesian framework. Our approach can handle human questions of high complexity about realistic scenes and replies with a range of answers such as counts, object classes, instances, and lists of them. The system is directly trained from question-answer pairs. We establish a first benchmark for this task that can be seen as a modern attempt at a visual Turing test.
Our method aims at reasoning over natural language questions and visual images. Given a natural language question about an image, our model updates the question representation iteratively by selecting image regions relevant to the query and learns to give the correct answer. Our model contains several reasoning layers, exploiting complex visual relations in the visual question answering (VQA) task. The proposed network is end-to-end trainable through back-propagation, where its weights are initialized using a pre-trained convolutional neural network (CNN) and gated recurrent unit (GRU). Our method is evaluated on the challenging COCO-QA and VQA datasets and yields state-of-the-art performance.
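The iterative update described in this abstract can be pictured as repeated attention over image regions. The question vector scores each region, the scores are normalised with a softmax, and the attended region summary is folded back into the question vector before the next layer. The sketch below is a simplification under stated assumptions: the names `reasoning_layer` and `iterative_reasoning`, the dot-product scoring, and the residual sum are illustrative, and the learned weights of the actual network are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reasoning_layer(question, regions):
    """One attention step: select relevant regions and update the question."""
    scores = [sum(q * r for q, r in zip(question, region)) for region in regions]
    weights = softmax(scores)
    attended = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(len(question))
    ]
    # Assumed update rule: add the attended visual summary to the question vector.
    updated = [q + a for q, a in zip(question, attended)]
    return updated, weights


def iterative_reasoning(question, regions, num_layers=2):
    """Apply several reasoning layers in sequence."""
    weights = []
    for _ in range(num_layers):
        question, weights = reasoning_layer(question, regions)
    return question, weights


regions = [[1.0, 0.0], [0.0, 1.0]]  # made-up region features
question = [1.0, 0.0]               # made-up question embedding
_, weights_one = iterative_reasoning(question, regions, num_layers=1)
_, weights_two = iterative_reasoning(question, regions, num_layers=2)
```

With these inputs the attention on the first region grows from one layer to two, which is the sharpening effect that stacking reasoning layers is meant to provide.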