Question Answering
Chain of Reasoning for Visual Question Answering
Wu, Chenfei, Liu, Jinlai, Wang, Xiaojie, Dong, Xuan
Reasoning plays an essential role in Visual Question Answering (VQA). Multi-step and dynamic reasoning is often necessary for answering complex questions. For example, the question "What is placed next to the bus on the right of the picture?" refers to a compound object, "bus on the right," which is generated by the relation <bus, on the right of, picture>. Furthermore, a new relation that includes this compound object, <sign, next to, bus on the right>, is then required to infer the answer. However, previous methods support either one-step or static reasoning, without updating relations or generating compound objects.
FQuAD: French Question Answering Dataset
d'Hoffschmidt, Martin, Vidal, Maxime, Belblidia, Wacim, Brendlé, Tom
Recent advances in the field of language modeling have improved state-of-the-art results on many Natural Language Processing tasks. Among them, the Machine Reading Comprehension task has made significant progress. However, most of the results are reported in English, since labeled resources available in other languages, such as French, remain scarce. In the present work, we introduce the French Question Answering Dataset (FQuAD). FQuAD is a native French reading comprehension dataset that consists of 25,000+ questions on a set of Wikipedia articles. A baseline model is trained that achieves an F1 score of 88.0% and an exact match ratio of 77.9% on the test set. The dataset is made freely available at https://fquad.illuin.tech.
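The F1 score and exact match ratio quoted above are the usual span-level metrics for reading comprehension. The sketch below shows how they are commonly computed for a single prediction. It is a minimal illustration, not the FQuAD evaluation script: the function names and the normalization (lowercasing and stripping ASCII punctuation only) are assumptions, and an official script may normalize French text differently.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and split on whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction, reference):
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted answer and a reference answer."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level scores are typically the average of these per-question values, taking the best score over all reference answers when a question has several.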
Additional Tips for Optimizing for Voice Search
And, today, we can do online searches simply by speaking to Alexa or Google Assistant. So, if you want to stay ahead of the competition, voice search optimization is something you must consider in your marketing strategies. Voice search optimization is crucial for any business because the possibilities of capturing local customers are endless. There are a few more things to keep in mind that can boost the number of times you appear on voice search results. The more data Google and other search engines have about your business, your website and your target audience, the more likely you are to appear on the Knowledge Graph Panel.
Google releases TyDi QA, a data set that aims to capture the uniqueness of languages
Google hopes to spur the development of AI capable of understanding the ways in which languages express different meanings. To this end, company researchers today detailed a data set -- TyDi QA, a question-answering data set covering 11 languages -- inspired by typological diversity, or the notion that different languages express meaning in structurally unique ways. TyDi QA is something of a complement to the English-language Natural Questions corpus Google released last year, and it attempts to capture the idiosyncrasies and features of tongues like Japanese and Arabic. The researchers point out, for instance, that English changes words to indicate one object ("book") versus many ("books"), and that Arabic has a third form to indicate if there are two of something ("كتابان", kitaban) beyond just singular ("كتاب", kitab) or plural ("كتب", kutub). "Because we selected a set of languages that are typologically distant from each other for this corpus, we expect models performing well on this dataset to generalize across a large number of the languages in the world," wrote Google Research scientist Jonathan Clark in a blog post.
Message Passing for Query Answering over Knowledge Graphs
Logic-based systems for query answering over knowledge graphs return only answers that rely on information explicitly represented in the graph. To improve recall, recent works have proposed the use of embeddings to predict additional information such as missing links or labels. These embeddings enable scoring entities in the graph as the answer to a query, without being fully dependent on the graph structure. In its simplest case, answering a query in such a setting requires predicting a link between two entities. However, link prediction is not sufficient to address complex queries that involve multiple entities and variables. To solve this task, we propose to apply a message passing mechanism to a graph representation of the query, where nodes correspond to variables and entities. This results in an embedding of the query, such that answering entities are close to it in the embedding space. The general formulation of our method allows it to encode a more diverse set of query types in comparison to previous work. We evaluate our method by answering queries that rely on edges not seen during training, obtaining competitive performance. In contrast with previous work, we show that our method can generalize from training for the single-hop, link prediction task, to answering queries with more complex structures. A qualitative analysis reveals that the learned embeddings successfully capture the notion of different entity types.
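The idea of embedding a query graph by message passing and then ranking entities by closeness can be illustrated with a toy sketch. This is not the authors' model: it uses unweighted mean aggregation, ignores relation types, sums node states into the query vector, and scores candidates by dot product, all of which are simplifying assumptions. The function names and the "?x" variable notation are hypothetical.

```python
def message_passing_step(node_vecs, edges):
    """One round: each node's new vector is the mean of its own and its neighbours' vectors."""
    neighbours = {node: [] for node in node_vecs}
    for src, dst in edges:
        # Treat edges as undirected; relation types are ignored in this sketch.
        neighbours[src].append(dst)
        neighbours[dst].append(src)
    updated = {}
    for node, vec in node_vecs.items():
        group = [vec] + [node_vecs[m] for m in neighbours[node]]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated


def embed_query(node_vecs, edges, steps=2):
    """Run a few rounds of message passing, then sum-pool node states into one query vector."""
    for _ in range(steps):
        node_vecs = message_passing_step(node_vecs, edges)
    return [sum(col) for col in zip(*node_vecs.values())]


def rank_candidates(query_vec, candidates):
    """Order candidate entity names by dot-product similarity to the query vector."""
    def score(name):
        return sum(q * x for q, x in zip(query_vec, candidates[name]))
    return sorted(candidates, key=score, reverse=True)


# A one-edge query: a known entity "e1" linked to a variable "?x".
query_nodes = {"e1": [1.0, 0.0], "?x": [0.0, 0.0]}
query_edges = [("e1", "?x")]
query_vec = embed_query(query_nodes, query_edges)
ranking = rank_candidates(query_vec, {"a": [1.0, 0.0], "b": [0.0, 1.0]})
```

In a trained system, the entity vectors and the aggregation function would be learned so that correct answers end up near the query embedding; here the vectors are fixed by hand purely to show the data flow.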
Why Overfitting is a Bad Idea and How to Avoid It (Part 1: Overfitting in general)
We want our AI models to be as accurate as they can be. That's one of the selling points of AI -- that we can encode the best version of our past knowledge and have an automated model infer and apply our judgement. How can we tell when the model is accurate enough to trust? More importantly, how can we tell if our efforts to improve accuracy are actually making the model worse? This can happen through a training problem called overfitting.
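One common way to notice that further training is making a model worse is to track the loss on held-out validation data alongside the training loss. The sketch below is a minimal illustration under that assumption; the function names, the patience rule, and the loss values are hypothetical and do not come from the article.

```python
def generalization_gaps(train_losses, val_losses):
    """Per-epoch difference between validation and training loss; a widening gap suggests overfitting."""
    return [val - train for train, val in zip(train_losses, val_losses)]


def early_stopping_epoch(val_losses, patience=2):
    """Index of the best epoch seen before validation loss fails to improve `patience` times in a row."""
    best_idx = 0
    best_loss = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_idx, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss has stopped improving
    return best_idx


# Made-up curves: training loss keeps falling while validation loss turns upward.
train = [0.90, 0.60, 0.40, 0.30, 0.20]
val = [0.95, 0.70, 0.60, 0.65, 0.70]
stop_at = early_stopping_epoch(val, patience=2)
gaps = generalization_gaps(train, val)
```

Here the training loss improves at every epoch, yet the validation loss is lowest at epoch index 2, so training beyond that point is fitting the training data at the expense of unseen data.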
IBM's Watson Center pitches AI for everyone, from chefs to engineers
At the IBM Watson Experience Center, digital and physical worlds meet in a futuristic-looking lounge overlooking San Francisco's Financial District. "Regardless of the industry you're in, there's likely an application for AI … even as a chef," said IBM's data and AI engagement lead Euniq Nebo, as he stood before a 32-foot digital screen displaying human-size images of various professionals. A chef on the screen stepped forward and came to life. Nebo spoke of questions facing a restaurant chef, such as which cutting-edge tools to invest in, or whether to incorporate local produce into a cuisine. But IBM is betting its AI can "extract the insights" from data to help its clients stay ahead of the curve, Nebo said.
How to Build an Assistant Using IBM Watson (Part 1 of 2)
In these two articles, I'll show you how to build an assistant that will control a mock smart home thermostat. These articles are intended to get you started with building assistants by creating something relevant in the real world. If you reach the end of the article and want to take a deeper dive or get help with a different chat framework such as Twilio, Drift, Lex, or something else, please leave me a comment. Below is the list of tools and services that we'll cover: I work in product management, and strictly speaking, a product manager doesn't need to possess tech skills to do their work. After all, that's what engineers are for.
Robust Explanations for Visual Question Answering
Patro, Badri N., Pate, Shivansh, Namboodiri, Vinay P.
In this paper, we propose a method to obtain robust explanations for visual question answering (VQA) that correlate well with the answers. Our model explains the answers obtained through a VQA model by providing visual and textual explanations. The main challenges that we address are: i) answers and textual explanations obtained by current methods are not well correlated, and ii) current methods for visual explanation do not focus on the right location for explaining the answer. We address both of these challenges by using a collaborative correlated module, which ensures that, even if we do not train for noise-based attacks, the enhanced correlation allows the right explanation and answer to be generated. We further show that this also aids in improving the generated visual and textual explanations. The use of the correlated module can be thought of as a robust method to verify whether the answer and explanations are coherent. We evaluate this model using the VQA-X dataset. We observe that the proposed method yields better textual and visual justification that supports the decision. We showcase the robustness of the model against a noise-based perturbation attack using corresponding visual and textual explanations. A detailed empirical analysis is shown. The source code for our model is available at https://github.com/DelTA-Lab-IITK/CCM-WACV.