LOIS: Looking Out of Instance Semantics for Visual Question Answering

Zhang, Siyu, Chen, Yeming, Sun, Yaoru, Wang, Fang, Shi, Haibo, Wang, Haoran

arXiv.org Artificial Intelligence

Visual question answering (VQA) has been intensively studied as a multimodal task that requires bridging vision and language to infer answers correctly. Recent attempts have developed various attention-based modules for solving VQA tasks, but model inference is still largely bottlenecked by visual processing for semantic understanding. Most existing detection methods rely on bounding boxes, so it remains a serious challenge for VQA models to understand the causal nexus of object semantics in images and to correctly infer contextual information. To this end, we propose a finer model framework without bounding boxes, termed Looking Out of Instance Semantics (LOIS), to tackle this issue. LOIS enables more fine-grained feature descriptions to produce visual facts. Furthermore, to overcome the label ambiguity caused by instance masks, two types of relation attention module, 1) intra-modality and 2) inter-modality, are devised to infer the correct answers from the different multi-view features. Specifically, we implement a mutual relation attention module to model sophisticated and deeper visual-semantic relations between instance objects and background information. In addition, our attention model can further analyze salient image regions by focusing on the important words in a question. Experimental results on four benchmark VQA datasets show that our method performs favorably, improving visual reasoning capability.
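The abstract gives no implementation details, but the inter-modality idea (question words attending over instance-level visual features) follows the generic scaled dot-product cross-attention pattern. Below is a minimal sketch of that pattern only; all names, shapes, and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Generic scaled dot-product attention: each query vector attends
    over all key/value pairs from the other modality."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_q, n_k) affinity matrix
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ values, weights

# Toy inter-modality step: 4 question-word embeddings attend over
# 6 instance-mask feature vectors (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
word_feats = rng.standard_normal((4, 8))
inst_feats = rng.standard_normal((6, 8))
attended, w = cross_attention(word_feats, inst_feats, inst_feats)
```

The output `attended` gives each word a visual summary weighted by its affinity to every instance feature; an intra-modality module would use the same mechanism with queries, keys, and values drawn from a single modality.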


The relational processing limits of classic and contemporary neural network models of language processing

Puebla, Guillermo, Martin, Andrea E., Doumas, Leonidas A. A.

arXiv.org Artificial Intelligence

The ability of neural networks to capture relational knowledge is a matter of long-standing controversy. Recently, some researchers on the PDP side of the debate have argued that (1) classic PDP models can handle relational structure (Rogers & McClelland, 2008, 2014) and (2) the success of deep learning approaches to text processing suggests that structured representations are unnecessary to capture the gist of human language (Rabovsky et al., 2018). In the present study we tested the Story Gestalt model (St. John, 1992), a classic PDP model of text comprehension, and a Sequence-to-Sequence with Attention model (Bahdanau et al., 2015), a contemporary deep learning architecture for text processing. Both models were trained to answer questions about stories based on the thematic roles that several concepts played in the stories. In three critical tests we varied the statistical structure of new stories while keeping their relational structure constant with respect to the training data. Each model was susceptible to each statistical-structure manipulation to a different degree, with performance falling below chance under at least one manipulation. We argue that the failures of both models are due to the fact that they cannot perform dynamic binding of independent roles and fillers. Ultimately, these results cast doubt on the suitability of traditional neural network models for explaining phenomena based on relational reasoning, including language processing.
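The "dynamic binding of independent roles and fillers" that the authors argue these networks lack is easy to state symbolically: a system with explicit bindings answers thematic-role questions by lookup, regardless of which fillers co-occurred during training. The toy sketch below illustrates that property only; it is a hypothetical example, not code or data from the study.

```python
# Explicit role-filler binding: a story is a mapping from thematic roles
# to fillers, and any filler can occupy any role at query time.
story = {"agent": "chef", "action": "eats", "patient": "salad"}

def answer(story, role):
    """Answer a thematic-role question by direct lookup on the binding."""
    return story[role]

# A statistically unlikely pairing is handled identically: swapping the
# fillers changes the answer but not the mechanism, which is exactly the
# invariance the critical tests probe in the trained networks.
odd_story = {"agent": "salad", "action": "eats", "patient": "chef"}
```

The networks under test have no such explicit bindings, so when the statistics of new stories diverge from training while the relational structure stays fixed, their answers degrade in a way this lookup scheme would not.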


Jeff Vandermeer on the delicious satire of 'Sourdough' by Robin Sloan

Los Angeles Times

In this day and age, under our current political conditions, you'd be forgiven for mistaking lightness for triteness, escape for escapism. There's a sense that our fictions should be of Earth-shattering import in the obvious ways, and this perhaps desensitizes us to other examples of subversion and narrative. It may also make us miss out on some great fiction about odd bread, an imaginary country and the processes behind making robot arms. All of which is to say that Robin Sloan's delightful new novel, "Sourdough," is the follow-up to his runaway success "Mr. Penumbra's 24-Hour Bookstore." It is that rare thing: a satire that has a love of what it satirizes while also functioning as a modern fairy tale about, of all things, the magic of certain carbohydrates. For this to be a chemical rather than physical reaction, Sloan must display a sure and natural knowledge of high-tech culture and of bread culture (in both senses). His keen insight into both automatons and organic foods stems from his immersion in the San ...