gSCAN
* says "It will be of great help to the improvement of the generalization ability of the
We thank each reviewer for taking the time to thoughtfully comment on our work, and we are glad that they recognize its relevance to NLU tasks, such as teaching autonomous agents to perform tasks by demonstration. R2 wonders whether these results just show an artifact of the data; split C can shed light on this: GECA adds a lot of red squares to the training set. R2 also points out some things that are unclear in the experiments section. It is true that the models also perform well on the random split (A), which we left unsaid but will now state explicitly. Finally, we thank R2 for pointing out two missing links in Figure 1, which we will update accordingly.
Review for NeurIPS paper: A Benchmark for Systematic Generalization in Grounded Language Understanding
The paper proposes a new benchmark, gSCAN, for learning compositional and grounded natural language understanding. The argument is that situated language understanding (grounding) is necessary to evaluate compositional generalization. The benchmark evaluates eight types of compositional generalization. The conclusion is that gSCAN can serve as a useful benchmark.
Strengths:
• A new benchmark dataset is created.
Improved Compositional Generalization by Generating Demonstrations for Meta-Learning
Sam Spilsbury, Alexander Ilin
Meta-learning and few-shot prompting are viable methods to induce certain types of compositional behaviour. However, these methods can be very sensitive to the choice of support examples used. Choosing good supports from the training data for a given test query is already a difficult problem, but in some cases solving this may not even be enough. We consider a grounded language learning problem (gSCAN) where good support examples for certain test splits might not even exist in the training data, or would be infeasible to search for. We design an agent which instead generates possible supports which are relevant to the test query and current state of the world, then uses these supports via meta-learning to solve the test query. We show substantially improved performance on a previously unsolved compositional behaviour split without a loss of performance on other splits. Further experiments show that in this case, searching for relevant demonstrations even with an oracle function is not sufficient to attain good performance when using meta-learning.
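The generate-then-meta-learn pipeline the abstract describes can be caricatured in a few lines of Python. Everything here (the word-overlap "generator", the copy-the-best-support "meta-learner", the toy instruction pool) is a hypothetical stand-in for the paper's learned components, not its implementation:

```python
def generate_supports(query, pool):
    """Toy stand-in for a support generator: emit (instruction, demonstration)
    pairs that share words with the query. In the paper, supports are produced
    by a generative model rather than retrieved; this is only a sketch."""
    q = set(query.split())
    return [(ins, demo) for ins, demo in pool if q & set(ins.split())]

def solve_with_supports(query, supports):
    """Toy 'meta-learner': act like the support whose instruction overlaps
    most with the query."""
    q = set(query.split())
    best = max(supports, key=lambda s: len(q & set(s[0].split())))
    return best[1]

pool = [("walk to the red circle", ["WALK", "WALK", "WALK"]),
        ("push the red square", ["WALK", "PUSH"]),
        ("jump to the blue circle", ["JUMP", "JUMP"])]
query = "push the blue square"
actions = solve_with_supports(query, generate_supports(query, pool))
# actions == ["WALK", "PUSH"]
```

The point of the sketch is only the pipeline shape: supports are constructed for the query at hand, then consumed by a model conditioned on them, rather than looked up in a fixed training set.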
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
Ankur Sikarwar, Arkil Patel, Navin Goyal
Humans can reason compositionally whilst grounding language utterances to the real world. Recent benchmarks like ReaSCAN use navigation tasks grounded in a grid world to assess whether neural models exhibit similar capabilities. In this work, we present a simple transformer-based model that outperforms specialized architectures on ReaSCAN and a modified version of gSCAN. On analyzing the task, we find that identifying the target location in the grid world is the main challenge for the models. Furthermore, we show that a particular split in ReaSCAN, which tests depth generalization, is unfair. On an amended version of this split, we show that transformers can generalize to deeper input structures. Finally, we design a simpler grounded compositional generalization task, RefEx, to investigate how transformers reason compositionally. We show that a single self-attention layer with a single head generalizes to novel combinations of object attributes. Moreover, we derive a precise mathematical construction of the transformer's computations from the learned network. Overall, we provide valuable insights about the grounded compositional generalization task and the behaviour of transformers on it, which would be useful for researchers working in this area.
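The claim that a single self-attention head can compose object attributes can be illustrated with a from-scratch head. This is a deliberately minimal sketch (identity Q/K/V projections, pure Python), not the paper's model:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_head(tokens):
    """Minimal single-head self-attention with identity Q/K/V projections:
    each token's output is a softmax-weighted mix of all token vectors."""
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        w = softmax(scores)                 # attention over all tokens
        weights.append(w)
        outputs.append([sum(wi * v[j] for wi, v in zip(w, tokens))
                        for j in range(d)])
    return outputs, weights

# Toy "object attribute" embeddings: [is_red, is_circle]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outs, attn = attention_head(tokens)
```

With learned projections, such a head can score every object against a referring expression's attributes in one step, which is the intuition behind the paper's RefEx analysis.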
Think before you act: A simple baseline for compositional generalization
Christina Heinze-Deml, Diane Bouchacourt
Contrary to humans, who have the ability to recombine familiar expressions to create novel ones, modern neural networks struggle to do so. This has been emphasized recently with the introduction of the benchmark dataset "gSCAN" (Ruis et al. 2020), aiming to evaluate models' performance at compositional generalization in grounded language understanding. In this work, we challenge the gSCAN benchmark by proposing a simple model that achieves surprisingly good performance on two of the gSCAN test splits. Our model is based on the observation that, to succeed on gSCAN tasks, the agent must (i) identify the target object (think) before (ii) navigating to it successfully (act). Concretely, we propose an attention-inspired modification of the baseline model from Ruis et al. (2020), together with an auxiliary loss, that takes into account the sequential nature of steps (i) and (ii). While two compositional tasks are trivially solved with our approach, we also find that the other tasks remain unsolved, validating the relevance of gSCAN as a benchmark for evaluating models' compositional abilities.
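The think-then-act idea, with an auxiliary loss on target identification, can be sketched as follows. All names, feature layouts, and the dot-product scorer are illustrative assumptions, not the paper's architecture:

```python
import math

def think_before_act(cell_feats, cmd, true_target):
    """Sketch of a 'think' step: score each grid cell against a command
    vector, softmax over cells, and supervise the choice with an auxiliary
    cross-entropy loss before any navigation ('act') is decoded."""
    scores = [sum(f * c for f, c in zip(feats, cmd)) for feats in cell_feats]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]              # attention over grid cells
    aux_loss = -math.log(probs[true_target])   # auxiliary target-id loss
    return probs, aux_loss

# Three cells described by [is_red, is_small] features; command asks for "red".
cells = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
cmd = [2.0, 0.0]
probs, loss = think_before_act(cells, cmd, true_target=0)
```

Training then minimizes the action-sequence loss plus this auxiliary term, so the model is explicitly pushed to locate the target before acting.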
Systematic Generalization on gSCAN with Language Conditioned Embedding
Tong Gao, Qi Huang, Raymond J. Mooney
Systematic Generalization refers to a learning algorithm's ability to extrapolate learned behavior to unseen situations that are distinct but semantically similar to its training data. As shown in recent work, state-of-the-art deep learning models fail dramatically even on tasks for which they are designed when the test set is systematically different from the training data. We hypothesize that explicitly modeling the relations between objects in their contexts while learning their representations will help achieve systematic generalization. Therefore, we propose a novel method that learns objects' contextualized embeddings with dynamic message passing conditioned on the input natural language and end-to-end trainable with other downstream deep learning modules. To our knowledge, this model is the first one that significantly outperforms the provided baseline and reaches state-of-the-art performance on grounded-SCAN (gSCAN), a grounded natural language navigation dataset designed to require systematic generalization in its test splits.
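One round of language-conditioned message passing over object embeddings can be sketched as below. The dot-product edge scorer, the residual tanh update, and the shapes are assumptions chosen for brevity, not the paper's message-passing network:

```python
import math

def message_passing_step(H, cmd):
    """One round of message passing over object states H: edge weights are a
    softmax over neighbour scores conditioned on the command vector, and each
    object's state is updated with the aggregated messages."""
    n, d = len(H), len(H[0])
    new_H = []
    for i in range(n):
        # Score each neighbour j by how well it matches the command.
        scores = [sum(h * c for h, c in zip(H[j], cmd)) for j in range(n)]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [wi / z for wi in w]                       # soft edge weights
        msg = [sum(w[j] * H[j][k] for j in range(n)) for k in range(d)]
        new_H.append([math.tanh(h + mk) for h, mk in zip(H[i], msg)])
    return new_H

H = [[1.0, 0.0], [0.0, 1.0]]   # two objects, 2-dim contextual states
cmd = [1.0, 0.0]               # "attend to the first attribute"
H1 = message_passing_step(H, cmd)
```

Stacking several such rounds lets each object's embedding absorb information about the other objects that the instruction makes relevant, which is the intuition behind contextualized object embeddings.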
Compositional Networks Enable Systematic Generalization for Grounded Language Understanding
Yen-Ling Kuo, Boris Katz, Andrei Barbu
Humans are remarkably flexible when understanding new sentences that include combinations of concepts they have never encountered before. Recent work has shown that while deep networks can mimic some human language abilities when presented with novel sentences, systematic variation uncovers the limitations in the language-understanding abilities of neural networks. We demonstrate that these limitations can be overcome by addressing the generalization challenges in a recently-released dataset, gSCAN, which explicitly measures how well a robotic agent is able to interpret novel ideas grounded in vision, e.g., novel pairings of adjectives and nouns. The key principle we employ is compositionality: that the compositional structure of networks should reflect the compositional structure of the problem domain they address, while allowing all other parameters and properties to be learned end-to-end with weak supervision. We build a general-purpose mechanism that enables robots to generalize their language understanding to compositional domains. Crucially, our base network has the same state-of-the-art performance as prior work, 97% execution accuracy, while at the same time generalizing its knowledge when prior work does not; for example, achieving 95% accuracy on novel adjective-noun compositions where previous work has 55% average accuracy. Robust language understanding without dramatic failures and without corner cases is critical to building safe and fair robots; we demonstrate the significant role that compositionality can play in achieving that goal.
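Why factoring the network by words helps with novel adjective-noun pairings can be shown with a toy: independent per-word detectors combined multiplicatively. The detector functions and attribute dictionary here are hypothetical illustrations, not the paper's networks:

```python
def score(obj, adj, noun, adj_fns, noun_fns):
    """Compositional scoring sketch: one independent detector per word,
    combined by multiplication, so a novel adjective-noun pairing needs no
    new training as long as each word was seen separately."""
    return adj_fns[adj](obj) * noun_fns[noun](obj)

# Toy per-word detectors over symbolic object attributes.
adj_fns = {"red": lambda o: float(o["color"] == "red"),
           "small": lambda o: float(o["size"] <= 2)}
noun_fns = {"circle": lambda o: float(o["shape"] == "circle"),
            "square": lambda o: float(o["shape"] == "square")}

obj = {"color": "red", "size": 1, "shape": "square"}
# "red square" may never co-occur in training, yet the factored score works:
novel = score(obj, "red", "square", adj_fns, noun_fns)
```

A monolithic scorer trained only on observed adjective-noun pairs has no such guarantee, which is the gap the compositional-network design targets.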