generalise
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Evaluating alignment between humans and neural network representations in image-based learning tasks
Humans represent scenes and objects in rich feature spaces, carrying information that allows us to generalise about category memberships and abstract functions with few examples. What determines whether a neural network model generalises like a human? We tested how well the representations of 86 pretrained neural network models mapped to human learning trajectories across two tasks where humans had to learn continuous relationships and categories of natural images. In these tasks, both human participants and neural networks successfully identified the relevant stimulus features within a few trials, demonstrating effective generalisation. We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation. Intrinsic dimensionality of representations had different effects on alignment for different model types. Lastly, we tested three sets of human-aligned representations and found no consistent improvements in predictive accuracy compared to the baselines. In conclusion, pretrained neural networks can serve as sources of representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks. Both our paradigms and modelling approach offer a novel way to quantify alignment between neural networks and humans and extend cognitive science into more naturalistic domains.
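The mapping from pretrained representations to human responses described above can be sketched as a regularised linear readout fitted to held-out behaviour. A minimal sketch, assuming synthetic stand-ins throughout: the dimensions, the ridge penalty, and the correlation-based alignment score below are illustrative choices, not the paper's actual models or fitting procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in a real analysis these would be embeddings of
# natural images extracted from a pretrained network, paired with continuous
# human judgements collected for the same images.
n_images, embed_dim = 200, 32
embeddings = rng.normal(size=(n_images, embed_dim))
true_readout = rng.normal(size=embed_dim)
human_ratings = embeddings @ true_readout + rng.normal(scale=0.1, size=n_images)

def fit_ridge(X, y, lam=1.0):
    """Fit a regularised linear readout from model representations to human
    responses - one simple way to quantify representational alignment."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Fit on the first 150 images, evaluate on the held-out 50.
w = fit_ridge(embeddings[:150], human_ratings[:150])
pred = embeddings[150:] @ w
# Alignment score: correlation between predicted and held-out human ratings.
alignment = np.corrcoef(pred, human_ratings[150:])[0, 1]
```

A model whose representations carry the features humans rely on will yield a high held-out correlation; shuffling the embeddings relative to the ratings would drive it toward zero.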
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
TRA: Better Length Generalisation with Threshold Relative Attention
Opper, Mattia, Fernandez, Roland, Smolensky, Paul, Gao, Jianfeng
We test whether these limitations can be explained by two key failures of the self-attention mechanism. The first is the inability to fully remove irrelevant information. The second is tied to position: even if the dot product between a key and a query is highly negative (i.e. an irrelevant key), learned positional biases may unintentionally up-weight such information, which becomes dangerous when distances fall out of distribution. Together, these two failure cases lead to compounding generalisation difficulties. We test whether they can be mitigated through the combination of (a) selective sparsity, which removes irrelevant keys from the attention softmax entirely, and (b) contextualised relative distance, where distance is measured only between the query and the keys that matter. We show how refactoring the attention mechanism with these two mitigations in place can substantially improve the generalisation capabilities of decoder-only transformers.
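The two mitigations can be illustrated with a toy single-query attention step. This is a sketch of the idea as stated in the abstract, not the paper's formulation: the score threshold `tau` and the linear distance penalty are our own illustrative assumptions.

```python
import numpy as np

def threshold_relative_attention(q, K, V, tau=0.0):
    """Toy single-query sketch of the two mitigations:
    (a) selective sparsity - keys scoring below a threshold are removed
        from the softmax entirely, and
    (b) contextualised relative distance - positions are re-counted over
        the surviving keys only, so distances stay in-distribution even
        when the raw sequence grows. `tau` and the 0.1 distance penalty
        are illustrative choices, not the paper's exact formulation."""
    scores = K @ q / np.sqrt(q.shape[0])
    keep = scores > tau                      # (a) drop irrelevant keys
    if not keep.any():
        return np.zeros_like(V[0])
    kept_scores = scores[keep]
    # (b) distance measured among kept keys only: the most recent kept key
    # is at distance 0, the one before it at distance 1, and so on.
    rel_dist = np.arange(keep.sum())[::-1]
    biased = kept_scores - 0.1 * rel_dist    # illustrative distance penalty
    w = np.exp(biased - biased.max())
    w /= w.sum()
    return w @ V[keep]
```

Because pruned keys never enter the softmax, a strongly negative key cannot be revived by a positional bias, which is exactly the failure case described above.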
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Middle East > Jordan (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (2 more...)
TrajAware: Graph Cross-Attention and Trajectory-Aware for Generalisable VANETs under Partial Observations
Fu, Xiaolu, Bao, Ziyuan, Kanjo, Eiman
Vehicular ad hoc networks (VANETs) are a crucial component of intelligent transportation systems; however, routing remains challenging due to dynamic topologies, incomplete observations, and the limited resources of edge devices. Existing reinforcement learning (RL) approaches often assume fixed graph structures and require retraining when network conditions change, making them unsuitable for deployment on constrained hardware. We present TrajAware, an RL-based framework designed for edge AI deployment in VANETs. TrajAware integrates three components: (i) action space pruning, which reduces redundant neighbour options while preserving two-hop reachability, alleviating the curse of dimensionality; (ii) graph cross-attention, which maps pruned neighbours to the global graph context, producing features that generalise across diverse network sizes; and (iii) trajectory-aware prediction, which uses historical routes and junction information to estimate real-time positions under partial observations. We evaluate TrajAware in the open-source SUMO simulator using real-world city maps with a leave-one-city-out setup. Results show that TrajAware achieves near-shortest paths and high delivery ratios while maintaining efficiency suitable for constrained edge devices, outperforming state-of-the-art baselines in both full and partial observation scenarios.
Communication and routing are challenging in a vehicular ad hoc network (VANET) [1], as vehicles can observe only part of the network, and the network's structure shifts rapidly; a previously obtained observation may soon become obsolete (as shown by Figure 1). Although, compared to classical software algorithms, RL routing algorithms can potentially deal with more complex objectives (e.g., optimising delay while minimising the bandwidth overhead) [2], the problems of partial observation and network dynamics put a strain on RL routing models.
Several studies have shown that graph neural networks (GNNs) generalise better on routing tasks compared to other neural networks like multilayer perceptrons (MLPs) [3]-[7].
This work will be submitted to the IEEE for possible publication. Xiaolu Fu is an AI research engineer at Unicom Data Intelligence, China Unicom, Hangzhou, China (fuxl67@chinaunicom.cn), and a former student of the Computing Department, Imperial College London, London, UK (email: andy.fu23@alumni.imperial.ac.uk). Ziyuan Bao is an independent researcher and a former MSc student of the Computing Department, Imperial College London, London, UK (email: ziyuan.bao23@alumni.imperial.ac.uk).
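The action-space pruning idea (component i) can be sketched as keeping only enough next-hop neighbours to cover everything reachable within two hops. This is a hedged illustration, not TrajAware's published algorithm: the greedy covering rule and the `adj`/`prune_actions` names are our own assumptions.

```python
def prune_actions(adj, node):
    """Illustrative sketch of action-space pruning with two-hop reachability
    preserved: greedily keep neighbours until every node reachable within
    two hops of `node` is still reachable through some kept neighbour.
    `adj` maps each node to its list of neighbours. The greedy rule here is
    an assumption for illustration, not TrajAware's exact procedure."""
    neighbours = adj[node]
    # Two-hop targets each neighbour gives access to (including itself).
    reach = {n: {n} | set(adj[n]) for n in neighbours}
    kept, covered = [], set()
    # Greedily keep any neighbour that still adds uncovered targets,
    # trying the widest-reaching neighbours first.
    for n in sorted(neighbours, key=lambda m: -len(reach[m])):
        gain = reach[n] - covered
        if gain:
            kept.append(n)
            covered |= gain
    # Every two-hop target lies in some reach[n], so `covered` ends up
    # equal to the full two-hop set: reachability is preserved.
    return kept
```

Redundant neighbours (those whose entire two-hop reach is already covered) are dropped, which shrinks the RL action space without cutting off any destination.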
- Europe > United Kingdom > England > Greater London > London (0.44)
- Asia > China > Zhejiang Province > Hangzhou (0.24)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- (6 more...)
- Telecommunications (1.00)
- Transportation > Infrastructure & Services (0.88)
- Transportation > Ground > Road (0.68)
- Leisure & Entertainment > Games > Computer Games (0.46)
Evaluating Compositional Generalisation in VLMs and Diffusion Models
Pearson, Beth, Boulbarss, Bilal, Wray, Michael, Lewis, Martha
A fundamental aspect of the semantics of natural language is that novel meanings can be formed from the composition of previously known parts. Vision-language models (VLMs) have made significant progress in recent years; however, there is evidence that they are unable to perform this kind of composition. For example, given an image of a red cube and a blue cylinder, a VLM such as CLIP is likely to incorrectly label the image as a red cylinder or a blue cube, indicating it represents the image as a 'bag-of-words' and fails to capture compositional semantics. Diffusion models have recently gained significant attention for their impressive generative abilities, and zero-shot classifiers based on diffusion models have been shown to perform competitively with CLIP in certain compositional tasks. In this work we explore whether the generative Diffusion Classifier has improved compositional generalisation abilities compared to discriminative models. We assess three models -- Diffusion Classifier, CLIP, and ViLT -- on their ability to bind objects with attributes and relations in both zero-shot learning (ZSL) and generalised zero-shot learning (GZSL) settings. Our results show that the Diffusion Classifier and ViLT perform well at concept binding tasks, but that all models struggle significantly with the relational GZSL task, underscoring the broader challenges VLMs face with relational reasoning. Analysis of CLIP embeddings suggests that the difficulty may stem from overly similar representations of relational concepts such as left and right. Code and dataset are available at: https://github.com/otmive/diffusion_classifier_clip
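The discriminative zero-shot setup being probed above can be sketched as cosine-similarity scoring of an image against candidate compositional captions. With a real VLM such as CLIP the embeddings would come from its image and text encoders; the vectors here are toy stand-ins, and `zero_shot_choice` is a hypothetical helper name.

```python
import numpy as np

def zero_shot_choice(image_emb, caption_embs):
    """Minimal sketch of discriminative zero-shot classification: score an
    image embedding against each candidate caption embedding by cosine
    similarity and pick the best-matching caption. The embeddings here are
    illustrative vectors, not outputs of an actual VLM encoder."""
    img = np.asarray(image_emb, dtype=float)
    img = img / np.linalg.norm(img)
    caps = np.asarray(caption_embs, dtype=float)
    caps = caps / np.linalg.norm(caps, axis=1, keepdims=True)
    sims = caps @ img
    return int(np.argmax(sims)), sims

# A 'bag-of-words' encoder would embed "red cube and blue cylinder" and
# "blue cube and red cylinder" almost identically, so the similarity margin
# between the correct and attribute-swapped caption collapses - the failure
# mode the compositional benchmarks above are designed to expose.
```

The benchmarks effectively measure whether that margin survives when the candidate captions differ only in how attributes or relations are bound to objects.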
- Europe > Netherlands > North Holland > Amsterdam (0.40)
- North America > United States (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (5 more...)