generalise
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Evaluating alignment between humans and neural network representations in image-based learning tasks
Humans represent scenes and objects in rich feature spaces, carrying information that allows us to generalise about category memberships and abstract functions with few examples. What determines whether a neural network model generalises like a human? We tested how well the representations of 86 pretrained neural network models mapped to human learning trajectories across two tasks where humans had to learn continuous relationships and categories of natural images. In these tasks, both human participants and neural networks successfully identified the relevant stimulus features within a few trials, demonstrating effective generalisation. We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation. Intrinsic dimensionality of representations had different effects on alignment for different model types. Lastly, we tested three sets of human-aligned representations and found no consistent improvements in predictive accuracy compared to the baselines. In conclusion, pretrained neural networks can serve as sources of representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks. Both our paradigms and modelling approach offer a novel way to quantify alignment between neural networks and humans and extend cognitive science into more naturalistic domains.
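The mapping from pretrained representations to human responses described above can be sketched as a regularised linear readout fitted to held-out behaviour. A minimal sketch, assuming synthetic stand-ins throughout: the dimensions, the ridge penalty, and the correlation-based alignment score below are illustrative choices, not the paper's actual models or fitting procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in a real analysis these would be embeddings of
# natural images extracted from a pretrained network, paired with continuous
# human judgements collected for the same images.
n_images, embed_dim = 200, 32
embeddings = rng.normal(size=(n_images, embed_dim))
true_readout = rng.normal(size=embed_dim)
human_ratings = embeddings @ true_readout + rng.normal(scale=0.1, size=n_images)

def fit_ridge(X, y, lam=1.0):
    """Fit a regularised linear readout from model representations to human
    responses - one simple way to quantify representational alignment."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Fit on the first 150 images, evaluate on the held-out 50.
w = fit_ridge(embeddings[:150], human_ratings[:150])
pred = embeddings[150:] @ w
# Alignment score: correlation between predicted and held-out human ratings.
alignment = np.corrcoef(pred, human_ratings[150:])[0, 1]
```

A model whose representations carry the features humans rely on will yield a high held-out correlation; shuffling the embeddings relative to the ratings would drive it toward zero.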
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
TRA: Better Length Generalisation with Threshold Relative Attention
Opper, Mattia, Fernandez, Roland, Smolensky, Paul, Gao, Jianfeng
We test whether these limitations can be explained by two key failures of the self-attention mechanism. The first is the inability to fully remove irrelevant information. The second is tied to position: even if the dot product between a key and a query is highly negative (i.e. an irrelevant key), learned positional biases may unintentionally up-weight such information, which becomes dangerous when distances fall out of distribution. Together, these two failure cases lead to compounding generalisation difficulties. We test whether they can be mitigated through the combination of (a) selective sparsity, which removes irrelevant keys from the attention softmax entirely, and (b) contextualised relative distance, where distance is measured only between the query and the keys that matter. We show how refactoring the attention mechanism with these two mitigations in place can substantially improve the generalisation capabilities of decoder-only transformers.
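The two mitigations can be illustrated with a toy single-query attention step. This is a sketch of the idea as stated in the abstract, not the paper's formulation: the score threshold `tau` and the linear distance penalty are our own illustrative assumptions.

```python
import numpy as np

def threshold_relative_attention(q, K, V, tau=0.0):
    """Toy single-query sketch of the two mitigations:
    (a) selective sparsity - keys scoring below a threshold are removed
        from the softmax entirely, and
    (b) contextualised relative distance - positions are re-counted over
        the surviving keys only, so distances stay in-distribution even
        when the raw sequence grows. `tau` and the 0.1 distance penalty
        are illustrative choices, not the paper's exact formulation."""
    scores = K @ q / np.sqrt(q.shape[0])
    keep = scores > tau                      # (a) drop irrelevant keys
    if not keep.any():
        return np.zeros_like(V[0])
    kept_scores = scores[keep]
    # (b) distance measured among kept keys only: the most recent kept key
    # is at distance 0, the one before it at distance 1, and so on.
    rel_dist = np.arange(keep.sum())[::-1]
    biased = kept_scores - 0.1 * rel_dist    # illustrative distance penalty
    w = np.exp(biased - biased.max())
    w /= w.sum()
    return w @ V[keep]
```

Because pruned keys never enter the softmax, a strongly negative key cannot be revived by a positional bias, which is exactly the failure case described above.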
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Middle East > Jordan (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (2 more...)
TrajAware: Graph Cross-Attention and Trajectory-Aware for Generalisable VANETs under Partial Observations
Fu, Xiaolu, Bao, Ziyuan, Kanjo, Eiman
Vehicular ad hoc networks (VANETs) are a crucial component of intelligent transportation systems; however, routing remains challenging due to dynamic topologies, incomplete observations, and the limited resources of edge devices. Existing reinforcement learning (RL) approaches often assume fixed graph structures and require retraining when network conditions change, making them unsuitable for deployment on constrained hardware. We present TrajAware, an RL-based framework designed for edge AI deployment in VANETs. TrajAware integrates three components: (i) action space pruning, which reduces redundant neighbour options while preserving two-hop reachability, alleviating the curse of dimensionality; (ii) graph cross-attention, which maps pruned neighbours to the global graph context, producing features that generalise across diverse network sizes; and (iii) trajectory-aware prediction, which uses historical routes and junction information to estimate real-time positions under partial observations. We evaluate TrajAware in the open-source SUMO simulator using real-world city maps with a leave-one-city-out setup. Results show that TrajAware achieves near-shortest paths and high delivery ratios while maintaining efficiency suitable for constrained edge devices, outperforming state-of-the-art baselines in both full and partial observation scenarios.
Communication and routing are challenging in a vehicular ad hoc network (VANET) [1], as vehicles can observe only part of the network, and the network's structure shifts rapidly; a previously obtained observation may soon become obsolete (as shown by Figure 1). Although, compared to classical software algorithms, RL routing algorithms can potentially deal with more complex objectives (e.g., optimising delay while minimising the bandwidth overhead) [2], the problems of partial observation and network dynamics put a strain on RL routing models.
Several studies have shown that graph neural networks (GNNs) generalise better on routing tasks compared to other neural networks like multilayer perceptrons (MLPs) [3]-[7].
This work will be submitted to the IEEE for possible publication. Xiaolu Fu is an AI research engineer at Unicom Data Intelligence, China Unicom, Hangzhou, China (fuxl67@chinaunicom.cn), and a former student of the Computing Department, Imperial College London, London, UK (email: andy.fu23@alumni.imperial.ac.uk). Ziyuan Bao is an independent researcher and a former MSc student of the Computing Department, Imperial College London, London, UK (email: ziyuan.bao23@alumni.imperial.ac.uk).
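The action-space pruning idea (component i) can be sketched as keeping only enough next-hop neighbours to cover everything reachable within two hops. This is a hedged illustration, not TrajAware's published algorithm: the greedy covering rule and the `adj`/`prune_actions` names are our own assumptions.

```python
def prune_actions(adj, node):
    """Illustrative sketch of action-space pruning with two-hop reachability
    preserved: greedily keep neighbours until every node reachable within
    two hops of `node` is still reachable through some kept neighbour.
    `adj` maps each node to its list of neighbours. The greedy rule here is
    an assumption for illustration, not TrajAware's exact procedure."""
    neighbours = adj[node]
    # Two-hop targets each neighbour gives access to (including itself).
    reach = {n: {n} | set(adj[n]) for n in neighbours}
    kept, covered = [], set()
    # Greedily keep any neighbour that still adds uncovered targets,
    # trying the widest-reaching neighbours first.
    for n in sorted(neighbours, key=lambda m: -len(reach[m])):
        gain = reach[n] - covered
        if gain:
            kept.append(n)
            covered |= gain
    # Every two-hop target lies in some reach[n], so `covered` ends up
    # equal to the full two-hop set: reachability is preserved.
    return kept
```

Redundant neighbours (those whose entire two-hop reach is already covered) are dropped, which shrinks the RL action space without cutting off any destination.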
- Europe > United Kingdom > England > Greater London > London (0.44)
- Asia > China > Zhejiang Province > Hangzhou (0.24)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- (6 more...)
- Telecommunications (1.00)
- Transportation > Infrastructure & Services (0.88)
- Transportation > Ground > Road (0.68)
- Leisure & Entertainment > Games > Computer Games (0.46)
Evaluating Compositional Generalisation in VLMs and Diffusion Models
Pearson, Beth, Boulbarss, Bilal, Wray, Michael, Lewis, Martha
A fundamental aspect of the semantics of natural language is that novel meanings can be formed from the composition of previously known parts. Vision-language models (VLMs) have made significant progress in recent years; however, there is evidence that they are unable to perform this kind of composition. For example, given an image of a red cube and a blue cylinder, a VLM such as CLIP is likely to incorrectly label the image as a red cylinder or a blue cube, indicating it represents the image as a 'bag-of-words' and fails to capture compositional semantics. Diffusion models have recently gained significant attention for their impressive generative abilities, and zero-shot classifiers based on diffusion models have been shown to perform competitively with CLIP in certain compositional tasks. In this work we explore whether the generative Diffusion Classifier has improved compositional generalisation abilities compared to discriminative models. We assess three models -- Diffusion Classifier, CLIP, and ViLT -- on their ability to bind objects with attributes and relations in both zero-shot learning (ZSL) and generalised zero-shot learning (GZSL) settings. Our results show that the Diffusion Classifier and ViLT perform well at concept binding tasks, but that all models struggle significantly with the relational GZSL task, underscoring the broader challenges VLMs face with relational reasoning. Analysis of CLIP embeddings suggests that the difficulty may stem from overly similar representations of relational concepts such as left and right. Code and dataset are available at: https://github.com/otmive/diffusion_classifier_clip
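The discriminative zero-shot setup being probed above can be sketched as cosine-similarity scoring of an image against candidate compositional captions. With a real VLM such as CLIP the embeddings would come from its image and text encoders; the vectors here are toy stand-ins, and `zero_shot_choice` is a hypothetical helper name.

```python
import numpy as np

def zero_shot_choice(image_emb, caption_embs):
    """Minimal sketch of discriminative zero-shot classification: score an
    image embedding against each candidate caption embedding by cosine
    similarity and pick the best-matching caption. The embeddings here are
    illustrative vectors, not outputs of an actual VLM encoder."""
    img = np.asarray(image_emb, dtype=float)
    img = img / np.linalg.norm(img)
    caps = np.asarray(caption_embs, dtype=float)
    caps = caps / np.linalg.norm(caps, axis=1, keepdims=True)
    sims = caps @ img
    return int(np.argmax(sims)), sims

# A 'bag-of-words' encoder would embed "red cube and blue cylinder" and
# "blue cube and red cylinder" almost identically, so the similarity margin
# between the correct and attribute-swapped caption collapses - the failure
# mode the compositional benchmarks above are designed to expose.
```

The benchmarks effectively measure whether that margin survives when the candidate captions differ only in how attributes or relations are bound to objects.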
- Europe > Netherlands > North Holland > Amsterdam (0.40)
- North America > United States (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (5 more...)