generalisability
TrajAware: Graph Cross-Attention and Trajectory-Aware for Generalisable VANETs under Partial Observations
Fu, Xiaolu, Bao, Ziyuan, Kanjo, Eiman
Vehicular ad hoc networks (VANETs) are a crucial component of intelligent transportation systems; however, routing remains challenging due to dynamic topologies, incomplete observations, and the limited resources of edge devices. Existing reinforcement learning (RL) approaches often assume fixed graph structures and require retraining when network conditions change, making them unsuitable for deployment on constrained hardware. We present TrajAware, an RL-based framework designed for edge AI deployment in VANETs. TrajAware integrates three components: (i) action space pruning, which reduces redundant neighbour options while preserving two-hop reachability, alleviating the curse of dimensionality; (ii) graph cross-attention, which maps pruned neighbours to the global graph context, producing features that generalise across diverse network sizes; and (iii) trajectory-aware prediction, which uses historical routes and junction information to estimate real-time positions under partial observations. We evaluate TrajAware in the open-source SUMO simulator using real-world city maps with a leave-one-city-out setup. Results show that TrajAware achieves near-shortest paths and high delivery ratios while maintaining efficiency suitable for constrained edge devices, outperforming state-of-the-art baselines in both full and partial observation scenarios.
Communication and routing are challenging in a vehicular ad hoc network (VANET) [1], as vehicles can observe only part of the network, and the network's structure shifts rapidly; a previously obtained observation may soon become obsolete (as shown by Figure 1). Although RL routing algorithms can potentially handle more complex objectives than classical software algorithms (e.g., optimising delay while minimising the bandwidth overhead) [2], the problems of partial observation and network dynamics put a strain on RL routing models. Several studies have shown that graph neural networks (GNNs) generalise better on routing tasks than other neural networks such as multilayer perceptrons (MLPs) [3]-[7].
This work will be submitted to the IEEE for possible publication. Xiaolu Fu is an AI research engineer at Unicom Data Intelligence, China Unicom, Hangzhou, China (fuxl67@chinaunicom.cn), and a former student of the Computing Department, Imperial College London, London, UK (email: andy.fu23@alumni.imperial.ac.uk). Ziyuan Bao is an independent researcher and a former MSc student of the Computing Department, Imperial College London, London, UK (email: ziyuan.bao23@alumni.imperial.ac.uk).
- Europe > United Kingdom > England > Greater London > London (0.44)
- Asia > China > Zhejiang Province > Hangzhou (0.24)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Telecommunications (1.00)
- Transportation > Infrastructure & Services (0.88)
- Transportation > Ground > Road (0.68)
- Leisure & Entertainment > Games > Computer Games (0.46)
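The leave-one-city-out setup mentioned in the TrajAware abstract can be illustrated with a minimal sketch. It assumes each sample is tagged with the city it came from; the function name `leave_one_group_out` and the city labels are hypothetical and are not taken from the paper.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_indices, test_indices) once per distinct group.

    `groups` holds one label per sample (here, a hypothetical city name).
    Every sample from the held-out group goes to the test split, and all
    remaining samples go to the training split.
    """
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Hypothetical example: five samples drawn from three cities.
cities = ["city_a", "city_a", "city_b", "city_c", "city_c"]
splits = list(leave_one_group_out(cities))
for held_out, train, test in splits:
    print(held_out, train, test)
```

Each split trains on every city except one and tests on the held-out city, so the reported score reflects behaviour on a map that was not seen during training.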
Speech-Based Depressive Mood Detection in the Presence of Multiple Sclerosis: A Cross-Corpus and Cross-Lingual Study
Gonzalez-Machorro, Monica, Reichel, Uwe, Hecker, Pascal, Hammer, Helly, Sagha, Hesam, Eyben, Florian, Hoepner, Robert, Schuller, Björn W.
Depression commonly co-occurs with neurodegenerative disorders like Multiple Sclerosis (MS), yet the potential of speech-based Artificial Intelligence for detecting depression in such contexts remains unexplored. This study examines the transferability of speech-based depression detection methods to people with MS (pwMS) through cross-corpus and cross-lingual analysis using English data from the general population and German data from pwMS. Our approach implements supervised machine learning models using: 1) conventional speech and language features commonly used in the field, 2) emotional dimensions derived from a Speech Emotion Recognition (SER) model, and 3) exploratory speech feature analysis. Despite limited data, our models detect depressive mood in pwMS with moderate generalisability, achieving a 66% Unweighted Average Recall (UAR) on a binary task. Feature selection further improved performance, boosting UAR to 74%. Our findings also highlight the relevant role emotional changes have as an indicator of depressive mood in both the general population and pwMS. This study provides an initial exploration into generalising speech-based depression detection, even in the presence of co-occurring conditions, such as neurodegenerative diseases.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
- Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
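The Unweighted Average Recall (UAR) reported in the abstract above is the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. The following is a minimal sketch of that standard definition; the labels and predictions are made-up illustration data, not results from the study.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    totals = defaultdict(int)  # number of samples per true class
    hits = defaultdict(int)    # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Made-up imbalanced binary task: six samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(uar, accuracy)  # 0.75 0.875
```

In this example, accuracy is 0.875 because the majority class dominates, while UAR is 0.75 because the minority class recall of 0.5 carries the same weight as the majority class recall of 1.0. This is why UAR is often preferred on imbalanced tasks.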
+VeriRel: Verification Feedback to Enhance Document Retrieval for Scientific Fact Checking
Deng, Xingyu, Wang, Xi, Stevenson, Mark
Identification of appropriate supporting evidence is critical to the success of scientific fact checking. However, existing approaches rely on off-the-shelf Information Retrieval algorithms that rank documents based on relevance rather than the evidence they provide to support or refute the claim being checked. This paper proposes +VeriRel which includes verification success in the document ranking. Experimental results on three scientific fact checking datasets (SciFact, SciFact-Open and Check-Covid) demonstrate consistently leading performance by +VeriRel for document evidence retrieval and a positive impact on downstream verification. This study highlights the potential of integrating verification feedback to document relevance assessment for effective scientific fact checking systems. It shows promising future work to evaluate fine-grained relevance when examining complex documents for advanced scientific fact checking.
- Europe > United Kingdom > England > South Yorkshire > Sheffield (0.41)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > South Korea > Seoul > Seoul (0.05)
On the Validity of Head Motion Patterns as Generalisable Depression Biomarkers
Gahalawat, Monika, Bilalpur, Maneesh, Rojas, Raul Fernandez, Cohn, Jeffrey F., Goecke, Roland, Subramanian, Ramanathan
Depression is a debilitating mood disorder negatively impacting millions worldwide. While researchers have explored multiple verbal and non-verbal behavioural cues for automated depression assessment, head motion has received little attention thus far. Further, the common practice of validating machine learning models via a single dataset can limit model generalisability. This work examines the effectiveness and generalisability of models utilising elementary head motion units, termed kinemes, for depression severity estimation. Specifically, we consider three depression datasets from different western cultures (German: AVEC2013, Australian: Blackdog and American: Pitt datasets) with varied contextual and recording settings to investigate the generalisability of the derived kineme patterns via two methods: (i) k-fold cross-validation over individual/multiple datasets, and (ii) model reuse on other datasets. Evaluating classification and regression performance with classical machine learning methods, our results show that: (1) head motion patterns are efficient biomarkers for estimating depression severity, achieving highly competitive performance for both classification and regression tasks on a variety of datasets, including achieving the second best Mean Absolute Error (MAE) on the AVEC2013 dataset, and (2) kineme-based features are more generalisable than (a) raw head motion descriptors for binary severity classification, and (b) other visual behavioural cues for severity estimation (regression).
- North America > United States (0.14)
- Oceania > New Zealand (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation
Segal, Bradley, Fieggen, Joshua, Clifton, David, Clifton, Lei
Ensuring the generalisability of clinical machine learning (ML) models across diverse healthcare settings remains a significant challenge due to variability in patient demographics, disease prevalence, and institutional practices. Existing model evaluation approaches often rely on real-world datasets, which are limited in availability, embed confounding biases, and lack the flexibility needed for systematic experimentation. Furthermore, while generative models aim for statistical realism, they often lack transparency and explicit control over factors driving distributional shifts. In this work, we propose a novel structured synthetic data framework designed for the controlled benchmarking of model robustness, fairness, and generalisability. Unlike approaches focused solely on mimicking observed data, our framework provides explicit control over the data generating process, including site-specific prevalence variations, hierarchical subgroup effects, and structured feature interactions. This enables targeted investigation into how models respond to specific distributional shifts and potential biases. Through controlled experiments, we demonstrate the framework's ability to isolate the impact of site variations, support fairness-aware audits, and reveal generalisation failures, particularly highlighting how model complexity interacts with site-specific effects. This work contributes a reproducible, interpretable, and configurable tool designed to advance the reliable deployment of ML in clinical settings.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > China > Hong Kong (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
An exploration of features to improve the generalisability of fake news detection models
Hoy, Nathaniel, Koulouri, Theodora
Fake news poses global risks by influencing elections and spreading misinformation, making detection critical. Existing NLP and supervised Machine Learning methods perform well under cross-validation but struggle to generalise across datasets, even within the same domain. This issue stems from coarsely labelled training data, where articles are labelled based on their publisher, introducing biases that token-based models like TF-IDF and BERT are sensitive to. While Large Language Models (LLMs) offer promise, their application in fake news detection remains limited. This study demonstrates that meaningful features can still be extracted from coarsely labelled data to improve real-world robustness. Stylistic features (lexical, syntactic, and semantic) are explored due to their reduced sensitivity to dataset biases. Additionally, novel social-monetisation features are introduced, capturing economic incentives behind fake news, such as advertisements, external links, and social media elements. The study trains on the coarsely labelled NELA 2020-21 dataset and evaluates using the manually labelled Facebook URLs dataset, a gold standard for generalisability. Results highlight the limitations of token-based models trained on biased data and contribute to the scarce evidence on LLMs like LLaMa in this field. Findings indicate that stylistic and social-monetisation features offer more generalisable predictions than token-based methods and LLMs. Statistical and permutation feature importance analyses further reveal their potential to enhance performance and mitigate dataset biases, providing a path forward for improving fake news detection.
- North America > United States > Missouri (0.04)
- Europe > France > Nouvelle-Aquitaine > Gironde > Bordeaux (0.04)
- Asia > Philippines (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
Structure based SAT dataset for analysing GNN generalisation
Fu, Yi, Tompkins, Anthony, Song, Yang, Pagnucco, Maurice
Satisfiability (SAT) solvers based on techniques such as conflict-driven clause learning (CDCL) have produced excellent performance on both synthetic and real-world industrial problems. While these CDCL solvers only operate on a per-problem basis, graph neural network (GNN) based solvers bring new benefits to the field by allowing practitioners to exploit knowledge gained from solved problems to expedite solving of new SAT problems. However, one specific area that is often studied in the context of CDCL solvers, but largely overlooked in GNN solvers, is the relationship between graph-theoretic measures of structure in SAT problems and the generalisation ability of GNN solvers. To bridge the gap between structural graph properties (e.g., modularity, self-similarity) and the generalisability (or lack thereof) of GNN-based SAT solvers, we present StructureSAT: a curated dataset, along with code to further generate novel examples, containing a diverse set of SAT problems from well-known problem domains. Furthermore, we utilise a novel splitting method that focuses on deconstructing the families into more detailed hierarchies based on their structural properties. With the new dataset, we aim to help explain problematic generalisation in existing GNN SAT solvers by exploiting knowledge of structural graph properties. We conclude with multiple future directions that can help researchers in GNN-based SAT solving develop more effective and generalisable SAT solvers.
- Oceania > Australia > New South Wales (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- Asia (0.04)
Advancing oncology with federated learning: transcending boundaries in breast, lung, and prostate cancer. A systematic review
Ankolekar, Anshu, Boie, Sebastian, Abdollahyan, Maryam, Gadaleta, Emanuela, Hasheminasab, Seyed Alireza, Yang, Guang, Beauville, Charles, Dikaios, Nikolaos, Kastis, George Anthony, Bussmann, Michael, Khalid, Sara, Kruger, Hagen, Lambin, Philippe, Papanastasiou, Giorgos
Federated Learning (FL) has emerged as a promising solution to address the limitations of centralised machine learning (ML) in oncology, particularly in overcoming privacy concerns and harnessing the power of diverse, multi-center data. This systematic review synthesises current knowledge on the state-of-the-art FL in oncology, focusing on breast, lung, and prostate cancer. Distinct from previous surveys, our comprehensive review critically evaluates the real-world implementation and impact of FL on cancer care, demonstrating its effectiveness in enhancing ML generalisability, performance and data privacy in clinical settings and data. We evaluated state-of-the-art advances in FL, demonstrating its growing adoption amid tightening data privacy regulations. FL outperformed centralised ML in 15 out of the 25 studies reviewed, spanning diverse ML models and clinical applications, and facilitating integration of multi-modal information for precision medicine. Despite the current challenges identified in reproducibility, standardisation and methodology across studies, the demonstrable benefits of FL in harnessing real-world data and addressing clinical needs highlight its significant potential for advancing cancer research. We propose that future research should focus on addressing these limitations and investigating further advanced FL methods, to fully harness data diversity and realise the transformative power of cutting-edge FL in cancer care.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Greater London > London (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Prostate Cancer (0.62)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
NLP Verification: Towards a General Methodology for Certifying Robustness
Casadio, Marco, Dinkar, Tanvi, Komendantskaya, Ekaterina, Arnaboldi, Luca, Daggitt, Matthew L., Isac, Omri, Katz, Guy, Rieser, Verena, Lemon, Oliver
Deep neural networks have exhibited substantial success in the field of Natural Language Processing, and ensuring their safety and reliability is crucial: there are safety-critical contexts where such models must be robust to variability or attack, and give guarantees over their output. Unlike Computer Vision, NLP lacks a unified verification methodology and, despite recent advancements in the literature, existing works are often light on the pragmatic issues of NLP verification. In this paper, we attempt to distil and evaluate general components of an NLP verification pipeline that emerges from the progress in the field to date. Our contributions are two-fold. Firstly, we give a general (i.e. algorithm-independent) characterisation of verifiable subspaces that result from embedding sentences into continuous spaces. We identify, and give an effective method to deal with, the technical challenge of semantic generalisability of verified subspaces; and propose it as a standard metric in the NLP verification pipelines (alongside the standard metrics of model accuracy and model verifiability). Secondly, we propose a general methodology to analyse the effect of the embedding gap, a problem that refers to the discrepancy between verification of geometric subspaces and the semantic meaning of sentences which the geometric subspaces are supposed to represent. In extreme cases, poor choices in embedding of sentences may invalidate verification results. We propose a number of practical NLP methods that can help to quantify the effects of the embedding gap; and in particular we propose the metric of falsifiability of semantic subspaces as another fundamental metric to be reported as part of the NLP verification pipeline. We believe that together these general principles pave the way towards a more consolidated and effective development of this new domain.
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- South America > Colombia > Caldas Department > Manizales (0.04)
- Oceania > Australia > Western Australia > Perth (0.04)
- Law (1.00)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.93)
- Government > Regional Government > North America Government > United States Government (0.67)
Unsupervised Learning Approaches for Identifying ICU Patient Subgroups: Do Results Generalise?
Mayne, Harry, Parsons, Guy, Mahdi, Adam
The use of unsupervised learning to identify patient subgroups has emerged as a potentially promising direction to improve the efficiency of Intensive Care Units (ICUs). By identifying subgroups of patients with similar levels of medical resource need, ICUs could be restructured into a collection of smaller subunits, each catering to a specific group. However, it is unclear whether common patient subgroups exist across different ICUs, which would determine whether ICU restructuring could be operationalised in a standardised manner. In this paper, we tested the hypothesis that common ICU patient subgroups exist by examining whether the results from one existing study generalise to a different dataset. We extracted 16 features representing medical resource need and used consensus clustering to derive patient subgroups, replicating the previous study. We found limited similarities between our results and those of the previous study, providing evidence against the hypothesis. Our findings imply that there is significant variation between ICUs; thus, a standardised restructuring approach is unlikely to be appropriate. Instead, potential efficiency gains might be greater when the number and nature of the subunits are tailored to each ICU individually.
- North America > United States > California (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Middle East > Israel (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)