Santander
Graph neural networks for residential location choice: connection to classical logit models
Cheng, Zhanhong, Hu, Lingqian, Bu, Yuheng, Zhou, Yuqi, Wang, Shenhao
Researchers have adopted deep learning for classical discrete choice analysis because it can capture complex feature relationships and achieve higher predictive performance. However, existing deep learning approaches cannot explicitly capture the relationships among choice alternatives, which have been a long-standing focus of classical discrete choice models. To address this gap, this paper introduces Graph Neural Networks (GNNs) as a novel framework for analyzing residential location choice. The GNN-based discrete choice models (GNN-DCMs) offer a structured approach for neural networks to capture dependence among spatial alternatives while maintaining clear connections to classical random utility theory. Theoretically, we demonstrate that the GNN-DCMs incorporate the nested logit (NL) model and the spatially correlated logit (SCL) model as two specific cases, yielding a novel algorithmic interpretation through message passing among alternatives' utilities. Empirically, the GNN-DCMs outperform benchmark MNL, SCL, and feedforward neural networks in predicting residential location choices among Chicago's 77 community areas. Regarding model interpretation, the GNN-DCMs can capture individual heterogeneity and exhibit spatially aware substitution patterns. Overall, these results highlight the potential of GNN-DCMs as a unified and expressive framework for synergizing discrete choice modeling and deep learning in complex spatial choice contexts.
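The idea of message passing among alternatives' utilities can be illustrated with a minimal sketch. It is not the paper's architecture: the mean-aggregation rule, the `weight` mixing parameter, the three-alternative adjacency, and the utility values are all assumptions made for illustration. With `weight=0` the sketch reduces to plain multinomial logit (softmax) probabilities.

```python
import math

def softmax(utilities):
    """MNL choice probabilities from a list of utilities (numerically stabilized)."""
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def message_pass(utilities, neighbors, weight=0.5):
    """One illustrative round of message passing: each alternative's utility
    is mixed with the mean utility of its neighboring alternatives."""
    updated = []
    for i, u in enumerate(utilities):
        nbrs = neighbors.get(i, [])
        if nbrs:
            nbr_mean = sum(utilities[j] for j in nbrs) / len(nbrs)
            updated.append((1 - weight) * u + weight * nbr_mean)
        else:
            updated.append(u)
    return updated

# Hypothetical example: three areas on a line, 0 - 1 - 2.
utilities = [1.0, 0.5, -0.2]
neighbors = {0: [1], 1: [0, 2], 2: [1]}

mnl_probs = softmax(utilities)                            # no dependence among alternatives
gnn_probs = softmax(message_pass(utilities, neighbors))   # utilities smoothed over the graph
```

Because neighboring alternatives share information before the softmax, a change in one area's utility also shifts the probabilities of adjacent areas, which is the kind of spatially aware substitution the abstract describes.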
Residual Feature Integration is Sufficient to Prevent Negative Transfer
Xu, Yichen, Nakada, Ryumei, Zhang, Linjun, Li, Lexin
Transfer learning typically leverages representations learned from a source domain to improve performance on a target task. A common approach is to extract features from a pre-trained model and directly apply them for target prediction. However, this strategy is prone to negative transfer, where the source representation fails to align with the target distribution. In this article, we propose Residual Feature Integration (REFINE), a simple yet effective method designed to mitigate negative transfer. Our approach combines a fixed source-side representation with a trainable target-side encoder and fits a shallow neural network on the resulting joint representation, which adapts to the target domain while preserving transferable knowledge from the source domain. Theoretically, we prove that REFINE is sufficient to prevent negative transfer under mild conditions, and derive a generalization bound demonstrating its theoretical benefit. Empirically, we show that REFINE consistently enhances performance across diverse applications and data modalities, including vision, text, and tabular data, and outperforms numerous alternative solutions. Our method is lightweight, architecture-agnostic, and robust, making it a valuable addition to the existing transfer learning toolbox.
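The joint representation described above can be sketched as a forward pass. This is a rough illustration under stated assumptions, not the authors' implementation: `frozen_source_features` is a hypothetical stand-in for a pre-trained extractor, `LinearLayer` is an illustrative helper, the shallow network is simplified to a single linear head, and training is omitted.

```python
import math
import random

def frozen_source_features(x):
    """Hypothetical stand-in for a pre-trained source extractor.
    It has no trainable parameters, so it stays fixed during target training."""
    return [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1])]

class LinearLayer:
    """Linear map with randomly initialized weights; used here for both
    the target-side encoder and the prediction head (both would be trained)."""
    def __init__(self, in_dim, out_dim, rng):
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(out_dim)]

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]

def joint_representation(x, target_encoder):
    # Concatenate the fixed source features with the trainable target features.
    return frozen_source_features(x) + target_encoder(x)

rng = random.Random(0)
target_encoder = LinearLayer(in_dim=2, out_dim=3, rng=rng)
head = LinearLayer(in_dim=2 + 3, out_dim=1, rng=rng)  # fitted on the joint representation

x = [0.4, -1.2]                       # one hypothetical target-domain input
z = joint_representation(x, target_encoder)
prediction = head(z)[0]
```

The design point is that the head sees both feature sets: if the source features are unhelpful for the target task, the trainable target-side features are still available to the model.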
Metric Privacy in Federated Learning for Medical Imaging: Improving Convergence and Preventing Client Inference Attacks
Díaz, Judith Sáinz-Pardo, Athanasiou, Andreas, Jung, Kangsoo, Palamidessi, Catuscia, García, Álvaro López
Federated learning is a distributed learning technique that allows training a global model with the participation of different data owners without the need to share raw data. This architecture is orchestrated by a central server that aggregates the local models from the clients. The server may be trusted even when not all nodes in the network are. In that case, differential privacy (DP) can be used to privatize the global model by adding noise. However, this may affect convergence across the rounds of the federated architecture, depending also on the aggregation strategy employed. In this work, we introduce the notion of metric privacy to mitigate the impact of classical server-side global DP on the convergence of the aggregated model. Metric privacy is a relaxation of DP, suitable for domains provided with a notion of distance. We apply it on the server side by computing a distance for the difference between the local models. We compare our approach with standard DP by analyzing the impact on six classical aggregation strategies. The proposed methodology is applied to an example of medical imaging, and different scenarios are simulated across homogeneous and non-i.i.d. clients. Finally, we introduce a novel client inference attack, in which a semi-honest client tries to determine whether another client participated in the training, and study how it can be mitigated using DP and metric privacy. Our evaluation shows that metric privacy can increase the performance of the model compared to standard DP, while offering similar protection against client inference attacks.
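A minimal sketch of server-side aggregation with added noise may help fix ideas. It is not the authors' mechanism and not a calibrated privacy guarantee: the weighted-average aggregation, the Gaussian noise, the `scale` parameter, and the use of the maximum pairwise Euclidean distance are assumptions for illustration. How the noise scale would be derived from a distance and a privacy parameter is not stated in the abstract and is left open here.

```python
import math
import random

def fed_avg(client_weights, client_sizes):
    """Average client models (flat weight lists), weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
            for k in range(dim)]

def max_pairwise_distance(client_weights):
    """Largest Euclidean distance between any two local models
    (one possible notion of distance between local models)."""
    best = 0.0
    for i in range(len(client_weights)):
        for j in range(i + 1, len(client_weights)):
            best = max(best, math.dist(client_weights[i], client_weights[j]))
    return best

def noisy_aggregate(client_weights, client_sizes, scale, rng):
    """Aggregate, then perturb each coordinate with Gaussian noise of the given scale."""
    avg = fed_avg(client_weights, client_sizes)
    return [w + rng.gauss(0.0, scale) for w in avg]

# Hypothetical round with two clients holding 1 and 3 samples.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
rng = random.Random(0)

global_model = fed_avg(clients, sizes)
spread = max_pairwise_distance(clients)
noisy_model = noisy_aggregate(clients, sizes, scale=0.1, rng=rng)
```

The intuition behind a distance-aware mechanism is that when local models are close together, less perturbation may be needed than a worst-case bound would suggest, which is why the convergence impact can be smaller than with standard DP.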
Enhancing the Convergence of Federated Learning Aggregation Strategies with Limited Data
Díaz, Judith Sáinz-Pardo, García, Álvaro López
Deep learning techniques are increasingly applied to medical data, particularly for image-based diagnosis. This type of data is subject to privacy and legal restrictions that in many cases prevent it from being processed on central servers. However, collaboration between research centers is critical in this area, in order to create models that are as robust as possible and trained on the largest quantity and diversity of data available. This motivates the use of privacy-aware distributed architectures such as federated learning. In this type of architecture, the server aggregates the local models, each trained on the data of one data owner, to build a global model. This step is critical, so it is fundamental to analyze different aggregation methods according to the use case, taking into account the distribution of the clients, the characteristics of the model, etc. In this paper we propose a novel aggregation strategy and apply it to a use case of cerebral magnetic resonance image classification. In this use case, the proposed aggregation function improves the convergence obtained over the rounds of the federated learning process relative to several classical aggregation strategies.
Are Deep Learning Methods Suitable for Downscaling Global Climate Projections? Review and Intercomparison of Existing Models
González-Abad, Jose, Gutiérrez, José Manuel
Deep Learning (DL) has shown promise for downscaling global climate change projections under different approaches, including Perfect Prognosis (PP) and Regional Climate Model (RCM) emulation. Unlike emulators, PP downscaling models are trained on observational data, so it remains an open question whether they can plausibly extrapolate to unseen conditions and changes in future emissions scenarios. Here we focus on this problem as the main drawback for the operationalization of these methods and present the results of 1) a literature review to identify state-of-the-art DL models for PP downscaling and 2) an intercomparison experiment to evaluate the performance of these models and to assess their extrapolation capability using a common experimental framework, taking into account the sensitivity of results to different training replicas. We focus on minimum and maximum temperatures and precipitation over Spain, a region with a range of climatic conditions and different influential regional processes. We conclude with a discussion of the findings, limitations of existing methods, and prospects for future development.
A Review of Deep Learning Approaches for Non-Invasive Cognitive Impairment Detection
Alsuhaibani, Muath, Fard, Ali Pourramezan, Sun, Jian, Poor, Farida Far, Pressman, Peter S., Mahoor, Mohammad H.
This review paper explores recent advances in deep learning approaches for non-invasive cognitive impairment detection. We examine various non-invasive indicators of cognitive decline, including speech and language, facial, and motoric mobility. The paper provides an overview of relevant datasets, feature-extracting techniques, and deep-learning architectures applied to this domain. We have analyzed the performance of different methods across modalities and observed that speech and language-based methods generally achieved the highest detection performance. Studies combining acoustic and linguistic features tended to outperform those using a single modality. Facial analysis methods showed promise for visual modalities but were less extensively studied. Most papers focused on binary classification (impaired vs. non-impaired), with fewer addressing multi-class or regression tasks. Transfer learning and pre-trained language models emerged as popular and effective techniques, especially for linguistic analysis. Despite significant progress, several challenges remain, including data standardization and accessibility, model explainability, longitudinal analysis limitations, and clinical adaptation. Lastly, we propose future research directions, such as investigating language-agnostic speech analysis methods, developing multi-modal diagnostic systems, and addressing ethical considerations in AI-assisted healthcare. By synthesizing current trends and identifying key obstacles, this review aims to guide further development of deep learning-based cognitive impairment detection systems to improve early diagnosis and ultimately patient outcomes.
Transformer based super-resolution downscaling for regional reanalysis: Full domain vs tiling approaches
Pérez, Antonio, Cruz, Mario Santa, Martín, Daniel San, Gutiérrez, José Manuel
Reanalysis datasets constitute the main source of spatially homogeneous information for climate analysis since they provide long records (spanning several decades) of physically consistent hourly/daily gridded data for many variables, produced globally with a particular atmospheric general circulation model (AGCM) assimilating the available observations (see https://reanalyses.org for an overview of the current reanalyses). Besides the historical records, in some cases reanalyses provide near real-time information that allows monitoring the state of the climate. For instance, ERA5 [Hersbach et al., 2020] is the latest ECMWF climate reanalysis, providing hourly data on many atmospheric and land-surface parameters at 0.25° resolution, from 1940 to near real-time. However, much of this data is generated at coarse spatial resolutions, typically on the order of tens of kilometres, hampering its application to local and regional climate analysis, including extreme weather events, which often occur on smaller spatial scales. Enhancing the spatial resolution of reanalysis datasets is therefore critical for improving their utility for local-scale climate analysis and decision-making. A number of downscaling methods have been developed over the last decades for improving the spatial resolution of AGCM outputs, based on two main approaches [Maraun and Widmann, 2017]: dynamical and statistical downscaling. Dynamical downscaling employs regional atmospheric models (Limited Area Models, LAMs) over limited areas of interest, driven at the boundaries by the AGCM outputs, to refine their coarse resolution. This approach resolves regional/local processes and provides physically consistent results, but is limited by its high computational demands. It has recently been applied to generate regional reanalyses over continent-wide areas, such as the CERRA reanalysis over Europe, which uses the HARMONIE-ALADIN regional model (driven by ERA5) at a 5.5 km resolution.