Rögnvaldsson, Thorsteinn
Leveraging ChatGPT's Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research
Sarmadi, Hamid, Hall, Ola, Rögnvaldsson, Thorsteinn, Ohlsson, Mattias
This paper investigates the novel application of Large Language Models (LLMs) with vision capabilities to analyze satellite imagery for village-level poverty prediction. Although LLMs were originally designed for natural language understanding, their adaptability to multimodal tasks, including geospatial analysis, has opened new frontiers in data-driven research. By leveraging advancements in vision-enabled LLMs, we assess their ability to provide interpretable, scalable, and reliable insights into human poverty from satellite images. Using a pairwise comparison approach, we demonstrate that ChatGPT can rank satellite images based on poverty levels with accuracy comparable to domain experts. These findings highlight both the promise and the limitations of LLMs in socioeconomic research, providing a foundation for their integration into poverty assessment workflows. This study contributes to the ongoing exploration of unconventional data sources for welfare analysis and opens pathways for cost-effective, large-scale poverty monitoring.
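The pairwise comparison approach mentioned above can be illustrated with a minimal sketch. The abstract does not say how pairwise judgments are aggregated into a ranking, so the win-counting rule here is an assumption, and `toy_judge` and `wealth_index` are hypothetical stand-ins for a vision-enabled model's answer to "which of these two images shows the poorer village?".

```python
from itertools import combinations

def rank_by_pairwise_wins(items, poorer_of):
    """Rank items from poorest to least poor by counting pairwise 'poorer' votes.

    poorer_of(a, b) must return whichever of a or b is judged poorer.
    Win counting is an assumed aggregation rule, not the paper's stated method.
    """
    wins = {item: 0 for item in items}
    for a, b in combinations(items, 2):
        wins[poorer_of(a, b)] += 1
    # Most 'poorer' votes first; sorted() is stable, so ties keep input order.
    return sorted(items, key=lambda item: wins[item], reverse=True)

# Hypothetical ground truth used only to simulate a judge in this sketch.
wealth_index = {"village_a": 0.2, "village_b": 0.9, "village_c": 0.5}

def toy_judge(a, b):
    """Stand-in for a model call: picks the village with the lower wealth index."""
    return a if wealth_index[a] < wealth_index[b] else b

ranking = rank_by_pairwise_wins(list(wealth_index), toy_judge)
print(ranking)
```

With n images this needs n(n-1)/2 comparisons, so a real workflow would likely sample pairs rather than compare all of them.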
Personalized Federated Learning with Contextual Modulation and Meta-Learning
Vettoruzzo, Anna, Bouguelia, Mohamed-Rafik, Rögnvaldsson, Thorsteinn
Federated learning has emerged as a promising approach for training machine learning models on decentralized data sources while preserving data privacy. However, challenges such as communication bottlenecks, heterogeneity of client devices, and non-i.i.d. data distribution pose significant obstacles to achieving optimal model performance. We propose a novel framework that combines federated learning with meta-learning techniques to enhance both efficiency and generalization capabilities. Our approach introduces a federated modulator that learns contextual information from data batches and uses this knowledge to generate modulation parameters. These parameters dynamically adjust the activations of a base model, which operates using a MAML-based approach for model personalization. Experimental results across diverse datasets show improvements in convergence speed and model performance over existing federated learning approaches. These findings highlight the potential of incorporating contextual information and meta-learning techniques into federated learning, paving the way for advancements in distributed machine learning paradigms.
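The idea of a modulator that turns batch context into parameters adjusting a base model's activations might look roughly like the sketch below. The abstract does not specify the form of the modulation, so the feature-wise scale-and-shift used here, the batch-mean context, and all names and sizes are assumptions; the federated aggregation and the MAML-based personalization loop are omitted entirely.

```python
import random

random.seed(0)

def random_matrix(n_rows, n_cols, scale=1.0):
    """Gaussian random matrix stored as a list of rows."""
    return [[random.gauss(0.0, scale) for _ in range(n_cols)] for _ in range(n_rows)]

def vec_mat(vec, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]

def modulator(batch, w_ctx, n_hidden):
    """Summarize a batch by its feature mean and map it to scale/shift parameters."""
    n = len(batch)
    context = [sum(x[i] for x in batch) / n for i in range(len(batch[0]))]
    params = vec_mat(context, w_ctx)               # length 2 * n_hidden
    gamma = [1.0 + p for p in params[:n_hidden]]   # scales centered at 1 (assumed)
    beta = params[n_hidden:]                       # shifts
    return gamma, beta

def base_forward(x, w1, w2, gamma, beta):
    """One hidden layer whose ReLU activations are scaled and shifted."""
    hidden = [max(h, 0.0) for h in vec_mat(x, w1)]
    hidden = [g * h + b for g, h, b in zip(gamma, hidden, beta)]
    return vec_mat(hidden, w2)

n_features, n_hidden, n_out = 4, 8, 3              # illustrative sizes
w_ctx = random_matrix(n_features, 2 * n_hidden, scale=0.1)
w1 = random_matrix(n_features, n_hidden)
w2 = random_matrix(n_hidden, n_out)
batch = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(16)]

gamma, beta = modulator(batch, w_ctx, n_hidden)
outputs = [base_forward(x, w1, w2, gamma, beta) for x in batch]
```

Setting every scale to 1 and every shift to 0 recovers the unmodulated base model, which makes the modulation easy to ablate.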
Advances and Challenges in Meta-Learning: A Technical Review
Vettoruzzo, Anna, Bouguelia, Mohamed-Rafik, Vanschoren, Joaquin, Rögnvaldsson, Thorsteinn, Santosh, KC
Meta-learning empowers learning systems with the ability to acquire knowledge from multiple tasks, enabling faster adaptation and generalization to new tasks. This review provides a comprehensive technical overview of meta-learning, emphasizing its importance in real-world applications where data may be scarce or expensive to obtain. The paper covers the state-of-the-art meta-learning approaches and explores the relationship between meta-learning and multi-task learning, transfer learning, domain adaptation and generalization, self-supervised learning, personalized federated learning, and continual learning. By highlighting the synergies between these topics and the field of meta-learning, the paper demonstrates how advancements in one area can benefit the field as a whole, while avoiding unnecessary duplication of efforts. Additionally, the paper delves into advanced meta-learning topics such as learning from complex multi-modal task distributions, unsupervised meta-learning, learning to efficiently adapt to data distribution shifts, and continual meta-learning. Lastly, the paper highlights open problems and challenges for future research in the field. By synthesizing the latest research developments, this paper provides a thorough understanding of meta-learning and its potential impact on various machine learning applications. We believe that this technical overview will contribute to the advancement of meta-learning and its practical implications in addressing real-world problems.
The Concordance Index decomposition: a measure for a deeper understanding of survival prediction models
Alabdallah, Abdallah, Ohlsson, Mattias, Pashami, Sepideh, Rögnvaldsson, Thorsteinn
The Concordance Index (C-index) is a commonly used metric in Survival Analysis to evaluate how good a prediction model is. This paper proposes a decomposition of the C-Index into a weighted harmonic mean of two quantities: one for ranking observed events versus other observed events, and the other for ranking observed events versus censored cases. This decomposition allows a more fine-grained analysis of the pros and cons of survival prediction methods. The utility of the decomposition is demonstrated using three benchmark survival analysis models (Cox Proportional Hazard, Random Survival Forest, and Deep Adversarial Time-to-Event Network) together with a new variational generative neural-network-based method (SurVED), which is also proposed in this paper. The demonstration is done on four publicly available datasets with varying censoring levels. The analysis with the C-index decomposition shows that all methods essentially perform equally well when the censoring level is high because of the dominance of the term measuring the ranking of events versus censored cases. In contrast, some methods deteriorate when the censoring level decreases because they do not rank the events versus other events well.
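A small numerical sketch can make the decomposition concrete. The abstract states only that the C-index is a weighted harmonic mean of an event-versus-event term and an event-versus-censored term; the particular weight used below (the share of concordant pairs that are event-versus-event) is an assumption, chosen because it makes the identity hold exactly by algebra. Ties in time are skipped for simplicity, and the data are made up.

```python
def concordance_counts(times, events, risks):
    """Count concordant and comparable pairs, split by pair type.

    A pair (i, j) is comparable when i has an observed event and times[i] < times[j].
    'ee' = j also has an observed event; 'ec' = j is censored.
    Higher risk should mean earlier event; tied risks count as half concordant.
    """
    counts = {"ee": [0.0, 0], "ec": [0.0, 0]}  # [concordant, comparable]
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                kind = "ee" if events[j] else "ec"
                counts[kind][1] += 1
                if risks[i] > risks[j]:
                    counts[kind][0] += 1.0
                elif risks[i] == risks[j]:
                    counts[kind][0] += 0.5
    return counts

# Made-up toy data: event = 1 means observed, 0 means censored.
times = [2, 4, 5, 7, 9, 10]
events = [1, 0, 1, 1, 0, 1]
risks = [0.9, 0.4, 0.7, 0.8, 0.3, 0.1]

counts = concordance_counts(times, events, risks)
c_ee, n_ee = counts["ee"]
c_ec, n_ec = counts["ec"]

ci = (c_ee + c_ec) / (n_ee + n_ec)   # overall C-index
ci_ee = c_ee / n_ee                  # events vs. other events
ci_ec = c_ec / n_ec                  # events vs. censored cases

# Assumed weight: fraction of concordant pairs that are event-vs-event.
alpha = c_ee / (c_ee + c_ec)
ci_harmonic = 1.0 / (alpha / ci_ee + (1.0 - alpha) / ci_ec)

print(ci, ci_ee, ci_ec, ci_harmonic)
```

When most cases are censored, event-versus-censored pairs dominate the count, which is consistent with the abstract's observation that this term dominates at high censoring levels.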
Transfer learning for Remaining Useful Life Prediction Based on Consensus Self-Organizing Models
Fan, Yuantao, Nowaczyk, Sławomir, Rögnvaldsson, Thorsteinn
The traditional paradigm for developing machine prognostics usually relies on generalization from data acquired in experiments under controlled conditions prior to deployment of the equipment. Detecting or predicting failures and estimating machine health in this way assumes that future field data will have a very similar distribution to the experimental data. However, many complex machines operate under dynamic environmental conditions and are used in many different ways. This makes collecting comprehensive data very challenging, and the assumption that pre-deployment data and post-deployment data follow very similar distributions is unlikely to hold. Transfer Learning (TL) refers to methods for transferring knowledge learned in one setting (the source domain) to another setting (the target domain). In this work, we present a TL method for predicting Remaining Useful Life (RUL) of equipment, under the assumption that labels are available only for the source domain and not the target domain. This setting corresponds to generalizing from a limited number of run-to-failure experiments performed prior to deployment to prognostics on data from deployed equipment that is used under multiple new operating conditions and experiences previously unseen faults. We employ a deviation detection method, Consensus Self-Organizing Models (COSMO), to create transferable features for building the RUL regression model. These features capture how much a piece of target equipment differs from its peers. The efficiency of the proposed TL method is demonstrated using the NASA Turbofan Engine Degradation Simulation Data Set. Models using the COSMO transferable features show better performance than other methods on predicting RUL when the target domain is more complex than the source domain.
Finding the Odd-One-Out in Fleets of Mechatronic Systems using Embedded Intelligent Agents
Byttner, Stefan (Halmstad University), Svensson, Magnus (Volvo Technology), Rögnvaldsson, Thorsteinn (Halmstad University)
The introduction of low-cost wireless communication has made many new applications possible, including applications in which systems learn collaboratively and improve without human supervision. One potential application is automated monitoring for fault isolation in mobile mechatronic systems such as commercial vehicles. The paper proposes an agent design that is based on uploading software agents to a fleet of mechatronic systems. Each agent searches for interesting state representations of a system and reports them to a central server application. The states reported by the fleet can then be used to form a consensus, from which it may be possible to detect deviations and even locate a fault.