ima
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
- (3 more...)
IBMA: An Imputation-Based Mixup Augmentation Using Self-Supervised Learning for Time Series Data
Nguyen, Dang Nha, Nguyen, Hai Dang, Nguyen, Khoa Tho Anh
Data augmentation plays a crucial role in enhancing model performance across various AI fields by introducing variability while maintaining the underlying temporal patterns. However, in the context of long sequence time series data, where maintaining temporal consistency is critical, there are fewer augmentation strategies compared to fields such as image or text, with advanced techniques like Mixup rarely being used. In this work, we propose a new approach, Imputation-based Mixup Augmentation (IMA), which combines Imputed-data Augmentation with Mixup Augmentation to bolster model generalization and improve forecasting performance. We evaluate the effectiveness of this method across several forecasting models, including DLinear (MLP), TimesNet (CNN), and iTrainformer (Transformer), these models represent some of the most recent advances in long sequence time series forecasting. Our experiments, conducted on three datasets (ETT -small, Illness, Exchange Rate) from various domains and compared against eight other augmentation techniques, demonstrate that IMA consistently enhances performance, achieving 22 improvements out of 24 instances, with 10 of those being the best performances, particularly with iTrain-former imputation in ETT dataset. The GitHub repository is available at: https://github.com/dangnha/IMA.
- North America > United States (0.04)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
- (2 more...)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
APPENDIX Overview
A trivial example of an equivalence relation is equality ( =). More useful examples in the context of ICA are equivalence up to permutation, rescaling, or scalar transformation. Defining an appropriate equivalence class for the problem at hand therefore allows us to specify exactly the type of indeterminancies which cannot be resolved and up to which the true generative process can be recovered.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
- (3 more...)
FRAIN to Train: A Fast-and-Reliable Solution for Decentralized Federated Learning
Park, Sanghyeon, Moon, Soo-Mook
Federated learning (FL) enables collaborative model training across distributed clients while preserving data locality. Although FedAvg pioneered synchronous rounds for global model averaging, slower devices can delay collective progress. Asynchronous FL (e.g., FedAsync) addresses stragglers by continuously integrating client updates, yet naive implementations risk client drift due to non-IID data and stale contributions. Some Blockchain-based FL approaches (e.g., BRAIN) employ robust weighting or scoring of updates to resist malicious or misaligned proposals. However, performance drops can still persist under severe data heterogeneity or high staleness, and synchronization overhead has emerged as a new concern due to its aggregator-free architectures. We introduce Fast-and-Reliable AI Network, FRAIN, a new asynchronous FL method that mitigates these limitations by incorporating two key ideas. First, our FastSync strategy eliminates the need to replay past model versions, enabling newcomers and infrequent participants to efficiently approximate the global model. Second, we adopt spherical linear interpolation (SLERP) when merging parameters, preserving models' directions and alleviating destructive interference from divergent local training. Experiments with a CNN image-classification model and a Transformer-based language model demonstrate that FRAIN achieves more stable and robust convergence than FedAvg, FedAsync, and BRAIN, especially under harsh environments: non-IID data distributions, networks that experience delays and require frequent re-synchronization, and the presence of malicious nodes.
- Asia > South Korea > Seoul > Seoul (0.04)
- Asia > Middle East > Jordan (0.04)
IMAS: A Comprehensive Agentic Approach to Rural Healthcare Delivery
Gangavarapu, Agasthya, Gangavarapu, Ananya
Since the onset of COVID-19, rural communities worldwide have faced significant challenges in accessing healthcare due to the migration of experienced medical professionals to urban centers. Semi-trained caregivers, such as Community Health Workers (CHWs) and Registered Medical Practitioners (RMPs), have stepped in to fill this gap, but often lack formal training. This paper proposes an advanced agentic medical assistant system designed to improve healthcare delivery in rural areas by utilizing Large Language Models (LLMs) and agentic approaches. The system is composed of five crucial components: translation, medical complexity assessment, expert network integration, final medical advice generation, and response simplification. Our innovative framework ensures context-sensitive, adaptive, and reliable medical assistance, capable of clinical triaging, diagnostics, and identifying cases requiring specialist intervention. The system is designed to handle cultural nuances and varying literacy levels, providing clear and actionable medical advice in local languages. Evaluation results using the MedQA, PubMedQA, and JAMA datasets demonstrate that this integrated approach significantly enhances the effectiveness of rural healthcare workers, making healthcare more accessible and understandable for underserved populations. All code and supplemental materials associated with the paper and IMAS are available at https://github.com/uheal/imas.
- Asia > India (0.04)
- Asia > British Indian Ocean Territory > Diego Garcia (0.04)
Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data
Zhou, Tailin, Lin, Zehong, Zhang, Jun, Tsang, Danny H. K.
Model averaging is a widely adopted technique in federated learning (FL) that aggregates multiple client models to obtain a global model. Remarkably, model averaging in FL can yield a superior global model, even when client models are trained with non-convex objective functions and on heterogeneous local datasets. However, the rationale behind its success remains poorly understood. To shed light on this issue, we first visualize the loss landscape of FL over client and global models to illustrate their geometric properties. The visualization shows that the client models encompass the global model within a common basin, and interestingly, the global model may deviate from the bottom of the basin while still outperforming the client models. To gain further insights into model averaging in FL, we decompose the expected loss of the global model into five factors related to the client models. Specifically, our analysis reveals that the loss of the global model after early training mainly arises from \textit{i)} the client model's loss on non-overlapping data between client datasets and the global dataset and \textit{ii)} the maximum distance between the global and client models. Based on these findings from our loss landscape visualization and loss decomposition, we propose utilizing iterative moving averaging (IMA) on the global model at the late training phase to reduce its deviation from the expected minimum, while constraining client exploration to limit the maximum distance between the global and client models. Our experiments demonstrate that incorporating IMA into existing FL methods significantly improves their accuracy and training speed on various heterogeneous data setups of benchmark datasets.
- North America > United States > Texas > Travis County > Austin (0.28)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- Europe > Austria > Vienna (0.14)
- (21 more...)
When are Post-hoc Conceptual Explanations Identifiable?
Leemann, Tobias, Kirchhof, Michael, Rong, Yao, Kasneci, Enkelejda, Kasneci, Gjergji
Interest in understanding and factorizing learned embedding spaces through conceptual explanations is steadily growing. When no human concept labels are available, concept discovery methods search trained embedding spaces for interpretable concepts like object shape or color that can provide post-hoc explanations for decisions. Unlike previous work, we argue that concept discovery should be identifiable, meaning that a number of known concepts can be provably recovered to guarantee reliability of the explanations. As a starting point, we explicitly make the connection between concept discovery and classical methods like Principal Component Analysis and Independent Component Analysis by showing that they can recover independent concepts under non-Gaussian distributions. For dependent concepts, we propose two novel approaches that exploit functional compositionality properties of image-generating processes. Our provably identifiable concept discovery methods substantially outperform competitors on a battery of experiments including hundreds of trained models and dependent concepts, where they exhibit up to 29 % better alignment with the ground truth. Our results highlight the strict conditions under which reliable concept discovery without human labels can be guaranteed and provide a formal foundation for the domain. Our code is available online.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
Increasing-Margin Adversarial (IMA) Training to Improve Adversarial Robustness of Neural Networks
Deep neural networks (DNNs) are vulnerable to adversarial noises. Adversarial training is a general and effective strategy to improve DNN robustness (i.e., accuracy on noisy data) against adversarial noises. However, DNN models trained by the current existing adversarial training methods may have much lower standard accuracy (i.e., accuracy on clean data), compared to the same models trained by the standard method on clean data, and this phenomenon is known as the trade-off between accuracy and robustness and is considered unavoidable. This issue prevents adversarial training from being used in many application domains, such as medical image analysis, as practitioners do not want to sacrifice standard accuracy too much in exchange for adversarial robustness. Our objective is to lift (i.e., alleviate or even avoid) this trade-off between standard accuracy and adversarial robustness for medical image classification and segmentation. We propose a novel adversarial training method, named Increasing-Margin Adversarial (IMA) Training, which is supported by an equilibrium state analysis about the optimality of adversarial training samples. Our method aims to preserve accuracy while improving robustness by generating optimal adversarial training samples. We evaluate our method and the other eight representative methods on six publicly available image datasets corrupted by noises generated by AutoAttack and white-noise attack. Our method achieves the highest adversarial robustness for image classification and segmentation with the smallest reduction in accuracy on clean data. For one of the applications, our method improves both accuracy and robustness. Our study has demonstrated that our method can lift the trade-off between standard accuracy and adversarial robustness for the image classification and segmentation applications.
- North America > Canada > Ontario > Toronto (0.14)
- South America > Brazil (0.04)
- North America > United States (0.04)
- Asia > China (0.04)
Embrace the Gap: VAEs Perform Independent Mechanism Analysis
Reizinger, Patrik, Gresele, Luigi, Brady, Jack, von Kügelgen, Julius, Zietlow, Dominik, Schölkopf, Bernhard, Martius, Georg, Brendel, Wieland, Besserve, Michel
Variational autoencoders (VAEs) are a popular framework for modeling complex data distributions; they can be efficiently trained via variational inference by maximizing the evidence lower bound (ELBO), at the expense of a gap to the exact (log-)marginal likelihood. While VAEs are commonly used for disentangled representation learning, it is unclear why ELBO maximization would yield such representations, since unregularized maximum likelihood estimation generally cannot invert the data-generating process without additional assumptions. Yet, VAEs often succeed at this task. We seek to elucidate this apparent paradox by studying nonlinear VAEs in the limit of near-deterministic decoders. We first prove that, in this regime, the optimal encoder approximately inverts the decoder--a commonly used but unproven conjecture--which we refer to as self-consistency. Leveraging self-consistency, we show that the ELBO converges to a regularized log-likelihood. This allows VAEs to perform what has recently been termed independent mechanism analysis (IMA): it adds an inductive bias towards decoders with column-orthogonal Jacobians, which helps recovering the true latent factors. The gap between ELBO and log-likelihood is therefore welcome, since it bears unanticipated benefits for nonlinear representation learning. In experiments on synthetic and image data, we show that VAEs uncover the true latent factors when the data generating process satisfies the IMA assumption.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York (0.04)
- (14 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)