adaptive loss function


Quantifying Multimodal Imbalance: A GMM-Guided Adaptive Loss for Audio-Visual Learning

Liu, Zhaocheng, Yu, Zhiwen, Liu, Xiaoqing

arXiv.org Artificial Intelligence

The heterogeneity of multimodal data leads to inconsistencies and imbalance, allowing a dominant modality to steer gradient updates. Existing solutions mainly focus on optimization- or data-based strategies but rarely exploit the information inherent in multimodal imbalance or conduct its quantitative analysis. To address this gap, we propose a novel quantitative analysis framework for multimodal imbalance and design a sample-level adaptive loss function. We define the Modality Gap as the Softmax score difference between modalities for the correct class and model its distribution using a bimodal Gaussian Mixture Model (GMM), whose components represent balanced and imbalanced samples. Using Bayes' theorem, we estimate each sample's posterior probability of belonging to these two groups. Based on this, our adaptive loss (1) minimizes the overall Modality Gap, (2) aligns imbalanced samples with balanced ones, and (3) adaptively penalizes each sample according to its degree of imbalance. A two-stage training strategy, with warm-up and adaptive phases, yields state-of-the-art performance on CREMA-D (80.65%), AVE (70.40%), and KineticSound (72.42%). Fine-tuning with high-quality samples identified by the GMM further improves results, highlighting their value for effective multimodal fusion.
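The gap-modeling step the abstract describes can be sketched in a few lines: compute the per-sample Softmax score difference between the two modalities for the correct class, fit a two-component GMM to it, and read off Bayes posteriors. This is a minimal illustration using scikit-learn; the array names and the random data are hypothetical, not taken from the paper.

```python
import numpy as np
from scipy.special import softmax
from sklearn.mixture import GaussianMixture

def modality_gap(audio_logits, visual_logits, labels):
    """Softmax score difference between modalities for the correct class."""
    idx = np.arange(len(labels))
    p_audio = softmax(audio_logits, axis=1)[idx, labels]
    p_visual = softmax(visual_logits, axis=1)[idx, labels]
    return p_audio - p_visual

# Hypothetical stand-in data: 500 samples, 10 classes.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=500)
audio_logits = rng.normal(size=(500, 10))
visual_logits = rng.normal(size=(500, 10))

gaps = modality_gap(audio_logits, visual_logits, labels).reshape(-1, 1)

# Bimodal GMM: one component for balanced, one for imbalanced samples.
gmm = GaussianMixture(n_components=2, random_state=0).fit(gaps)

# predict_proba applies Bayes' theorem: each row is the posterior
# probability of the sample belonging to the two groups.
posterior = gmm.predict_proba(gaps)
```

These posteriors could then weight a per-sample penalty term, in the spirit of the adaptive loss the paper proposes; the exact loss form is not reproduced here.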


FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning

Sahoo, Pranab, Tripathi, Ashutosh, Saha, Sriparna, Mondal, Samrat

arXiv.org Artificial Intelligence

Federated Learning (FL) marks a transformative approach to distributed model training by combining locally optimized models from various clients into a unified global model. While FL preserves data privacy by eliminating centralized storage, it encounters significant challenges such as performance degradation, slower convergence, and reduced robustness of the global model due to the heterogeneity in client data distributions. Among the various forms of data heterogeneity, label skew emerges as a particularly formidable and prevalent issue, especially in domains such as image classification. To address these challenges, we begin with comprehensive experiments to pinpoint the underlying issues in the FL training process. Based on our findings, we then introduce an innovative dual-strategy approach designed to effectively resolve these issues. First, we introduce an adaptive loss function for client-side training, meticulously crafted to preserve previously acquired knowledge while maintaining an optimal equilibrium between local optimization and global model coherence. Secondly, we develop a dynamic aggregation strategy for aggregating client models at the server. This approach adapts to each client's unique learning patterns, effectively addressing the challenges of diverse data across the network. Our comprehensive evaluation, conducted across three diverse real-world datasets, coupled with theoretical convergence guarantees, demonstrates the superior efficacy of our method compared to several established state-of-the-art approaches.
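The server-side idea of aggregating client models with per-client coefficients, rather than uniform FedAvg weights, can be sketched as below. The scoring rule (softmax over negated client losses) is an illustrative assumption, not the paper's actual aggregation formula, and all names are hypothetical.

```python
import numpy as np

def dynamic_aggregate(client_params, client_losses):
    """Combine flattened client parameter vectors with coefficients
    derived from a per-client learning signal (here: recent local loss,
    lower loss -> larger weight). Illustrative only."""
    coeffs = np.exp(-np.asarray(client_losses, dtype=float))
    coeffs /= coeffs.sum()
    stacked = np.stack(client_params)     # shape: (n_clients, n_params)
    return coeffs @ stacked               # weighted average, (n_params,)
```

With equal client losses this reduces to plain FedAvg; as one client's loss grows, its contribution to the global model shrinks smoothly.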


OWAdapt: An adaptive loss function for deep learning using OWA operators

Maldonado, Sebastián, Vairetti, Carla, Jara, Katherine, Carrasco, Miguel, López, Julio

arXiv.org Artificial Intelligence

In this paper, we propose a fuzzy adaptive loss function for enhancing deep learning performance in classification tasks. Specifically, we redefine the cross-entropy loss to effectively address class-level noise conditions, including the challenging problem of class imbalance. Our approach introduces aggregation operators, leveraging the power of fuzzy logic to improve classification accuracy. The rationale behind our proposed method lies in the iterative up-weighting of class-level components within the loss function, focusing on those with larger errors. To achieve this, we employ the ordered weighted average (OWA) operator and combine it with an adaptive scheme for gradient-based learning. Through extensive experimentation, our method outperforms other commonly used loss functions, such as the standard cross-entropy or focal loss, across various binary and multiclass classification tasks. Furthermore, we explore the influence of hyperparameters associated with the OWA operators and present a default configuration that performs well across different experimental settings.
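The core OWA mechanism, up-weighting the class-level loss components with the largest errors, can be illustrated in a few lines. The decreasing weight vector below is a hypothetical choice, not the paper's tuned configuration.

```python
import numpy as np

def owa_weighted_loss(class_losses, owa_weights):
    """OWA aggregation: sort class-level losses in descending order and
    take a weighted sum, so classes with larger errors are up-weighted."""
    sorted_losses = np.sort(np.asarray(class_losses, dtype=float))[::-1]
    return float(np.dot(owa_weights, sorted_losses))

# Hypothetical per-class cross-entropy components.
class_losses = [0.2, 1.5, 0.7]
# Decreasing weights emphasize the worst-performing classes.
owa_weights = np.array([0.5, 0.3, 0.2])
loss = owa_weighted_loss(class_losses, owa_weights)
```

Uniform weights recover the plain mean of the class losses; the paper's adaptive scheme additionally adjusts the weights during gradient-based training, which is not shown here.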


Adaptive Hybrid Model for Enhanced Stock Market Predictions Using Improved VMD and Stacked Informer

Zhang, Jianan, Duan, Hongyi

arXiv.org Artificial Intelligence

Financial markets play a pivotal role in global economic activities, and their operations and dynamic evolutions are intricately linked to a myriad of chaotic and complex factors, including economic configurations, seasonal components, and the international milieu [1] [2]. As the economy progresses and financial markets expand continuously, time series analysis in finance has become indispensable [3]. This analytical approach has significantly advanced the understanding of market dynamics, refined intelligent decision-making processes, and bolstered developments in forecasting investment returns [4][2]. Consequently, it has garnered immense scholarly attention, leading to abundant research contributions in this domain. In stark contrast to conventional time series prediction endeavors characterizing various scientific domains--such as the temporal allocation mechanisms associated with wind energy integration [5], the granular analysis of protracted energy consumption patterns in architectural structures [6], or the intricate forecasting of load dynamics within thermal frameworks [7]--the sphere of financial time series forecasting is imbued with an elevated level of complexity and unpredictability.


A generalized forecasting solution to enable future insights of COVID-19 at sub-national level resolutions

Marikkar, Umar, Weligampola, Harshana, Perera, Rumali, Hassan, Jameel, Sritharan, Suren, Jayatilaka, Gihan, Godaliyadda, Roshan, Herath, Vijitha, Ekanayake, Parakrama, Ekanayake, Janaka, Rathnayake, Anuruddhika, Dharmaratne, Samath

arXiv.org Artificial Intelligence

COVID-19 continues to cause a significant impact on public health. To minimize this impact, policy makers undertake containment measures that, when carried out disproportionately to the actual threat as a result of erroneous threat assessment, cause undesirable long-term socio-economic complications. In addition, macro-level or national-level decision making fails to consider the localized sensitivities in small regions. Hence, the need arises for region-wise threat assessments that provide insights on the behaviour of COVID-19 through time, enabled through accurate forecasts. In this study, a forecasting solution is proposed to predict daily new cases of COVID-19 in regions small enough for containment measures to be locally implemented, by targeting three main shortcomings in the literature: the unreliability of existing data caused by inconsistent testing patterns in smaller regions, weak deployability of forecasting models towards predicting cases in previously unseen regions, and model training biases caused by the imbalanced nature of data in COVID-19 epi-curves. Hence, the contributions of this study are three-fold: an optimized smoothing technique to smoothen less deterministic epi-curves based on the epidemiological dynamics of each region, a Long Short-Term Memory (LSTM)-based forecasting model trained using data from select regions to create a representative and diverse training set that maximizes deployability in regions lacking historical data, and an adaptive loss function used during training to mitigate the data imbalances seen in epi-curves. The proposed smoothing technique, the generalized training strategy and the adaptive loss function largely increased the overall accuracy of the forecast, which enables efficient containment measures at a more localized micro-level.
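One simple way to realize an adaptive loss for imbalanced epi-curves, in the spirit of the third contribution above, is to weight each sample inversely to the frequency of its case-count bin, so the few high-case days are not drowned out by the many low-case days. The function below is a hypothetical sketch, not the paper's actual loss.

```python
import numpy as np

def adaptive_weighted_mse(y_true, y_pred, n_bins=5):
    """Weighted MSE: each sample's weight is inversely proportional to
    the population of its quantile bin over y_true, up-weighting rare
    high-case days in an imbalanced epi-curve. Illustrative sketch."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Interior quantile edges split y_true into n_bins groups.
    edges = np.quantile(y_true, np.linspace(0, 1, n_bins + 1)[1:-1])
    idx = np.digitize(y_true, edges)
    counts = np.bincount(idx, minlength=n_bins).astype(float)
    weights = 1.0 / np.maximum(counts[idx], 1.0)
    weights /= weights.sum()
    return float(np.sum(weights * (y_true - y_pred) ** 2))
```

A perfect forecast gives zero loss regardless of the weighting; errors on sparsely populated bins contribute more per sample than errors on dense ones.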