



Supported Value Regularization for Offline Reinforcement Learning

Neural Information Processing Systems

Offline reinforcement learning suffers from extrapolation error and value overestimation caused by out-of-distribution (OOD) actions. To mitigate this issue, value regularization approaches aim to penalize the learned value functions to assign lower values to OOD actions. However, existing value regularization methods lack a proper distinction between the regularization effects on in-distribution (ID) and OOD actions, and fail to guarantee optimal convergence results of the policy. To this end, we propose Supported Value Regularization (SVR), which penalizes the Q-values for all OOD actions while maintaining standard Bellman updates for ID ones. Specifically, we utilize the bias of importance sampling to compute the summation of Q-values over the entire OOD region, which serves as the penalty for policy evaluation.
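The ID/OOD split described above can be illustrated with a toy tabular sketch. This is not the paper's actual SVR (which computes an importance-sampling-based penalty over the whole OOD region); the decay-style penalty, the tables, and names like `id_actions` and `alpha` are illustrative assumptions:

```python
import numpy as np

# Toy tabular sketch: standard Bellman updates for dataset-supported (ID)
# actions, a downward push on Q-values for OOD actions. All quantities here
# are synthetic assumptions for illustration, not the paper's formulation.
n_states, n_actions = 4, 3
rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))
R = rng.uniform(0.1, 1.0, (n_states, n_actions))       # reward table
P = rng.integers(0, n_states, (n_states, n_actions))   # deterministic next state
id_actions = {0: [0, 1], 1: [1], 2: [0, 2], 3: [2]}    # dataset-supported actions
gamma, alpha = 0.9, 0.5                                # discount, penalty weight

for _ in range(200):
    Q_new = Q.copy()
    for s in range(n_states):
        for a in range(n_actions):
            if a in id_actions[s]:
                # standard Bellman backup, maximizing only over supported actions
                s2 = P[s, a]
                Q_new[s, a] = R[s, a] + gamma * max(Q[s2, b] for b in id_actions[s2])
            else:
                # push the Q-value of every OOD action downward
                Q_new[s, a] = Q[s, a] - alpha * Q[s, a]
    Q = Q_new
# ID entries converge to meaningful values; OOD entries are held at/near zero
```

The point of the sketch is purely the asymmetry: the backup operator never bootstraps through an unsupported action, so OOD value estimates cannot leak into ID targets.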




SupCLAP: Controlling Optimization Trajectory Drift in Audio-Text Contrastive Learning with Support Vector Regularization

Luo, Jiehui, Yin, Yuguo, Xie, Yuxin, Ru, Jinghan, Zhuang, Xianwei, He, Minghua, Liu, Aofan, Xiong, Zihan, Yang, Dongchao

arXiv.org Artificial Intelligence

Contrastive language-audio pretraining, which aims to unify multimodal representations in a shared embedding space, serves as a cornerstone for building a wide range of applications, from cross-modal retrieval to cutting-edge multimodal large language models. However, we find that the perpendicular component of the pushing force from negative samples in contrastive learning is a double-edged sword: it contains rich supplementary information from negative samples, yet its unconstrained nature causes optimization trajectory drift and training instability. To address this, we propose Support Vector Regularization (SVR), a method that introduces an auxiliary support vector to control this perpendicular component, aiming to harness its rich information while mitigating the associated trajectory drift. The efficacy of SVR is critically governed by its semantic radius, for which we explore two unsupervised modeling strategies: direct parameterization and an adaptive radius predictor module enhanced with constraints to improve its predicting accuracy. Extensive experimental results demonstrate that our method surpasses widely used baselines like InfoNCE and SigLIP loss across classification, monolingual retrieval, and multilingual retrieval on standard audio-text datasets. Contrastive Language-Audio Pretraining (CLAP) Wu et al. (2023); Ghosh et al. (2025) aims to learn a unified audio-text embedding space by pulling corresponding pairs closer and pushing others apart. This paradigm, which powers applications like cross-modal retrieval Xie et al. (2024) and multimodal LLMs Xue et al. (2024); Lam et al. (2025), has achieved great empirical success. However, standard InfoNCE-based CLAP methods still struggle to learn ideal representations, facing limitations such as poor temporal alignment of audio events Yuan et al. (2024) and inconsistent multilingual alignment Yin et al. (2025).
Therefore, achieving optimal alignment between the language and audio representation spaces remains an open challenge. In this paper, we uncover a complex yet overlooked dynamic in the optimization process of standard InfoNCE-based contrastive learning Wu et al. (2021): optimization trajectory drift.
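The decomposition this abstract reasons about can be sketched in a few lines: the force a negative sample exerts on an anchor embedding splits into a component along the anchor-to-positive direction and a perpendicular remainder. The vectors and variable names below are illustrative assumptions, not the paper's notation:

```python
import numpy as np

# Decompose the "push" from a negative sample into a component parallel to
# the anchor->positive pull direction and a perpendicular remainder (the part
# whose unconstrained drift SupCLAP's regularization targets). Toy 2-D vectors.
anchor = np.array([1.0, 0.0])
positive = np.array([0.8, 0.6])
negative = np.array([0.2, 0.9])

pull_dir = positive - anchor
pull_dir = pull_dir / np.linalg.norm(pull_dir)   # unit pull direction

push = anchor - negative                         # force pushing anchor from negative
parallel = (push @ pull_dir) * pull_dir          # component along the pull
perpendicular = push - parallel                  # component orthogonal to the pull
```

By construction `parallel + perpendicular` recovers the full push, and the perpendicular part carries exactly the update direction that is not aligned with the positive pair.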


Resampling strategies for imbalanced regression: a survey and empirical analysis

Avelino, Juscimara G., Cavalcanti, George D. C., Cruz, Rafael M. O.

arXiv.org Artificial Intelligence

Imbalanced problems can arise in different real-world situations, and to address this, certain strategies in the form of resampling or balancing algorithms are proposed. This issue has largely been studied in the context of classification, yet the same problem also arises in regression tasks, where target values are continuous. This work presents an extensive experimental study comprising various balancing and predictive models, and which uses metrics to capture important elements for the user and to evaluate the predictive model in an imbalanced regression data context. It also proposes a taxonomy for imbalanced regression approaches based on three crucial criteria: regression model, learning process, and evaluation metrics. The study offers new insights into the use of such strategies, highlighting the advantages they bring to each model's learning process, and indicating directions for further studies. The code, data and further information related to the experiments performed herein can be found on GitHub: https://github.com/JusciAvelino/imbalancedRegression.
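The simplest member of the resampling family the survey covers is random oversampling of the rare target region. A minimal sketch, with a crude fixed threshold as the rarity rule (real strategies in this literature use relevance functions and SMOTE-style synthesis for regression):

```python
import numpy as np

# Random oversampling sketch for imbalanced regression: duplicate examples
# whose target lies in the rare region until rare and common counts match.
# The threshold `y > 5` and the synthetic data are assumptions of the sketch.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 1, 950), rng.normal(8, 1, 50)])  # rare high targets
X = y[:, None] + rng.normal(0, 0.1, (1000, 1))

rare = y > 5                                               # crude rarity rule
n_extra = np.count_nonzero(~rare) - np.count_nonzero(rare)
idx = rng.choice(np.flatnonzero(rare), size=n_extra, replace=True)
X_bal = np.vstack([X, X[idx]])                             # balanced design matrix
y_bal = np.concatenate([y, y[idx]])                        # balanced targets
```

Duplication is the weakest option (it adds no new information and can encourage overfitting to the rare examples), which is precisely why interpolation-based variants exist.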


Secure and Storage-Efficient Deep Learning Models for Edge AI Using Automatic Weight Generation

Rahaman, Habibur, Chatterjee, Atri, Bhunia, Swarup

arXiv.org Artificial Intelligence

Complex neural networks require substantial memory to store a large number of synaptic weights. This work introduces WINGs (Automatic Weight Generator for Secure and Storage-Efficient Deep Learning Models), a novel framework that dynamically generates layer weights in a fully connected neural network (FC) and compresses the weights in convolutional neural networks (CNNs) during inference, significantly reducing memory requirements without sacrificing accuracy. The WINGs framework uses principal component analysis (PCA) for dimensionality reduction and lightweight support vector regression (SVR) models to predict layer weights in the FC networks, removing the need for storing full-weight matrices and achieving substantial memory savings. It also preferentially compresses the weights in low-sensitivity layers of CNNs using PCA and SVR with sensitivity analysis. The sensitivity-aware design also offers an added level of security, as any bit-flip attack on weights in compressed layers has an amplified and readily detectable effect on accuracy. WINGs achieves 53x compression for the FC layers, 28x for AlexNet on the MNIST dataset, and 18x for AlexNet on the CIFAR-10 dataset, with 1-2% accuracy loss. This significant reduction in memory results in higher throughput and lower energy for DNN inference, making it attractive for resource-constrained edge applications.
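The PCA side of this storage idea can be sketched directly: keep only the top-k principal components of a layer's weight matrix and reconstruct the weights on the fly. The SVR prediction stage and the sensitivity analysis are omitted, and the matrix shape, rank, and k below are assumptions, not WINGs' configuration:

```python
import numpy as np

# PCA-style weight compression sketch: store top-k components of a synthetic,
# deliberately low-rank "weight matrix" and reconstruct at inference time.
rng = np.random.default_rng(0)
W = (rng.normal(size=(256, 16)) @ rng.normal(size=(16, 128))
     + 0.001 * rng.normal(size=(256, 128)))        # low-rank weights + small noise

mean = W.mean(axis=0)
U, S, Vt = np.linalg.svd(W - mean, full_matrices=False)
k = 16                                             # retained components (assumption)
W_hat = (U[:, :k] * S[:k]) @ Vt[:k] + mean         # reconstructed weights

stored = U[:, :k].size + k + Vt[:k].size + mean.size
compression = W.size / stored                      # storage ratio for these shapes
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
```

Because the synthetic matrix is genuinely low-rank, k = 16 components recover it almost exactly at roughly 5x less storage; real layers trade k against accuracy, which is where the paper's sensitivity analysis comes in.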


Augmented Regression Models using Neurochaos Learning

Henry, Akhila, Nagaraj, Nithin

arXiv.org Artificial Intelligence

This study presents novel Augmented Regression Models using Neurochaos Learning (NL), where Tracemean features derived from the Neurochaos Learning framework are integrated with traditional regression algorithms: Linear Regression, Ridge Regression, Lasso Regression, and Support Vector Regression (SVR). Our approach was evaluated using ten diverse real-life datasets and a synthetically generated dataset of the form $y = mx + c + ε$. Results show that incorporating the Tracemean feature (mean of the chaotic neural traces of the neurons in the NL architecture) significantly enhances regression performance, particularly in Augmented Lasso Regression and Augmented SVR, where six out of ten real-life datasets exhibited improved predictive accuracy. Among the models, Augmented Chaotic Ridge Regression achieved the highest average performance boost (11.35%). Additionally, experiments on the simulated dataset demonstrated that the Mean Squared Error (MSE) of the augmented models consistently decreased and converged towards the Minimum Mean Squared Error (MMSE) as the sample size increased. This work demonstrates the potential of chaos-inspired features in regression tasks, offering a pathway to more accurate and computationally efficient prediction models.
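The augmentation pattern itself (appending one extra engineered feature column to a standard regression design matrix) is easy to show in isolation. The stand-in feature below is NOT Tracemean, which requires the Neurochaos Learning architecture to compute; a tanh transform is used purely to demonstrate the mechanics on a nonlinear target:

```python
import numpy as np

# Feature-augmentation sketch: fit ordinary least squares with and without one
# extra engineered column. The tanh stand-in feature and synthetic data are
# assumptions; the paper's Tracemean feature comes from chaotic neural traces.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = np.tanh(x) + rng.normal(0, 0.05, 200)          # nonlinear ground truth

def fit_mse(F, y):
    # least-squares fit of design matrix F, returning training MSE
    beta, *_ = np.linalg.lstsq(F, y, rcond=None)
    return np.mean((F @ beta - y) ** 2)

plain = np.column_stack([np.ones_like(x), x])      # intercept + raw input
augmented = np.column_stack([plain, np.tanh(x)])   # + one engineered feature

mse_plain = fit_mse(plain, y)
mse_aug = fit_mse(augmented, y)
```

When the engineered column captures structure the raw input misses, the augmented fit's MSE drops toward the noise floor, which mirrors the paper's observation of augmented-model MSE converging to the MMSE with growing samples.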


Risk Analysis and Design Against Adversarial Actions

Campi, Marco C., Carè, Algo, Crespo, Luis G., Garatti, Simone, Ramponi, Federico A.

arXiv.org Machine Learning

In particular, Theorem 5 applies when $\mathcal{A}_\delta = \{\delta\}$, i.e., when $\hat{\theta}_{\mathcal{A}}$ is just a standard, non-robust, solution. This is different from [56], whose main result is only applicable to solutions satisfying the infinitely many constraints $f(\theta, \delta) \leq 0$, $\forall \delta \in \mathcal{A}_{\delta_i}$, $i = 1, \ldots, N$, where $\mathcal{A}_{\delta_i}$ is tuned to the Wasserstein bound. As previously noted, $R$ plays the role of a tunable parameter, and the result in Theorem 5 holds for any choice of the value of $R$. As a consequence, the user can vary $R$ to optimize the bound on $\mathrm{Risk}(\hat{\theta}_{\mathcal{A}})$ given in Theorem 5. As $R$ increases, $s_{\mathcal{A}}$ (and, thereby, $\varepsilon(s_{\mathcal{A}})$) tends to increase while $\mu/R$ diminishes. While the best compromise is difficult to foresee, one can experimentally try various choices $R_1 < R_2 < \cdots < R_i < \cdots < R_h$ and select the one giving the best result. The corresponding confidence level can be bounded as follows: $$\mathbb{P}^N\{D : \mathrm{Risk}(\hat{\theta}_{\mathcal{A}}) > \varepsilon(s_{\mathcal{A},i}) + \mu/R_i \text{ for at least one } i \in \{1, \ldots, h\}\} \leq \sum_{i=1}^{h} \mathbb{P}^N\{D : \mathrm{Risk}(\hat{\theta}_{\mathcal{A}}) > \varepsilon(s_{\mathcal{A},i}) + \mu/R_i\} \leq \sum_{i=1}^{h} \beta = h\beta,$$ from which $$\mathbb{P}^N\{D : \mathrm{Risk}(\hat{\theta}_{\mathcal{A}}) \leq \varepsilon(s_{\mathcal{A},i}) + \mu/R_i \text{ for all } i = 1, \ldots, h\} \geq 1 - h\beta.$$
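The confidence aggregation here is a plain union bound: if each of the $h$ bad events has probability at most $\beta$, the probability that at least one occurs is at most $h\beta$. A quick Monte Carlo sanity check (events simulated as independent purely for the sketch; the bound itself needs no independence):

```python
import numpy as np

# Union-bound sanity check: h events, each with probability beta, so the
# empirical frequency of "at least one occurs" must not exceed h*beta.
rng = np.random.default_rng(0)
h, beta, trials = 5, 0.02, 200_000
events = rng.random((trials, h)) < beta        # h Bernoulli(beta) events per trial
union_freq = events.any(axis=1).mean()         # empirical P(at least one event)
```

For independent events the true union probability is $1 - (1-\beta)^h \approx 0.096$, comfortably below $h\beta = 0.1$; dependence between events can only push the union probability up to, but never past, $h\beta$.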