Goto

Collaborating Authors

 weighted combination



BlackboxAttacksviaSurrogateEnsembleSearch SupplementaryMaterial Summary

Neural Information Processing Systems

Some previous papers (e.g., MIM [7])claimedthat ensemble with weighted logits (equation (4) in main text) outperforms ensemble with weighted probabilities and weighted combination of loss (equations(3)and(5)in main text). In our experiments, shown in Figure 6b, we observe that weighted combination of surrogate loss functions provide similar or even higher fooling rate compared to weighted probabilities or logits.


Maximum Likelihood Training of Score-Based Diffusion Models

Neural Information Processing Systems

Score-based diffusion models synthesize samples by reversing a stochastic process that diffuses data to noise, and are trained by minimizing a weighted combination of score matching losses. The log-likelihood of score-based diffusion models can be tractably computed through a connection to continuous normalizing flows, but log-likelihood is not directly optimized by the weighted combination of score matching losses. We show that for a specific weighting scheme, the objective upper bounds the negative log-likelihood, thus enabling approximate maximum likelihood training of score-based diffusion models. We empirically observe that maximum likelihood training consistently improves the likelihood of score-based diffusion models across multiple datasets, stochastic processes, and model architectures. Our best models achieve negative log-likelihoods of 2.83 and 3.76 bits/dim on CIFAR-10 and ImageNet $32\times 32$ without any data augmentation, on a par with state-of-the-art autoregressive models on these tasks.




EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning

Kong, Lingxiao, Yang, Cong, Neufang, Susanne, Beyan, Oya Deniz, Boukhers, Zeyd

arXiv.org Artificial Intelligence

Recent advances in reinforcement learning (RL) for large language model (LLM) fine-tuning show promise in addressing multi-objective tasks but still face significant challenges, including competing objective balancing, low training efficiency, poor scalability, and limited explainability. Leveraging ensemble learning principles, we introduce an Ensemble Multi-Objective RL (EMORL) framework that fine-tunes multiple models with individual objectives while optimizing their aggregation after the fine-tuning to improve efficiency and flexibility. Our method is the first to aggregate the hidden states of individual models, incorporating contextual information from multiple objectives. This approach is supported by a hierarchical grid search algorithm that identifies optimal weighted combinations. We evaluate EMORL on counselor reflection generation tasks, using text classification models to score the generations and provide rewards during RL fine-tuning. Through comprehensive experiments on the PAIR and Psych8k datasets, we demonstrate the advantages of EMORL against existing baselines: significantly lower and more stable training consumption ($17,529\pm 1,650$ data points and $6,573\pm 147.43$ seconds), improved scalability and explainability, and comparable performance across multiple objectives.


A Transfer Learning Framework for Multilayer Networks via Model Averaging

Qiu, Yongqin, Zhang, Xinyu

arXiv.org Machine Learning

Link prediction in multilayer networks is a key challenge in applications such as recommendation systems and protein-protein interaction prediction. While many techniques have been developed, most rely on assumptions about shared structures and require access to raw auxiliary data, limiting their practicality. To address these issues, we propose a novel transfer learning framework for multilayer networks using a bi-level model averaging method. A $K$-fold cross-validation criterion based on edges is used to automatically weight inter-layer and intra-layer candidate models. This enables the transfer of information from auxiliary layers while mitigating model uncertainty, even without prior knowledge of shared structures. Theoretically, we prove the optimality and weight convergence of our method under mild conditions. Computationally, our framework is efficient and privacy-preserving, as it avoids raw data sharing and supports parallel processing across multiple servers. Simulations show our method outperforms others in predictive accuracy and robustness. We further demonstrate its practical value through two real-world recommendation system applications.


Maximum Likelihood Training of Score-Based Diffusion Models

Neural Information Processing Systems

Score-based diffusion models synthesize samples by reversing a stochastic process that diffuses data to noise, and are trained by minimizing a weighted combination of score matching losses. The log-likelihood of score-based diffusion models can be tractably computed through a connection to continuous normalizing flows, but log-likelihood is not directly optimized by the weighted combination of score matching losses. We show that for a specific weighting scheme, the objective upper bounds the negative log-likelihood, thus enabling approximate maximum likelihood training of score-based diffusion models. We empirically observe that maximum likelihood training consistently improves the likelihood of score-based diffusion models across multiple datasets, stochastic processes, and model architectures. Our best models achieve negative log-likelihoods of 2.83 and 3.76 bits/dim on CIFAR-10 and ImageNet 32\times 32 without any data augmentation, on a par with state-of-the-art autoregressive models on these tasks.


Reviews: Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

Neural Information Processing Systems

The main algorithmic idea is a weighted combination of H step temporal differences, estimated on H steps (and rolled out by a learned model of the environment). The underlying idea is to allow the learner to tradeoff between estimation errors in model and Q function in different parts of the state-action space during learning. The updated TD estimator is incorporated into the DDPG algorithm in a straightforward manner. The update is computationally more intensive but the result is improved sample complexity. The experimental results on a variety of continuous control tasks show significant improvement over the baseline DDPG and a related method (MVE) (which is the precursor to this work). Overall, the paper is well written. The empirical results are very promising. The analysis and discussion is a bit limited but is not a major drawback. Overall, there is much to like about the paper.


Probabilistic Abduction for Visual Abstract Reasoning via Learning Rules in Vector-symbolic Architectures

Hersche, Michael, di Stefano, Francesco, Hofmann, Thomas, Sebastian, Abu, Rahimi, Abbas

arXiv.org Artificial Intelligence

reasoning is a cornerstone of human intelligence, and replicating it with artificial intelligence (AI) presents an ongoing challenge. This study focuses on efficiently solving Raven's progressive matrices (RPM), a visual test for assessing abstract reasoning abilities, by using distributed computation and operators provided by vector-symbolic architectures (VSA). Instead of hard-coding the rule formulations associated with RPMs, our approach can learn the VSA rule formulations (hence the name Learn-VRF) with just one pass through the training data. Yet, our approach, with compact parameters, remains transparent and interpretable. Learn-VRF yields accurate predictions on I-RAVEN's indistribution data, and exhibits strong out-of-distribution capabilities concerning unseen attribute-rule pairs, significantly outperforming pure connectionist baselines including large language models. Our code is available at https://github.com/