disco
DISCO: Disentangled Communication Steering for Large Language Models
In contrast, we propose to inject steering vectors directly into the query and value representation spaces within attention heads. We provide evidence that a greater portion of these spaces exhibit high linear discriminability of concepts (a key property motivating the use of steering vectors) than attention head outputs. We analytically characterize the effect of our method, which we term DISentangled COmmunication (DISCO) Steering, on attention head outputs. Our analysis reveals that DISCO disentangles a strong but underutilized baseline, steering attention head inputs, which implicitly modifies queries and values in a rigid manner. In contrast, DISCO's direct modulation of these components enables more granular control. We find that DISCO achieves superior performance over a number of steering vector baselines across multiple datasets on LLaMA 3.1 8B and Gemma 2 9B, with steering efficacy scoring up to 19.1% higher than the runner-up. Our results support the conclusion that the query and value spaces are powerful building blocks for steering vector methods. Our code is publicly available at https://github.com/MaxTorop/DISCO.
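As a rough illustration of what steering queries and values separately could look like, the sketch below runs a single attention head in plain Python and optionally adds a steering vector to the query, to the values, or to both. It is not taken from the DISCO repository: the names `head_output`, `steer_q`, `steer_v`, `alpha_q` and `alpha_v` are hypothetical, and adding the same value vector at every position is an assumption made for simplicity.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add_scaled(vec, direction, scale):
    # vec + scale * direction, element-wise.
    return [v + scale * d for v, d in zip(vec, direction)]

def head_output(query, keys, values,
                steer_q=None, steer_v=None, alpha_q=0.0, alpha_v=0.0):
    """Single-head scaled dot-product attention for one query position.

    steer_q / steer_v are hypothetical steering vectors; alpha_q / alpha_v
    are their independent strengths (an assumption of this sketch).
    """
    if steer_q is not None:
        query = add_scaled(query, steer_q, alpha_q)
    if steer_v is not None:
        # Assumption: the same vector is added to the value at every position.
        values = [add_scaled(v, steer_v, alpha_v) for v in values]
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Toy inputs, chosen only for illustration.
q = [1.0, 0.0]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
s = [0.5, -1.0]

base = head_output(q, K, V)
value_steered = head_output(q, K, V, steer_v=s, alpha_v=2.0)
query_steered = head_output(q, K, V, steer_q=s, alpha_q=2.0)
```

In this toy setting, value steering shifts the head output by exactly `alpha_v * steer_v`, because the attention weights sum to one, whereas query steering changes the output only by changing the attention weights. Keeping the two strengths separate is what makes the two effects independently adjustable in the sketch.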
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
The recent success and openness of DeepSeek-R1 have brought widespread attention to Group Relative Policy Optimization (GRPO) as a reinforcement learning method for large reasoning models (LRMs). In this work, we analyze the GRPO objective under a binary reward setting and reveal an inherent limitation of question-level difficulty bias arising from its group relative advantage function. We also identify a connection between GRPO and traditional discriminative methods in supervised learning. Motivated by these insights, we introduce a new Discriminative Constrained Optimization (DisCO) framework for reinforcing LRMs, grounded in the principle of discriminative learning: increasing the scores of positive answers while decreasing those of negative ones. The main differences between DisCO and GRPO and its recent variants are: (1) it replaces the group relative objective with a discriminative objective defined by a scoring function; (2) it abandons clipping-based surrogates in favor of non-clipping RL surrogate objectives used as scoring functions; (3) it employs a simple yet effective constrained optimization approach to enforce the KL divergence constraint. As a result, DisCO offers notable advantages over GRPO and its variants: (i) it completely eliminates difficulty bias by adopting discriminative objectives; (ii) it addresses the entropy instability in GRPO and its variants through the use of non-clipping scoring functions and a constrained optimization approach, yielding long and stable training dynamics; (iii) it allows the incorporation of advanced discriminative learning techniques to address data imbalance, where a significant number of questions have more negative than positive generated answers during training.
Our experiments on enhancing the mathematical reasoning capabilities of SFT-finetuned models show that DisCO significantly outperforms GRPO and its improved variants such as DAPO, achieving average gains of 7% over GRPO and 6% over DAPO across six benchmark tasks for a 1.5B model.
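To make the group relative advantage under binary rewards concrete, the sketch below standardizes rewards within one group of sampled answers and compares the result with the closed form that follows when a fraction p of the answers is correct. It is a minimal illustration, not the authors' code: the function names are hypothetical, and the use of the population standard deviation with a small `eps` is an assumption.

```python
import math

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of answers to the same question.

    Assumption: population standard deviation, with eps added to the
    denominator to avoid division by zero.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]

def binary_closed_form(p):
    """Advantages of correct and incorrect answers when a fraction p is correct.

    With rewards in {0, 1}: mean = p and std = sqrt(p * (1 - p)), so
    A_pos = sqrt((1 - p) / p) and A_neg = -sqrt(p / (1 - p)), for 0 < p < 1.
    """
    return math.sqrt((1 - p) / p), -math.sqrt(p / (1 - p))

hard_question = group_relative_advantages([1, 0, 0, 0])   # p = 0.25
easy_question = group_relative_advantages([1, 1, 1, 0])   # p = 0.75
solved_question = group_relative_advantages([1, 1, 1, 1]) # p = 1
```

Under these assumptions, the advantage assigned to a correct answer depends on the question's success rate p and not only on the answer being correct, and a group whose answers are all correct or all wrong yields zero advantages. This is one way to see how per-question difficulty enters a group relative objective.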
DISCO: Adversarial Defense with Local Implicit Functions
In this section, we ablate the kernel size used to train DISCO on ImageNet. Table I shows that s = 3 achieves the best performance, which degrades for s = 5 by a significant margin (3.26%). This is consistent with the well-known complexity of synthesizing images with global models, such as GANs. For a single ImageNet image of size 224, STL requires 23.71 seconds while DISCO (K=1) only requires 0.027. In this section, we list the URL links that are used for training and evaluating DISCO.
DISCO: Adversarial Defense with Local Implicit Functions
The problem of adversarial defenses for image classification, where the goal is to robustify a classifier against adversarial examples, is considered. Inspired by the hypothesis that these examples lie beyond the natural image manifold, a novel aDversarIal defenSe with local impliCit functiOns (DISCO) is proposed to remove adversarial perturbations by localized manifold projections. DISCO consumes an adversarial image and a query pixel location and outputs a clean RGB value at the location. It is implemented with an encoder and a local implicit module, where the former produces per-pixel deep features and the latter uses the features in the neighborhood of the query pixel to predict the clean RGB value. Extensive experiments demonstrate that both DISCO and its cascade version outperform prior defenses, regardless of whether the defense is known to the attacker. DISCO is also shown to be data and parameter efficient and to mount defenses that transfer across datasets, classifiers and attacks.
Improved Sample Complexity for Incremental Autonomous Exploration in MDPs
We study the problem of exploring an unknown environment when no reward function is provided to the agent. Building on the incremental exploration setting introduced by Lim and Auer (2012), we define the objective of learning the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps (in expectation) from a reference state $s_0$. In this paper, we introduce a novel model-based approach that interleaves discovering new states from $s_0$ and improving the accuracy of a model estimate that is used to compute goal-conditioned policies.
DISCO: A Browser-Based Privacy-Preserving Framework for Distributed Collaborative Learning
Vignoud, Julien T. T., Rousset, Valérian, Guedj, Hugo El, Aleman, Ignacio, Bennaceur, Walid, Derinbay, Batuhan Faik, Ďurech, Eduard, Gengler, Damien, Giordano, Lucas, Grimberg, Felix, Lippoldt, Franziska, Kopidaki, Christina, Liu, Jiafan, Lopata, Lauris, Maire, Nathan, Mansat, Paul, Milenkoski, Martin, Omont, Emmanuel, Özgün, Güneş, Petrović, Mina, Posa, Francesco, Ridel, Morgan, Savini, Giorgio, Torne, Marcel, Trognon, Lucas, Unell, Alyssa, Zavertiaieva, Olena, Karimireddy, Sai Praneeth, Rabbani, Tahseen, Hartley, Mary-Anne, Jaggi, Martin
Data is often impractical to share for a range of well-considered reasons, such as concerns over privacy, intellectual property, and legal constraints. This not only fragments the statistical power of predictive models, but creates an accessibility bias, where accuracy becomes inequitably distributed to those who have the resources to overcome these concerns. We present DISCO: an open-source DIStributed COllaborative learning platform accessible to non-technical users, offering a means to collaboratively build machine learning models without sharing any original data or requiring any programming knowledge. DISCO's web application trains models locally directly in the browser, making our tool cross-platform out-of-the-box, including smartphones. The modular design of DISCO offers choices between federated and decentralized paradigms, various levels of privacy guarantees and several weight aggregation strategies that allow for model personalization and bias resilience in the collaborative training. Code repository is available at https://github.com/epfml/disco and a showcase web interface at https://discolab.ai
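As a generic illustration of weight aggregation in the federated paradigm, the sketch below averages locally trained parameter vectors, weighting each participant by its number of training samples (the familiar federated averaging rule). It does not use or reproduce the DISCO library's API: `weighted_average` and its arguments are hypothetical names, and sample-count weighting is only one of several possible strategies.

```python
def weighted_average(client_weights, client_sizes):
    """Aggregate flat parameter vectors from several participants.

    client_weights: list of equal-length lists of floats, one per participant.
    client_sizes:   number of local training samples per participant
                    (assumed weighting scheme for this sketch).
    Only model parameters are combined; no raw data is exchanged.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one size is required per participant")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two hypothetical participants with 1 and 3 local samples respectively.
aggregated = weighted_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(aggregated)  # [2.5, 3.5]
```

Swapping in a different combination rule, for example a coordinate-wise median, would be one way to trade accuracy for robustness to outlying participants. That choice is outside what this sketch covers.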