DISCO: Adversarial Defense with Local Implicit Functions

Neural Information Processing Systems

In this section, we ablate the kernel size used to train DISCO on ImageNet. Table I shows that s = 3 achieves the best performance, which degrades for s = 5 by a significant margin (3.26%). This is consistent with the well-known complexity of synthesizing images with global models, such as GANs. For a single ImageNet image of size 224, STL requires 23.71 seconds while DISCO (K=1) requires only 0.027 seconds. In this section, we list the URLs used for training and evaluating DISCO.
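As a quick sanity check on the reported timings, the gap between STL (23.71 s per image) and DISCO with K=1 (0.027 s) can be expressed as a speedup factor; `speedup` is an illustrative helper, not part of the DISCO codebase:

```python
def speedup(baseline_s: float, method_s: float) -> float:
    """Return how many times faster `method_s` is than `baseline_s`."""
    return baseline_s / method_s

# Reported per-image times on a 224x224 ImageNet input.
factor = speedup(23.71, 0.027)  # roughly an 878x speedup
```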



81e793dc8317a3dbc3534ed3f242c418-Supplemental.pdf

Neural Information Processing Systems

Leveraging the model-based nature of DisCo, we can also readily compute an ε/c_min-optimal policy for any cost-sensitive shortest-path problem defined on the L-controllable states with minimum cost c_min.


DISCO: Adversarial Defense with Local Implicit Functions

Neural Information Processing Systems

The problem of adversarial defenses for image classification, where the goal is to robustify a classifier against adversarial examples, is considered. Inspired by the hypothesis that these examples lie beyond the natural image manifold, a novel aDversarIal defenSe with local impliCit functiOns (DISCO) is proposed to remove adversarial perturbations by localized manifold projections. DISCO consumes an adversarial image and a query pixel location and outputs a clean RGB value at the location. It is implemented with an encoder and a local implicit module, where the former produces per-pixel deep features and the latter uses the features in the neighborhood of the query pixel for predicting the clean RGB value. Extensive experiments demonstrate that both DISCO and its cascade version outperform prior defenses, regardless of whether the defense is known to the attacker. DISCO is also shown to be data and parameter efficient and to mount defenses that transfer across datasets, classifiers and attacks.
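The local implicit query described above can be sketched as follows. This is a minimal illustration with NumPy: `query_clean_rgb`, the 3x3 neighborhood, and the toy two-layer MLP weights are all assumptions for the sake of the sketch, not the authors' implementation:

```python
import numpy as np

def query_clean_rgb(features, y, x, w1, w2):
    """Toy local implicit module: predict a clean RGB value at query
    pixel (y, x) from the 3x3 neighborhood of per-pixel encoder
    features. `features` has shape (H, W, C); w1, w2 form a small MLP."""
    H, W, C = features.shape
    patch = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), H - 1)  # clamp at image borders
            xx = min(max(x + dx, 0), W - 1)
            patch.append(features[yy, xx])
    z = np.concatenate(patch)                # (9 * C,) local code
    h = np.maximum(z @ w1, 0.0)              # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ w2)))  # RGB squashed into [0, 1]

# Illustrative usage with random features and weights.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 8, 4))
w1 = rng.normal(size=(36, 16))
w2 = rng.normal(size=(16, 3))
rgb = query_clean_rgb(feats, 3, 3, w1, w2)
```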


Improved Sample Complexity for Incremental Autonomous Exploration in MDPs

Neural Information Processing Systems

We study the problem of exploring an unknown environment when no reward function is provided to the agent. Building on the incremental exploration setting introduced by Lim and Auer (2012), we define the objective of learning the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps (in expectation) from a reference state $s_0$. In this paper, we introduce a novel model-based approach that interleaves discovering new states from $s_0$ and improving the accuracy of a model estimate that is used to compute goal-conditioned policies.
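The set of states reachable within L steps can be illustrated with a deterministic stand-in. This is a hedged sketch: the paper's setting is stochastic with reachability in expectation, while here `transitions` is an assumed adjacency map and reachability is a plain bounded-depth BFS:

```python
from collections import deque

def reachable_within_L(transitions, s0, L):
    """Deterministic stand-in for L-step reachability: the set of
    states reachable from s0 in at most L transitions.
    `transitions` maps a state to a list of successor states."""
    dist = {s0: 0}
    queue = deque([s0])
    while queue:
        s = queue.popleft()
        if dist[s] == L:
            continue  # budget exhausted on this branch
        for t in transitions.get(s, []):
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return set(dist)

# Small example: state 4 needs 3 steps, so it is outside the L=2 set.
states = reachable_within_L({0: [1, 2], 1: [3], 3: [4]}, 0, 2)
```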


DISCO: A Browser-Based Privacy-Preserving Framework for Distributed Collaborative Learning

Vignoud, Julien T. T., Rousset, Valérian, Guedj, Hugo El, Aleman, Ignacio, Bennaceur, Walid, Derinbay, Batuhan Faik, Ďurech, Eduard, Gengler, Damien, Giordano, Lucas, Grimberg, Felix, Lippoldt, Franziska, Kopidaki, Christina, Liu, Jiafan, Lopata, Lauris, Maire, Nathan, Mansat, Paul, Milenkoski, Martin, Omont, Emmanuel, Özgün, Güneş, Petrović, Mina, Posa, Francesco, Ridel, Morgan, Savini, Giorgio, Torne, Marcel, Trognon, Lucas, Unell, Alyssa, Zavertiaieva, Olena, Karimireddy, Sai Praneeth, Rabbani, Tahseen, Hartley, Mary-Anne, Jaggi, Martin

arXiv.org Artificial Intelligence

Data is often impractical to share for a range of well-considered reasons, such as concerns over privacy, intellectual property, and legal constraints. This not only fragments the statistical power of predictive models, but creates an accessibility bias, where accuracy becomes inequitably distributed to those who have the resources to overcome these concerns. We present DISCO: an open-source DIStributed COllaborative learning platform accessible to non-technical users, offering a means to collaboratively build machine learning models without sharing any original data or requiring any programming knowledge. DISCO's web application trains models locally directly in the browser, making our tool cross-platform out-of-the-box, including smartphones. The modular design of DISCO offers choices between federated and decentralized paradigms, various levels of privacy guarantees and several weight aggregation strategies that allow for model personalization and bias resilience in collaborative training. The code repository is available at https://github.com/epfml/disco and a showcase web interface at https://discolab.ai
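One weight aggregation strategy such a platform could offer is weighted federated averaging in the style of FedAvg. The sketch below (with assumed flat per-client weight vectors) is illustrative only and not DISCO's actual implementation:

```python
def federated_average(client_weights, client_sizes):
    """Weighted federated averaging (FedAvg-style): combine per-client
    weight vectors, weighting each client by its local dataset size.
    `client_weights`: one flat list of floats per client;
    `client_sizes`: number of local training examples per client."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    avg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        share = n / total  # this client's fraction of all data
        for i in range(dim):
            avg[i] += w[i] * share
    return avg

# Two clients; the second holds 3x as much data, so it dominates.
combined = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```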


DISCO: Diversifying Sample Condensation for Efficient Model Evaluation

Rubinstein, Alexander, Raible, Benjamin, Gubri, Martin, Oh, Seong Joon

arXiv.org Artificial Intelligence

Evaluating modern machine learning models has become prohibitively expensive. Benchmarks such as LMMs-Eval and HELM demand thousands of GPU hours per model. Costly evaluation reduces inclusivity, slows the cycle of innovation, and worsens environmental impact. The typical approach follows two steps. First, select an anchor subset of data. Second, train a mapping from the accuracy on this subset to the final test result. The drawback is that anchor selection depends on clustering, which can be complex and sensitive to design choices. We argue that promoting diversity among samples is not essential; what matters is to select samples that $\textit{maximise diversity in model responses}$. Our method, $\textbf{Diversifying Sample Condensation (DISCO)}$, selects the top-k samples with the greatest model disagreements. This uses greedy, sample-wise statistics rather than global clustering. The approach is conceptually simpler. From a theoretical view, inter-model disagreement provides an information-theoretically optimal rule for such greedy selection. $\textbf{DISCO}$ shows empirical gains over prior methods, achieving state-of-the-art results in performance prediction across MMLU, Hellaswag, Winogrande, and ARC. Code is available here: https://github.com/arubique/disco-public.
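The greedy top-k selection described above can be sketched as follows. The disagreement score used here (number of distinct labels the models predict per sample) is a simplified stand-in for the paper's information-theoretic criterion; `disco_select` is an illustrative name:

```python
def disco_select(model_preds, k):
    """Greedy sketch of disagreement-based anchor selection: score each
    sample by how many distinct labels the models predict for it, then
    keep the k highest-scoring samples (ties broken by sample index).
    `model_preds[m][i]` is model m's predicted label on sample i."""
    n = len(model_preds[0])
    scores = [len({preds[i] for preds in model_preds}) for i in range(n)]
    # Stable sort keeps lower indices first among equal scores.
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]

# Three models, four samples: all models agree on samples 0 and 3,
# so the two disagreement-heavy samples 1 and 2 are selected.
anchors = disco_select([[0, 1, 1, 2], [0, 1, 2, 2], [0, 0, 1, 2]], 2)
```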


LPI-RIT at LeWiDi-2025: Improving Distributional Predictions via Metadata and Loss Reweighting with DisCo

Sawkar, Mandira, Shetty, Samay U., Pandita, Deepak, Weerasooriya, Tharindu Cyril, Homan, Christopher M.

arXiv.org Artificial Intelligence

The Learning With Disagreements (LeWiDi) 2025 shared task aims to model annotator disagreement through soft label distribution prediction and perspectivist evaluation, which focuses on modeling individual annotators. We adapt DisCo (Distribution from Context), a neural architecture that jointly models item-level and annotator-level label distributions, and present detailed analysis and improvements. In this paper, we extend DisCo with annotator metadata embeddings, enhanced input representations, and multi-objective training losses to better capture disagreement patterns. Through extensive experiments, we demonstrate substantial improvements in both soft and perspectivist evaluation metrics across three datasets. We also conduct in-depth calibration and error analyses that reveal when and why disagreement-aware modeling improves. Our findings show that disagreement can be better captured by conditioning on annotator demographics and by optimizing directly for distributional metrics, yielding consistent improvements across datasets.
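Optimizing directly for a distributional metric can be illustrated with a cross-entropy against the empirical annotator label distribution. This is a hedged sketch of the general idea, not the shared-task code or DisCo's actual loss:

```python
import math

def soft_label_cross_entropy(pred_probs, annotator_labels, num_classes):
    """Cross-entropy between a model's predicted class distribution and
    the empirical soft label formed from individual annotator votes.
    `pred_probs` is a length-`num_classes` probability vector;
    `annotator_labels` holds one class index per annotator."""
    counts = [0] * num_classes
    for lab in annotator_labels:
        counts[lab] += 1
    soft = [c / len(annotator_labels) for c in counts]  # empirical dist.
    return -sum(s * math.log(max(p, 1e-12))             # clip for log(0)
                for s, p in zip(soft, pred_probs))

# Two annotators say class 0, one says class 1; against a uniform
# prediction the loss reduces to the entropy denominator log(2).
loss = soft_label_cross_entropy([0.5, 0.5], [0, 0, 1], 2)
```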