


9bc99c590be3511b8d53741684ef574c-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for the insightful comments. Due to space limitations, we only discuss the major comments below. … This example is shown in Fig. (a) below. … This has been shown for ECE (e.g., Sec. 3 of [i]), pointed out by … To further understand this, in Sec. D.2 we evaluate the performance of all … D.1 due to its adaptive binning scheme (see …). We will update Sec. D.1 as follows: before giving the fooling example, we highlight that ECE is not a proper … We were not able to finish the OOD experiments on time and have to leave them to future work.
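The feedback above repeatedly refers to the Expected Calibration Error (ECE) and its binning scheme. For readers unfamiliar with the metric, a minimal sketch of the standard equal-width-binned ECE; the function name, bin count, and toy numbers are illustrative, and this is not the adaptive binning scheme the authors discuss:

```python
import numpy as np

def ece(confidences, correct, n_bins=15):
    """Equal-width-binned Expected Calibration Error.

    confidences: predicted probability of the predicted class, shape (N,)
    correct:     1 if the prediction was right, else 0, shape (N,)
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = len(confidences)
    err = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            # Gap between mean confidence and accuracy inside the bin,
            # weighted by the fraction of samples the bin contains.
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            err += mask.sum() / total * gap
    return err

# A calibrated predictor: 90% accuracy at 90% confidence gives ECE 0.
conf = np.array([0.9] * 10)
corr = np.array([1] * 9 + [0])
print(round(ece(conf, corr), 6))  # -> 0.0
```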



… be appealing (R1), theoretically insightful (R2) …

Neural Information Processing Systems

We thank the reviewers for their insightful feedback. Can you evaluate on additional in-distribution datasets? We will definitely include these results in the final version. How to distinguish the methodological differences w.r.t. prior work? The idea of using the energy score for OOD detection is novel and theoretically motivated.
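As background for the energy-score discussion, a minimal sketch of the score itself, E(x) = -T · logsumexp(f(x)/T) over the class logits; the temperature T = 1 and the toy logits are illustrative, and this is not the authors' code:

```python
import numpy as np

def energy_score(logits, T=1.0):
    """Energy score E(x) = -T * logsumexp(f(x) / T) over the class logits.

    A confident (in-distribution) input produces one large logit, which
    drives the energy down; near-uniform logits keep it high, which is
    why thresholding on energy can separate ID from OOD inputs.
    """
    z = np.asarray(logits) / T
    m = z.max(axis=-1, keepdims=True)          # stabilise the exponentials
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -T * lse

id_logits  = np.array([9.0, 0.5, 0.3])   # confident prediction
ood_logits = np.array([0.4, 0.5, 0.3])   # near-uniform logits

print(energy_score(id_logits) < energy_score(ood_logits))  # -> True
```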





phepy: Visual Benchmarks and Improvements for Out-of-Distribution Detectors

Tyree, Juniper, Rupp, Andreas, Clusius, Petri S., Boy, Michael H.

arXiv.org Artificial Intelligence

Applying machine learning to increasingly high-dimensional problems with sparse or biased training data increases the risk that a model is used on inputs outside its training domain. For such out-of-distribution (OOD) inputs, the model can no longer make valid predictions, and its error is potentially unbounded. Testing OOD detection methods on real-world datasets is complicated by the ambiguity around which inputs are in-distribution (ID) or OOD. We design a benchmark for OOD detection, which includes three novel and easily-visualisable toy examples. These simple examples provide direct and intuitive insight into whether the detector is able to detect (1) linear and (2) non-linear concepts and (3) identify thin ID subspaces (needles) within high-dimensional spaces (haystacks). We use our benchmark to evaluate the performance of various methods from the literature. Since tactile examples of OOD inputs may benefit OOD detection, we also review several simple methods to synthesise OOD inputs for supervised training. We introduce two improvements, $t$-poking and OOD sample weighting, to make supervised detectors more precise at the ID-OOD boundary. This is especially important when conflicts between real ID and synthetic OOD samples blur the decision boundary. Finally, we provide recommendations for constructing and applying out-of-distribution detectors in machine learning.
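The pipeline the abstract describes, synthesising OOD inputs to train a supervised detector, can be sketched in a toy "needle in a haystack" setting. Everything below (the band width, noise scale, and down-weighting rule) is illustrative; this is not the paper's $t$-poking or weighting scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ID data: a thin horizontal band (a "needle") in 2D.
x_id = np.column_stack([rng.uniform(-1.0, 1.0, 500),
                        rng.normal(0.0, 0.05, 500)])

# Naive OOD synthesis: jitter ID samples with large isotropic noise.
x_ood = x_id + rng.normal(0.0, 1.0, x_id.shape)

# Some synthetic "OOD" points fall back inside the ID band; giving them
# full weight would blur the decision boundary, so down-weight them
# (the general motivation behind OOD sample weighting).
inside = np.abs(x_ood[:, 1]) < 0.15
w_ood = np.where(inside, 0.1, 1.0)

# Supervised detector training set: label ID = 0, OOD = 1, with weights.
X = np.vstack([x_id, x_ood])
y = np.concatenate([np.zeros(len(x_id)), np.ones(len(x_ood))])
w = np.concatenate([np.ones(len(x_id)), w_ood])
```

Any weighted binary classifier trained on `(X, y, w)` then acts as the OOD detector; the weights reduce the influence of the conflicting synthetic samples near the ID region.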


The Unreasonable Effectiveness of Guidance for Diffusion Models

Kaiser, Tim, Adaloglou, Nikolas, Kollmann, Markus

arXiv.org Artificial Intelligence

Guidance is an error-correcting technique used to improve the perceptual quality of images generated by diffusion models. Typically, the correction is achieved by linear extrapolation, using an auxiliary diffusion model that has lower performance than the primary model. Using a 2D toy example, we show that guidance is highly beneficial when the auxiliary model exhibits errors similar to those of the primary model, but stronger. We verify this finding in higher dimensions, where we show that competitive generative performance to state-of-the-art guidance methods can be achieved when the auxiliary model differs from the primary one only by having stronger weight regularization. As an independent contribution, we investigate whether upweighting long-range spatial dependencies improves visual fidelity. The result is a novel guidance method, which we call sliding window guidance (SWG), that guides the primary model with itself by constraining its receptive field. Intriguingly, SWG aligns better with human preferences than state-of-the-art guidance methods while requiring neither training, architectural modifications, nor class conditioning. The code will be released.
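The "linear extrapolation" correction the abstract refers to fits in a few lines. The sketch below uses generic guidance notation (noise predictions and a guidance scale w, both illustrative); the auxiliary model plays whichever role the abstract describes, a weaker network or SWG's receptive-field-constrained copy of the primary model:

```python
import numpy as np

def guided_prediction(eps_primary, eps_auxiliary, w=2.0):
    """Guidance by linear extrapolation at one denoising step:
    push the primary model's prediction away from the weaker auxiliary
    model's prediction, amplifying whatever the primary model corrects."""
    return eps_auxiliary + w * (eps_primary - eps_auxiliary)

e_p = np.array([1.0, -0.5])   # primary model's noise prediction
e_a = np.array([0.8, -0.2])   # auxiliary (weaker) model's prediction

# w = 1 recovers the primary model exactly; w > 1 extrapolates past it.
print(guided_prediction(e_p, e_a, w=1.0))
print(guided_prediction(e_p, e_a, w=2.0))
```

This is why the abstract's finding matters: the extrapolation direction `eps_primary - eps_auxiliary` only points at the primary model's residual error when the auxiliary model makes similar errors, just stronger.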


Reviews: Learning Latent Subspaces in Variational Autoencoders

Neural Information Processing Systems

Updated (due to rebuttal & discussion w/ R2): The authors reiterate in their rebuttal their core contributions of "extracting information beyond binary labels" and "attribute manipulation from a single image", together with the promise to clarify it in the paper. The contributions are relevant to the community, since this form of hierarchical disentangling seems novel. That said, there is some degree of similarity of the proposed variational approach to IGN (Deep Convolutional Inverse Graphics Network https://arxiv.org/abs/1503.03167). IGN is cited, but not discussed in detail, and an empirical comparison is not provided, despite being applicable to the current setting as well. Nevertheless, since the selling point of the paper seems to be the ability to discover sub-categories from only category labels, which is not addressed in IGN and is an interesting empirical find, I increased my score to be marginally above the acceptance threshold.


Reviews: Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models

Neural Information Processing Systems

This paper proposes an approach to estimate causal relationships between two variables X and Y when properties of the mechanism change across the dataset. The authors propose an extension of the non-linear additive noise model [Hoyer et al. 2009] to the case of a mixture of a finite number of non-linear additive noise models, coined the Additive Noise Model Mixture Model (ANM-MM). The authors propose a theoretical identifiability result based on the proof of [Hoyer et al. 2009], then provide an estimation algorithm based on Gaussian Process Partially Observable Models (GPPOM), introduced as a generalization of Gaussian Process Latent Variable Models (GPLVM). Comparisons of the approach to baselines for causal inference and clustering are provided on real and simulated data. The problem addressed in this paper is definitely interesting. While some of the experimental results are promising, the theoretical and empirical analyses provide only a limited understanding of the approach, which is rather complex, and in particular of its strengths and limitations.
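The additive-noise-model principle of [Hoyer et al. 2009] that this review builds on can be illustrated in the bivariate case (a single mechanism, not the paper's mixture): residuals of the regression are independent of the input only in the true causal direction. A proper implementation would use a kernel independence test such as HSIC; the cheap residual statistic, regression degree, and toy data below are only illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)

# Ground truth: X -> Y with Y = f(X) + N and N independent of X.
x = rng.uniform(-2.0, 2.0, 2000)
y = x ** 3 + rng.normal(0.0, 1.0, x.size)

def residual_dependence(cause, effect, deg=5):
    """Regress effect on cause (polynomial fit) and measure how strongly
    the residual magnitude depends on the input. Under an additive noise
    model the residuals are independent of the cause, so this statistic
    is small only in the true causal direction. (A real ANM test would
    use HSIC; corr(|cause|, residual^2) is a crude proxy.)"""
    coef = np.polyfit(cause, effect, deg)
    resid = effect - np.polyval(coef, cause)
    return abs(np.corrcoef(np.abs(cause), resid ** 2)[0, 1])

forward  = residual_dependence(x, y)   # fit Y ~ f(X): residuals look like N
backward = residual_dependence(y, x)   # fit X ~ g(Y): residual spread varies with Y
print(forward < backward)  # -> True: infer X -> Y
```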