DISCO Nets : DISsimilarity COefficients Networks
Bouchacourt, Diane, Mudigonda, Pawan K., Nowozin, Sebastian
We present a new type of probabilistic model which we call DISsimilarity COefficient Networks (DISCO Nets). DISCO Nets allow us to efficiently sample from a posterior distribution parametrised by a neural network. During training, DISCO Nets are learned by minimising the dissimilarity coefficient between the true distribution and the estimated distribution. This allows us to tailor the training to the loss related to the task at hand. We empirically show that (i) by modeling uncertainty on the output value, DISCO Nets outperform equivalent non-probabilistic predictive networks and (ii) DISCO Nets accurately model the uncertainty of the output, outperforming existing probabilistic models based on deep neural networks.
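To make the objective concrete, the dissimilarity coefficient the abstract refers to can be estimated purely from samples of the network. The formulation below follows the energy-score-style form of such objectives; the notation ($\Delta$ for the task loss, $G$ for the network, $z_k$ for noise draws, $\gamma$ for the diversity weight) is ours and kept deliberately generic rather than copied from the paper.

```latex
% Sample-based estimate of the dissimilarity coefficient for a minibatch
% {(x_n, y_n)}, n = 1..N, with K noise draws z_k per input and predictions
% \hat{y}_n^k = G(z_k, x_n):
\mathcal{L}(\theta) =
\frac{1}{NK}\sum_{n=1}^{N}\sum_{k=1}^{K} \Delta\bigl(y_n,\hat{y}_n^{k}\bigr)
\;-\; \frac{\gamma}{2}\cdot\frac{1}{N K (K-1)}
\sum_{n=1}^{N}\sum_{k=1}^{K}\sum_{k'\neq k} \Delta\bigl(\hat{y}_n^{k},\hat{y}_n^{k'}\bigr)
% The first term pulls samples towards the ground truth under the task loss
% \Delta; the second rewards diversity among the samples for the same input,
% and \gamma = 1 recovers an energy-score-style objective when \Delta is a norm.
```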
Reviews: DISCO Nets : DISsimilarity COefficients Networks
This paper introduces a method for solving a general class of structured prediction problems. The method trains a neural network to construct an output as a deterministic function of the real input and a sample from some noise source. Entropy in the noise source becomes entropy in the output distribution. Mismatch between the model distribution and true predictive distribution is measured using a strictly proper scoring rule, a la Gneiting and Raftery (JASA 2007). One thing that concerns me about the proposed approach is whether the "expected score" that's used for measuring dissimilarity between the model predictions and the true predictive distribution provides a strong learning signal. Especially in the minibatch setting, I'd be worried about variance in the gradient wiping out information about subtle mismatch between the model and true distributions.
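As a rough illustration of the mechanism the reviewer describes (the output is a deterministic function of the input plus a noise sample, trained against a proper-scoring-rule-style loss), here is a minimal PyTorch-style sketch. The generator interface `G(x, z)`, the noise dimension, and the Euclidean choice of Δ are assumptions for illustration, not the paper's exact setup.

```python
import torch

def disco_minibatch_loss(G, x, y, num_samples=10, noise_dim=20, gamma=1.0):
    """Energy-score-style minibatch loss for a DISCO-Net-like generator.

    G : callable mapping (inputs, noise) -> predictions, batched (assumed interface)
    x : (N, d_in) minibatch of inputs; y : (N, d_out) ground-truth outputs
    With gamma = 1 and a Euclidean Delta this is a sample estimate of the
    (negative) energy score averaged over the minibatch.
    """
    n = x.shape[0]
    # Draw K noise vectors per input and generate K candidate outputs each.
    z = torch.randn(n, num_samples, noise_dim, device=x.device)
    x_rep = x.unsqueeze(1).expand(-1, num_samples, -1)                     # (N, K, d_in)
    y_hat = G(x_rep.reshape(n * num_samples, -1),
              z.reshape(n * num_samples, -1)).reshape(n, num_samples, -1)  # (N, K, d_out)

    # Data term: mean distance between each sample and the ground truth.
    data_term = (y_hat - y.unsqueeze(1)).norm(dim=-1).mean()

    # Diversity term: mean pairwise distance between samples for the same input
    # (the diagonal k = k' contributes zero, so summing over all pairs is safe).
    pairwise = (y_hat.unsqueeze(2) - y_hat.unsqueeze(1)).norm(dim=-1)      # (N, K, K)
    diversity = pairwise.sum() / (n * num_samples * (num_samples - 1))

    return data_term - 0.5 * gamma * diversity
```

Both terms are Monte-Carlo averages over the K noise draws, so increasing K reduces exactly the gradient variance the reviewer worries about, at the cost of more forward passes per minibatch.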
Reviews: A Probabilistic U-Net for Segmentation of Ambiguous Images
Post rebuttal: The authors have responded well to the issues raised, and I champion publication of this work. Main idea: Use a conditional variational auto-encoder to produce well-calibrated segmentation hypotheses for a given input. Strengths: The application is well motivated, and the experiments are convincing and state of the art. Weaknesses: The manuscript is a little vague in its positioning relative to prior work. While relevant prior work is cited, the reader is left with some ambiguity and, if not familiar with this prior work, might be misled to think that there is methodological innovation beyond the specifics of architecture and application.
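For readers unfamiliar with the setup, the sketch below shows, under assumptions, how such a conditional-VAE segmentation model can produce multiple hypotheses at test time: a prior net maps the image to a low-dimensional latent distribution, each latent sample is broadcast and fused with the U-Net feature map, and a small head decodes a segmentation. Module names, sizes, and the fusion head are illustrative assumptions; the training-time posterior net and KL term are omitted.

```python
import torch
import torch.nn as nn

class ProbUNetSampler(nn.Module):
    """Illustrative sampling path of a Probabilistic-U-Net-style model.

    The real model also has a posterior net and a KL term used during training;
    this sketch only shows how several segmentation hypotheses are drawn at
    test time. All module names and sizes are assumptions for illustration.
    """
    def __init__(self, unet, prior_net, latent_dim=6, feat_ch=32, num_classes=2):
        super().__init__()
        self.unet = unet            # backbone: image -> feature map (N, feat_ch, H, W)
        self.prior_net = prior_net  # image -> (mu, logvar) of the latent prior
        self.head = nn.Sequential(  # fuses features with the broadcast latent
            nn.Conv2d(feat_ch + latent_dim, feat_ch, kernel_size=1),
            nn.ReLU(),
            nn.Conv2d(feat_ch, num_classes, kernel_size=1),
        )

    def forward(self, image, num_hypotheses=4):
        feats = self.unet(image)               # (N, feat_ch, H, W)
        mu, logvar = self.prior_net(image)     # each (N, latent_dim)
        hypotheses = []
        for _ in range(num_hypotheses):
            # Reparameterised sample from the image-conditioned prior.
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            z_map = z[:, :, None, None].expand(-1, -1, *feats.shape[2:])
            hypotheses.append(self.head(torch.cat([feats, z_map], dim=1)))
        return hypotheses                      # list of (N, num_classes, H, W) logits
```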
Sample-based Uncertainty Quantification with a Single Deterministic Neural Network
Kanazawa, Takuya, Gupta, Chetan
Development of an accurate, flexible, and numerically efficient uncertainty quantification (UQ) method is one of the fundamental challenges in machine learning. Previously, a UQ method called DISCO Nets has been proposed (Bouchacourt et al., 2016), which trains a neural network by minimizing the energy score. In this method, a random noise vector in $\mathbb{R}^{10\text{--}100}$ is concatenated with the original input vector in order to produce a diverse ensemble forecast despite using a single neural network. While this method has shown promising performance on a hand pose estimation task in computer vision, it has remained unexplored whether it works as well for regression on tabular data, and how it compares with more recent advanced UQ methods such as NGBoost. In this paper, we propose an improved neural architecture of DISCO Nets that admits faster and more stable training while only using a compact noise vector of dimension $\sim \mathcal{O}(1)$. We benchmark this approach on a variety of real-world tabular datasets and confirm that it is competitive with or even superior to standard UQ baselines. Moreover, we observe that it exhibits better point-forecast performance than a neural network of the same size trained with the conventional mean squared error. As another advantage of the proposed method, we show that local feature importance computation methods such as SHAP can be easily applied to any subregion of the predictive distribution. A new elementary proof of the validity of using the energy score to learn predictive distributions is also provided.
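To illustrate the tabular-regression use described here, the hedged NumPy sketch below draws an ensemble forecast for one row by re-sampling a compact noise vector concatenated with the features, and evaluates it with a sample estimate of the univariate energy score. The `predict_fn` interface, noise dimension, and sample counts are assumptions, not the authors' implementation.

```python
import numpy as np

def sample_forecasts(predict_fn, x_row, num_samples=100, noise_dim=2, rng=None):
    """Draw an ensemble forecast for a single tabular row by re-sampling the
    compact noise vector that gets concatenated with the features.
    `predict_fn(features) -> float` is an assumed interface to the trained net."""
    rng = np.random.default_rng() if rng is None else rng
    return np.array([
        predict_fn(np.concatenate([x_row, rng.standard_normal(noise_dim)]))
        for _ in range(num_samples)
    ])

def energy_score_1d(samples, y_true):
    """Unbiased sample estimate of the univariate energy score
    E|S - y| - 0.5 * E|S - S'| (lower is better)."""
    samples = np.asarray(samples, dtype=float)
    k = len(samples)
    term1 = np.abs(samples - y_true).mean()
    term2 = np.abs(samples[:, None] - samples[None, :]).sum() / (k * (k - 1))
    return term1 - 0.5 * term2

# Point forecast and a central 90% interval come from the same ensemble, e.g.:
# samples = sample_forecasts(model_predict, x_row)
# median = np.median(samples); lo, hi = np.quantile(samples, [0.05, 0.95])
```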