responsibility problem
The Responsibility Problem in Neural Networks with Unordered Targets
Hayes, Ben, Saitis, Charalampos, Fazekas, György
We discuss the discontinuities that arise when mapping unordered objects to neural network outputs of fixed permutation, referred to as the responsibility problem. Prior work has proved the existence of the issue by identifying a single discontinuity. Here, we show that discontinuities under such models are uncountably infinite, motivating further research into neural networks for unordered data. The responsibility problem (Zhang et al., 2020b) describes an issue when training neural networks with unordered targets: the fixed permutation of output units requires that each assume a "responsibility" for some element. For feed-forward networks, the worst-case approximation of such discontinuous functions is arbitrarily poor for at least some subset of the input space (Kratsios & Zamanlooy, 2022). Empirically, degraded performance has been observed on set prediction tasks (Zhang et al., 2020a), motivating research into architectures for set generation which circumvent these discontinuities (Zhang et al., 2020a; Kosiorek et al., 2020; Rezatofighi et al., 2018).
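The discontinuity described above can be illustrated numerically. The following is a minimal sketch, not the construction used in the paper: it assumes a set of two antipodal points rotating on the unit circle and picks one arbitrary fixed-order encoding (lexicographic sorting). The function names `rotating_set`, `canonical_order`, and `set_distance` are hypothetical and chosen only for this illustration. As the rotation angle moves in small steps, the set itself changes only slightly, yet the ordered encoding jumps at the angle where the two points exchange output slots.

```python
import numpy as np

def rotating_set(theta):
    """Two antipodal points on the unit circle (illustrative assumption).
    Viewed as a set, the result at theta equals the result at theta + pi."""
    p = np.array([np.cos(theta), np.sin(theta)])
    return np.stack([p, -p])

def canonical_order(points):
    """One arbitrary fixed-order encoding: sort rows by x, then by y."""
    idx = np.lexsort((points[:, 1], points[:, 0]))
    return points[idx]

def set_distance(a, b):
    """Distance between two 2-element sets: the better of the two matchings."""
    direct = np.linalg.norm(a - b)
    swapped = np.linalg.norm(a - b[::-1])
    return min(direct, swapped)

# Sweep half a rotation in small steps.
thetas = np.linspace(0.0, np.pi, 1001)
sets = [rotating_set(t) for t in thetas]
ordered = [canonical_order(s) for s in sets]

# Largest change between consecutive steps, measured on sets and on lists.
max_set_step = max(
    set_distance(sets[i], sets[i + 1]) for i in range(len(sets) - 1)
)
max_list_step = max(
    np.linalg.norm(ordered[i] - ordered[i + 1]) for i in range(len(ordered) - 1)
)

print(f"largest step between consecutive sets:          {max_set_step:.4f}")
print(f"largest step between consecutive ordered lists: {max_list_step:.4f}")
```

In this sketch the set moves by a small amount at every step, while the ordered list changes by a large amount at the step where the points swap slots. Because the set at angle 0 equals the set at angle pi, any encoding that depends only on the set must return to its starting value, so a different choice of ordering would move the jump rather than remove it.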
Learning to Represent and Predict Sets with Deep Neural Networks
In this thesis, we develop various techniques for working with sets in machine learning. Each input or output is not an image or a sequence, but a set: an unordered collection of multiple objects, each object described by a feature vector. Their unordered nature makes them suitable for modeling a wide variety of data, ranging from objects in images to point clouds to graphs. Deep learning has recently shown great success on other types of structured data, so we aim to build the necessary structures for sets into deep neural networks. The first focus of this thesis is the learning of better set representations (sets as input). Existing approaches have bottlenecks that prevent them from properly modeling relations between objects within the set. To address this issue, we develop a variety of techniques for different scenarios and show that alleviating the bottleneck leads to consistent improvements across many experiments. The second focus of this thesis is the prediction of sets (sets as output). Current approaches do not take the unordered nature of sets into account properly. We determine that this causes discontinuity issues in many set prediction tasks and prevents these approaches from learning some extremely simple datasets. To avoid this problem, we develop two models that properly take the structure of sets into account. Various experiments show that our set prediction techniques can offer significant benefits over existing approaches.