Goto

Collaborating Authors: Weber, Pascal


DISCO: Internal Evaluation of Density-Based Clustering

arXiv.org Machine Learning

In density-based clustering, clusters are areas of high object density separated by lower object density areas. This notion supports arbitrarily shaped clusters and automatic detection of noise points that do not belong to any cluster. However, it is challenging to adequately evaluate the quality of density-based clustering results. Even though some existing cluster validity indices (CVIs) target arbitrarily shaped clusters, none of them captures the quality of the labeled noise. In this paper, we propose DISCO, a Density-based Internal Score for Clustering Outcomes, which is the first CVI that also evaluates the quality of noise labels. DISCO reliably evaluates density-based clusters of arbitrary shape by assessing compactness and separation. It also introduces a direct assessment of noise labels for any given clustering. Our experiments show that DISCO evaluates density-based clusterings more consistently than its competitors. It is additionally the first method to evaluate the complete labeling of density-based clustering methods, including noise labels.
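
To make the kind of quantities such a density-based CVI works with more concrete, here is a minimal Python sketch that scores a labeling by a compactness/separation ratio built from k-nearest-neighbor core distances. It is not the DISCO index from the paper: the function name density_based_score, the choice k=5, the silhouette-style combination of the two terms, and the decision to simply exclude noise points are all illustrative assumptions.

# Illustrative sketch of a density-based internal score in the spirit of
# compactness/separation assessment for arbitrarily shaped clusters.
# NOTE: this is NOT the DISCO index; the formulas here are simplifications.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def density_based_score(X, labels, k=5):
    """Higher is better. Noise points (label == -1) are simply excluded
    here; DISCO's dedicated noise assessment is defined in the paper."""
    X = np.asarray(X)
    labels = np.asarray(labels)
    # Core distance: distance to the k-th nearest neighbor (density proxy).
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dists, _ = nn.kneighbors(X)
    core = dists[:, -1]

    scores = []
    for c in set(labels) - {-1}:
        members = labels == c
        others = (labels != c) & (labels != -1)
        if others.sum() == 0 or members.sum() < 2:
            continue
        # Compactness proxy: mean core distance inside the cluster.
        compact = core[members].mean()
        # Separation proxy: mean distance from members to the nearest
        # point of any other cluster.
        sep_nn = NearestNeighbors(n_neighbors=1).fit(X[others])
        sep = sep_nn.kneighbors(X[members])[0].mean()
        scores.append((sep - compact) / max(sep, compact))
    return float(np.mean(scores)) if scores else float("nan")

Applied to, say, a DBSCAN labeling, this rewards dense, well-separated clusters; the actual DISCO index differs in how compactness, separation, and in particular the noise labels are evaluated.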


I Want 'Em All (At Once) -- Ultrametric Cluster Hierarchies

arXiv.org Artificial Intelligence

Hierarchical clustering is a powerful tool for exploratory data analysis, organizing data into a tree of clusterings from which a partition can be chosen. This paper generalizes these ideas by proving that, for any reasonable hierarchy, one can optimally solve any center-based clustering objective (such as $k$-means) over it. Moreover, these solutions can be found exceedingly quickly and are themselves necessarily hierarchical. Thus, given a cluster tree, we show that one can quickly access a plethora of new, equally meaningful hierarchies. Just as in standard hierarchical clustering, one can then choose any desired partition from these new hierarchies. We conclude by verifying the utility of our proposed techniques across datasets, hierarchies, and partitioning schemes.
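
The following sketch only illustrates the general idea of reading a center-based partition off a cluster tree; it assumes SciPy average linkage as the hierarchy and SSE (the k-means cost) as the objective, and it splits greedily, whereas the paper's algorithm is exact and covers a broader family of objectives. The names tree_partition and sse are illustrative.

# Illustrative sketch: extracting a k-part partition from a cluster tree.
# Assumptions: SciPy average linkage as the hierarchy, SSE as the objective,
# and greedy top-down splitting (a simplification, not the paper's method).
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree

def sse(X, idx):
    pts = X[idx]
    return float(((pts - pts.mean(axis=0)) ** 2).sum())

def tree_partition(X, k):
    root = to_tree(linkage(X, method="average"))
    # Each entry: (tree node, indices of the observations it covers).
    parts = [(root, np.array(root.pre_order()))]
    while len(parts) < k:
        best = None
        for i, (node, idx) in enumerate(parts):
            if node.is_leaf():
                continue
            l = np.array(node.get_left().pre_order())
            r = np.array(node.get_right().pre_order())
            gain = sse(X, idx) - sse(X, l) - sse(X, r)
            if best is None or gain > best[0]:
                best = (gain, i, node, l, r)
        if best is None:  # nothing left to split
            break
        _, i, node, l, r = best
        parts[i:i + 1] = [(node.get_left(), l), (node.get_right(), r)]
    labels = np.empty(len(X), dtype=int)
    for c, (_, idx) in enumerate(parts):
        labels[idx] = c
    return labels

Because every part is a node of the tree, the partitions produced for growing k are nested in one another, i.e., they again form a hierarchy, which is the kind of property the paper exploits.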


SHADE: Deep Density-based Clustering

arXiv.org Artificial Intelligence

Detecting arbitrarily shaped clusters in high-dimensional noisy data is challenging for current clustering methods. We introduce SHADE (Structure-preserving High-dimensional Analysis with Density-based Exploration), the first deep clustering algorithm that incorporates density-connectivity into its loss function. Similar to existing deep clustering algorithms, SHADE supports high-dimensional and large data sets with the expressive power of a deep autoencoder. In contrast to most existing deep clustering methods that rely on a centroid-based clustering objective, SHADE incorporates a novel loss function that captures density-connectivity. SHADE thereby learns a representation that enhances the separation of density-connected clusters. SHADE detects a stable clustering and noise points fully automatically without any user input. It outperforms existing methods in clustering quality, especially on data that contain non-Gaussian clusters, such as video data. Moreover, the embedded space of SHADE is suitable for visualization and interpretation of the clustering results as the individual shapes of the clusters are preserved.
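
As a rough illustration of how a density-related term can enter an autoencoder objective, the sketch below adds a surrogate loss that keeps k-nearest neighbors from the input space close in the embedding. This is not SHADE's density-connectivity loss; the architecture, the fixed neighborhood graph, and the parameters k and lam are assumptions made only for this example.

# Illustrative sketch: autoencoder with a density-connectivity-inspired
# surrogate term (pulling input-space neighbors together in the embedding).
# NOT SHADE's actual loss; all hyperparameters here are placeholders.
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

class AE(nn.Module):
    def __init__(self, d_in, d_emb=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                 nn.Linear(256, d_emb))
        self.dec = nn.Sequential(nn.Linear(d_emb, 256), nn.ReLU(),
                                 nn.Linear(256, d_in))
    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

def train(X, k=10, lam=0.1, epochs=50, lr=1e-3):
    X = torch.as_tensor(X, dtype=torch.float32)
    # Fixed neighborhood graph in input space (a simplification).
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X.numpy())
    idx = torch.as_tensor(nbrs.kneighbors(X.numpy())[1][:, 1:])
    model = AE(X.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        z, recon = model(X)
        rec_loss = ((recon - X) ** 2).mean()
        # Connectivity surrogate: embeddings of neighbors should stay close.
        conn_loss = ((z.unsqueeze(1) - z[idx]) ** 2).mean()
        loss = rec_loss + lam * conn_loss
        opt.zero_grad(); loss.backward(); opt.step()
    return model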


Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence

The discovery of individual objectives in the collective behavior of complex dynamical systems, such as fish schools and bacterial colonies, is a long-standing challenge. Inverse reinforcement learning is a potent approach for addressing this challenge, but its applicability to dynamical systems involving continuous state-action spaces and multiple interacting agents has been limited. In this study, we tackle this challenge by introducing an off-policy inverse multi-agent reinforcement learning algorithm (IMARL). Our approach combines the ReF-ER technique with guided cost learning. By leveraging demonstrations, our algorithm automatically uncovers the reward function and learns an effective policy for the agents. Through extensive experimentation, we demonstrate that the learned policy captures the behavior observed in the provided data and achieves promising results across problem domains, including single-agent models in the OpenAI Gym and multi-agent models of schooling behavior. The present study shows that the proposed IMARL algorithm is a significant step towards understanding collective dynamics from the perspective of its constituents, and showcases its value as a tool for studying complex physical systems exhibiting collective behavior.
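
To give a sense of the guided-cost-learning component such an approach builds on, the sketch below trains a reward network to score demonstrated state-action pairs above policy-generated ones, in the style of maximum-entropy IRL. Everything here is an illustrative assumption: toy random arrays stand in for real trajectories, importance weights are dropped, and ReF-ER, the policy update, and the multi-agent machinery of IMARL are omitted.

# Illustrative sketch of a guided-cost-learning style reward update:
# a reward network trained to rank demonstration transitions above
# policy-generated ones. Toy data; not the IMARL algorithm itself.
import torch
import torch.nn as nn

def reward_update_step(reward_net, opt, demo_sa, policy_sa):
    """One maximum-entropy IRL style update: demonstrations are 'positive',
    policy samples approximate the background distribution."""
    r_demo = reward_net(demo_sa)
    r_pol = reward_net(policy_sa)
    # With importance weights, logsumexp(r_pol) - log N would estimate the
    # log partition function; they are omitted in this sketch.
    loss = (-r_demo.mean()
            + torch.logsumexp(r_pol, dim=0).squeeze()
            - torch.log(torch.tensor(float(len(policy_sa)))))
    opt.zero_grad(); loss.backward(); opt.step()
    return float(loss)

state_dim, action_dim = 8, 2
reward_net = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.Tanh(),
                           nn.Linear(64, 1))
opt = torch.optim.Adam(reward_net.parameters(), lr=1e-3)
demo_sa = torch.randn(256, state_dim + action_dim)    # demonstration (s, a)
policy_sa = torch.randn(256, state_dim + action_dim)  # current-policy (s, a)
for _ in range(100):
    reward_update_step(reward_net, opt, demo_sa, policy_sa)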


Deep Clustering With Consensus Representations

arXiv.org Artificial Intelligence

The field of deep clustering combines deep learning and clustering to learn representations that improve both the representation itself and the performance of the considered clustering method. Most existing deep clustering methods are designed for a single clustering method, e.g., k-means, spectral clustering, or Gaussian mixture models, but it is well known that no clustering algorithm works best in all circumstances. Consensus clustering tries to alleviate the individual weaknesses of clustering algorithms by building a consensus between the members of a clustering ensemble. Currently, there is no deep clustering method that can include multiple heterogeneous clustering algorithms in an ensemble to update representations and clusterings together. To close this gap, we introduce the idea of a consensus representation that maximizes the agreement between ensemble members. Further, we propose DECCS (Deep Embedded Clustering with Consensus representationS), a deep consensus clustering method that learns a consensus representation by enhancing the embedded space to such a degree that all ensemble members agree on a common clustering result. Our contributions are the following: (1) We introduce the idea of learning consensus representations for heterogeneous clusterings, a novel notion to approach consensus clustering. (2) We propose DECCS, the first deep clustering method that jointly improves the representation and clustering results of multiple heterogeneous clustering algorithms. (3) We show in experiments that learning a consensus representation with DECCS outperforms several relevant baselines from deep clustering and consensus clustering. Our code can be found at https://gitlab.cs.univie.ac.at/lukas/deccs
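
The agreement that a consensus representation tries to maximize can be made concrete with a co-association matrix: each ensemble member votes on whether two points belong together, and the averaged votes measure how far the ensemble is from a single shared clustering. The sketch below computes this on raw features with off-the-shelf scikit-learn clusterers; the chosen clusterers, n_clusters, and the disagreement measure are assumptions for illustration, and the representation learning that DECCS performs jointly with the ensemble is not shown.

# Illustrative sketch: agreement of a heterogeneous clustering ensemble via
# a co-association matrix. DECCS additionally learns the embedding, which
# is not part of this example.
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

ensemble_labels = [
    KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X),
    SpectralClustering(n_clusters=4, random_state=0).fit_predict(X),
    AgglomerativeClustering(n_clusters=4).fit_predict(X),
    GaussianMixture(n_components=4, random_state=0).fit(X).predict(X),
]

# Co-association: fraction of ensemble members putting i and j in one cluster.
n = len(X)
coassoc = np.zeros((n, n))
for labels in ensemble_labels:
    coassoc += (labels[:, None] == labels[None, :]).astype(float)
coassoc /= len(ensemble_labels)

# Perfect consensus would make every entry 0 or 1; the mean distance from
# those extremes is a simple disagreement measure.
disagreement = np.minimum(coassoc, 1 - coassoc).mean()
print(f"ensemble disagreement: {disagreement:.3f}")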


Learning swimming escape patterns under energy constraints

arXiv.org Artificial Intelligence

Aquatic organisms involved in predator-prey interactions perform impressive feats of fluid manipulation to enhance their chances of survival [1-8]. Since early studies where prey fish were reported to rapidly accelerate from rest by bending into a C-shape and unfurling their tail [9-12], impulsive locomotion patterns have been the subject of intense investigation. Studying escape strategies of prey fish has led to the discovery of sensing mechanisms [13-15], dedicated neural circuits [16-19], and biomechanical principles [20, 21]. From the perspective of hydrodynamics, several studies have sought to understand the C-start escape response and how it imparts momentum to the surrounding fluid [22-27]. However, experiments and observations indicate that swimming escapes can take a variety of forms. For example, after the initial burst from rest, many fish are seen coasting instead of swimming continuously [11, 28, 29]. Furthermore, theoretical [30-32] as well as experimental [33] studies have suggested that intermittent swimming styles, termed burst-coast swimming, can be more efficient than continuous swimming when maximizing distance given a fixed amount of energy. This raises the question of when and why different swimming escape patterns are employed in nature, and which biophysical cost functions they optimize. Given a cost function, reverse engineering methodologies have been employed to identify links to resulting swimming patterns, e.g.