Jung, Alexander
Enforcing Fundamental Relations via Adversarial Attacks on Input Parameter Correlations
Saala, Timo, Flek, Lucie, Jung, Alexander, Karimi, Akbar, Schmidt, Alexander, Schott, Matthias, Soldin, Philipp, Wiebusch, Christopher
Correlations between input parameters play a crucial role in many scientific classification tasks, since they are often related to fundamental laws of nature. For example, in high energy physics, a common deep learning use case is the classification of signal and background processes in particle collisions. In many such cases, the fundamental principles governing the correlations between observables are better understood than the actual distributions of the observables themselves. In this work, we present a new adversarial attack algorithm called Random Distribution Shuffle Attack (RDSA), which emphasizes the correlations between observables in the network rather than individual feature characteristics. Correctly applying the proposed attack within adversarial training can yield a significant improvement in classification performance, particularly in the context of data augmentation. Since correlations between input features are also crucial in many other disciplines, we demonstrate the effectiveness of RDSA on six classification tasks, including two particle collision challenges (using CERN Open Data), handwritten digit recognition (MNIST784), human activity recognition (HAR), weather forecasting (Rain in Australia), and ICU patient mortality (MIMIC-IV), demonstrating that this new type of adversarial attack algorithm has general use cases beyond fundamental physics.
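The paper defines RDSA precisely; as a rough intuition for what a distribution-shuffle attack can look like, the following NumPy sketch (all names are illustrative and not taken from the paper) perturbs a dataset so that every per-feature marginal distribution is preserved while the correlations between features are destroyed:

```python
import numpy as np

def distribution_shuffle(X, rng=None):
    """Illustrative correlation-breaking perturbation (not the paper's RDSA).

    Independently permutes each feature column of X across samples, which
    leaves every marginal (per-feature) distribution intact while destroying
    the joint correlations between features.
    """
    rng = np.random.default_rng(rng)
    X_adv = X.copy()
    for j in range(X.shape[1]):
        X_adv[:, j] = rng.permutation(X_adv[:, j])
    return X_adv

# Toy usage: two strongly correlated Gaussian features
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5000)
X_adv = distribution_shuffle(X, rng=1)
print(np.corrcoef(X.T)[0, 1])      # ~0.9 (original correlation)
print(np.corrcoef(X_adv.T)[0, 1])  # ~0.0 (correlations destroyed)
```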
Engineering Trustworthy AI: A Developer Guide for Empirical Risk Minimization
Pfau, Diana, Jung, Alexander
AI systems increasingly shape critical decisions across personal and societal domains. While empirical risk minimization (ERM) drives much of AI's success, it typically prioritizes accuracy over trustworthiness, often resulting in biases, opacity, and other adverse effects. This paper discusses how key requirements for trustworthy AI can be translated into design choices for the components of ERM. We hope to provide actionable guidance for building AI systems that meet emerging standards for trustworthy AI.
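For reference, the ERM design space the abstract refers to can be written in standard textbook notation (not specific to this paper) as

```latex
\hat{h} \;=\; \operatorname*{arg\,min}_{h \in \mathcal{H}}\;
  \frac{1}{m} \sum_{i=1}^{m} L\bigl( y^{(i)}, h\bigl( x^{(i)} \bigr) \bigr),
```

where each component, the hypothesis space \mathcal{H}, the loss function L, and the training data, constitutes a design choice through which trustworthiness requirements can be imposed.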
Networked Federated Multi-Task Learning
SarcheshmehPour, Yasmin, Tian, Yu, Zhang, Linli, Jung, Alexander
Many important application domains generate distributed collections of heterogeneous local datasets. These local datasets are often related via an intrinsic network structure that arises from domain-specific notions of similarity between local datasets. Different notions of similarity are induced by spatiotemporal proximity, statistical dependencies, or functional relations. We use this network structure to adaptively pool similar local datasets into nearly homogeneous training sets for learning tailored models. Our main conceptual contribution is to formulate networked federated learning using the concept of generalized total variation (GTV) minimization as a regularizer. This formulation is highly flexible and can be combined with almost any parametric model, including the Lasso and deep neural networks. We unify and considerably extend some well-known approaches to federated multi-task learning. Our main algorithmic contribution is a novel federated learning algorithm that is well suited for distributed computing environments such as edge computing over wireless networks. This algorithm is robust against model misspecification and numerical errors arising from limited computational resources, including processing time or wireless channel bandwidth. As our main technical contribution, we offer precise conditions on the local models as well as on their network structure such that our algorithm learns nearly optimal local models. Our analysis reveals an interesting interplay between the (information-) geometry of local models and the (cluster-) geometry of their network.
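In generic notation (which may differ from the paper's), GTV-regularized networked federated learning amounts to a problem of the form

```latex
\min_{\{\mathbf{w}^{(i)}\}}\;
  \sum_{i \in \mathcal{V}} L_i\bigl( \mathbf{w}^{(i)} \bigr)
  \;+\; \lambda \sum_{\{i,j\} \in \mathcal{E}} A_{ij}\,
        \phi\bigl( \mathbf{w}^{(i)} - \mathbf{w}^{(j)} \bigr),
```

where L_i is the local empirical risk at node i, the edge weights A_{ij} encode the similarity between local datasets, and \phi is a (typically non-smooth) penalty such as a norm; the GTV term couples the local models so that strongly connected nodes are pooled toward a common model.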
Local Graph Clustering with Network Lasso
Jung, Alexander, SarcheshmehPour, Yasmin
We study the statistical and computational properties of a network Lasso (nLasso) method for local graph clustering. The clusters delivered by nLasso can be characterized elegantly via network flows between cluster boundary and seed nodes. While spectral clustering methods are guided by minimizing the graph Laplacian quadratic form, nLasso minimizes the total variation of cluster indicator signals. As demonstrated theoretically and numerically, nLasso methods can handle very sparse (chain-like) clusters, which are difficult for spectral clustering. We also verify that a primal-dual method for non-smooth optimization allows us to approximate nLasso solutions with an optimal worst-case convergence rate.
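The contrast drawn here can be made explicit in standard notation (with A_{ij} denoting edge weights): spectral methods penalize the Laplacian quadratic form while nLasso penalizes the total variation,

```latex
Q(\mathbf{x}) = \sum_{\{i,j\} \in \mathcal{E}} A_{ij}\,(x_i - x_j)^2
\qquad \text{vs.} \qquad
\|\mathbf{x}\|_{\mathrm{TV}} = \sum_{\{i,j\} \in \mathcal{E}} A_{ij}\,|x_i - x_j|.
```

The absolute-value penalty tolerates a few large jumps across cluster boundaries, which is why TV-based methods can recover sparse, chain-like clusters that the quadratic penalty tends to smooth out.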
Basic Principles of Clustering Methods
Jung, Alexander, Baranov, Ivan
On the Duality between Network Flows and Network Lasso
Jung, Alexander
The data arising in many application domains have an intrinsic network structure. Such network structure is computationally appealing due to the availability of highly scalable graph algorithms. An important class of graph algorithms is related to optimizing network flows. This paper explores the duality between network flow methods and the recently proposed network Lasso. Network Lasso extends the Lasso method from sparse linear models to clustered graph signals. It turns out that the computational and statistical properties of network Lasso crucially depend on the existence of sufficiently large network flows. Using elementary tools from convex analysis, we offer a precise characterization of the duality between network Lasso and a minimum-cost network flow problem. This duality provides a strong link between network Lasso methods and network flow algorithms.
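One way to see the flow interpretation is through the dual representation of the total variation term, a standard dual-norm identity (not the paper's exact statement):

```latex
\sum_{\{i,j\} \in \mathcal{E}} A_{ij}\,|x_i - x_j|
  \;=\; \max_{|f_{ij}| \le A_{ij}}\;
        \sum_{\{i,j\} \in \mathcal{E}} f_{ij}\,(x_i - x_j),
```

where the dual variables f_{ij} behave like flows over the edges and the weights A_{ij} act as capacities; large feasible flows between boundary and seed nodes then certify that the TV penalty can resolve the corresponding clusters.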
Learning Networked Exponential Families with Network Lasso
Jung, Alexander
The data arising in many important big-data applications, ranging from social networks to network medicine, consist of high-dimensional data points related by an intrinsic (complex) network structure. In order to jointly leverage the information conveyed in the network structure as well as the statistical power contained in high-dimensional data points, we propose networked exponential families. We apply the network Lasso to learn networked exponential families as a probabilistic model for heterogeneous datasets with intrinsic network structure. In order to allow for accurate learning from high-dimensional data, we borrow statistical strength, via the intrinsic network structure, across the dataset. The resulting method amounts to regularized empirical risk minimization using the total variation of the model parameters as regularizer. This minimization problem is a non-smooth convex optimization problem which we solve using a primal-dual splitting method. This method is appealing for big data applications as it can be implemented as a highly scalable message passing algorithm.
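In generic exponential-family notation (which may differ in detail from the paper), each node i of the network carries a local model

```latex
p\bigl( \mathbf{z}^{(i)}; \mathbf{w}^{(i)} \bigr)
  \;\propto\; \exp\bigl( \langle \mathbf{w}^{(i)},
      \mathbf{t}( \mathbf{z}^{(i)} ) \rangle \bigr),
```

and the node-wise negative log-likelihoods are summed and regularized by the total variation \sum_{\{i,j\} \in \mathcal{E}} A_{ij} \| \mathbf{w}^{(i)} - \mathbf{w}^{(j)} \| of the parameter field, which is how statistical strength is borrowed across the network.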
Classifying Partially Labeled Networked Data via Logistic Network Lasso
Tran, Nguyen, Ambos, Henrik, Jung, Alexander
We apply the network Lasso to classify partially labeled data points which are characterized by high-dimensional feature vectors. In order to learn an accurate classifier from limited amounts of labeled data, we borrow statistical strength, via an intrinsic network structure, across the dataset. The resulting logistic network Lasso amounts to a regularized empirical risk minimization problem using the total variation of a classifier as a regularizer. This minimization problem is a non-smooth convex optimization problem which we solve using a primal-dual splitting method. This method is appealing for big data applications as it can be implemented as a highly scalable message passing algorithm.
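Concretely, with a set \mathcal{M} of labeled nodes and the logistic loss, the logistic network Lasso takes a form like the following (generic notation, possibly differing from the paper's):

```latex
\min_{\{\mathbf{w}^{(i)}\}}\;
  \sum_{i \in \mathcal{M}}
    \log\bigl( 1 + \exp\bigl( -y^{(i)}\,
      \mathbf{w}^{(i)\top} \mathbf{x}^{(i)} \bigr) \bigr)
  \;+\; \lambda \sum_{\{i,j\} \in \mathcal{E}} A_{ij}\,
        \bigl\| \mathbf{w}^{(i)} - \mathbf{w}^{(j)} \bigr\|_2 .
```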
Localized Linear Regression in Networked Data
Jung, Alexander, Tran, Nguyen
The network Lasso (nLasso) has been proposed recently as an efficient learning algorithm for massive networked datasets (big data over networks). It extends the well-known least absolute shrinkage and selection operator (Lasso) from learning sparse (generalized) linear models to network models. Efficient implementations of the nLasso have been obtained using convex optimization methods. These implementations lend themselves naturally to highly scalable message passing methods. In this paper, we analyze the performance of nLasso when applied to localized linear regression problems involving networked data. Our main result is a sufficient condition on the network structure and available label information such that nLasso accurately learns a localized linear regression model from few labeled data points.
Semi-supervised Learning in Network-Structured Data via Total Variation Minimization
Jung, Alexander, Hero, Alfred O. III, Mara, Alexandru, Jahromi, Saeed, Heimowitz, Ayelet, Eldar, Yonina C.
We propose and analyze a method for semi-supervised learning from partially labeled network-structured data. Our approach is based on a graph signal recovery interpretation under a clustering hypothesis: labels of data points belonging to the same well-connected subset (cluster) are similar valued. This naturally leads to learning the labels by total variation (TV) minimization, which we solve by applying a recently proposed primal-dual method for non-smooth convex optimization. The resulting algorithm allows for a highly scalable implementation using message passing over the underlying empirical graph, which renders it suitable for big data applications. By applying tools of compressed sensing, we derive a sufficient condition on the underlying network structure such that TV minimization recovers clusters in the empirical graph of the data. In particular, we show that the proposed primal-dual method amounts to maximizing network flows over the empirical graph of the dataset. Moreover, the learning accuracy of the proposed algorithm is linked to the set of network flows between data points having known labels. The effectiveness and scalability of our approach are verified by numerical experiments.
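For a scalar graph signal, TV minimization of this kind can be prototyped with a standard Chambolle-Pock primal-dual iteration. The following self-contained NumPy sketch is illustrative only, not the paper's implementation (a scalable message passing version would use sparse operators instead of a dense incidence matrix); it recovers labels on a toy two-cluster graph:

```python
import numpy as np

def tv_semisupervised(edges, weights, labels, n, lam=1.0, iters=500):
    """Primal-dual (Chambolle-Pock) sketch of TV-based label recovery.

    Minimizes  0.5 * sum_{i labeled} (x_i - y_i)^2
             + lam * sum_{edges} w_ij * |x_i - x_j|
    over a scalar signal x on n nodes. `labels` maps node index -> value.
    """
    # Weighted incidence operator D: (D x)_e = w_e * (x_i - x_j)
    E = len(edges)
    D = np.zeros((E, n))
    for e, ((i, j), w) in enumerate(zip(edges, weights)):
        D[e, i], D[e, j] = w, -w
    L = np.linalg.norm(D, 2)     # operator norm of D
    tau = sigma = 0.9 / L        # step sizes with tau * sigma * ||D||^2 < 1

    x = np.zeros(n); x_bar = x.copy(); y = np.zeros(E)
    mask = np.zeros(n, dtype=bool); b = np.zeros(n)
    for i, v in labels.items():
        mask[i], b[i] = True, v

    for _ in range(iters):
        # Dual step: prox of (lam * l1)* is projection onto [-lam, lam]
        y = np.clip(y + sigma * (D @ x_bar), -lam, lam)
        # Primal step: prox of the quadratic data fit on labeled nodes
        x_new = x - tau * (D.T @ y)
        x_new[mask] = (x_new[mask] + tau * b[mask]) / (1.0 + tau)
        x_bar = 2 * x_new - x
        x = x_new
    return x

# Toy graph: two 3-node cliques joined by one weak edge, one seed label each
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
weights = [1, 1, 1, 1, 1, 1, 0.1]
x = tv_semisupervised(edges, weights, labels={0: 1.0, 5: -1.0}, n=6)
print(np.round(x, 2))   # nodes 0-2 near +1, nodes 3-5 near -1
```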