AITopics | Zügner, Daniel

Collaborating Authors

Zügner, Daniel

MatterGen: a generative model for inorganic materials design

Zeni, Claudio, Pinsler, Robert, Zügner, Daniel, Fowler, Andrew, Horton, Matthew, Fu, Xiang, Shysheya, Sasha, Crabbé, Jonathan, Sun, Lixin, Smith, Jake, Nguyen, Bichlien, Schulz, Hannes, Lewis, Sarah, Huang, Chin-Wei, Lu, Ziheng, Zhou, Yichi, Yang, Han, Hao, Hongxia, Li, Jielan, Tomioka, Ryota, Xie, Tian

arXiv.org Artificial Intelligence, Jan-29-2024

The design of functional materials with desired properties is essential in driving technological advances in areas like energy storage, catalysis, and carbon capture. Generative models provide a new paradigm for materials design by directly generating entirely novel materials given desired property constraints. Despite recent progress, current generative models have a low success rate in proposing stable crystals, or can only satisfy a very limited set of property constraints. Here, we present MatterGen, a model that generates stable, diverse inorganic materials across the periodic table and can further be fine-tuned to steer the generation towards a broad range of property constraints. To enable this, we introduce a new diffusion-based generative process that produces crystalline structures by gradually refining atom types, coordinates, and the periodic lattice. We further introduce adapter modules to enable fine-tuning towards any given property constraints with a labeled dataset. Compared to prior generative models, structures produced by MatterGen are more than twice as likely to be novel and stable, and more than 15 times closer to the local energy minimum. After fine-tuning, MatterGen successfully generates stable, novel materials with desired chemistry and symmetry, as well as desired mechanical, electronic, and magnetic properties. Finally, we demonstrate multi-property materials design capabilities by proposing structures that have both high magnetic density and a chemical composition with low supply-chain risk. We believe that the quality of generated materials and the breadth of MatterGen's capabilities represent a major advancement towards creating a universal generative model for materials design.

machine learning, mattergen, natural language, (21 more...)

2312.03687

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Industry:

Energy (1.00)
Materials (0.93)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Adversarial Training for Graph Neural Networks: Pitfalls, Solutions, and New Directions

Gosch, Lukas, Geisler, Simon, Sturm, Daniel, Charpentier, Bertrand, Zügner, Daniel, Günnemann, Stephan

arXiv.org Artificial Intelligence, Dec-2-2023

Despite its success in the image domain, adversarial training did not (yet) stand out as an effective defense for Graph Neural Networks (GNNs) against graph structure perturbations. In the pursuit of fixing adversarial training (1) we show and overcome fundamental theoretical as well as practical limitations of the adopted graph learning setting in prior work; (2) we reveal that more flexible GNNs based on learnable graph diffusion are able to adjust to adversarial perturbations, while the learned message passing scheme is naturally interpretable; (3) we introduce the first attack for structure perturbations that, while targeting multiple nodes at once, is capable of handling global (graph-level) as well as local (node-level) constraints. Including these contributions, we demonstrate that adversarial training is a state-of-the-art defense against adversarial structure perturbations.

adversarial training, artificial intelligence, machine learning, (16 more...)

2306.15427

Country: Europe > Germany (0.14)

Genre: Research Report (0.81)

Industry:

Government (0.68)
Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Training Differentially Private Graph Neural Networks with Random Walk Sampling

Ayle, Morgane, Schuchardt, Jan, Gosch, Lukas, Zügner, Daniel, Günnemann, Stephan

arXiv.org Artificial Intelligence, Jan-2-2023

Deep learning models are known to put the privacy of their training data at risk, which poses challenges for their safe and ethical release to the public. Differentially private stochastic gradient descent is the de facto standard for training neural networks without leaking sensitive information about the training data. However, applying it to models for graph-structured data poses a novel challenge: unlike with i.i.d. data, sensitive information about a node in a graph can leak not only through its own gradients, but also through the gradients of all nodes within a larger neighborhood. In practice, this limits privacy-preserving deep learning on graphs to very shallow graph neural networks. We propose to solve this issue by training graph neural networks on disjoint subgraphs of a given training graph. We develop three random-walk-based methods for generating such disjoint subgraphs and perform a careful analysis of the data-generating distributions to provide strong privacy guarantees. Through extensive experiments, we show that our method greatly outperforms the state-of-the-art baseline on three large graphs, and matches or outperforms it on four smaller ones.
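The idea of splitting a training graph into disjoint subgraphs with random walks can be illustrated with a small sketch. This is not one of the three methods from the paper, whose details the abstract does not give; it is a hypothetical simplification in which each walk may only step onto nodes that no earlier walk has used. The function name `disjoint_random_walk_subgraphs` and its parameters are assumptions made for illustration.

```python
import random


def disjoint_random_walk_subgraphs(adjacency, walk_length, seed=0):
    """Partition nodes into node-disjoint random walks (illustrative only).

    adjacency: dict mapping each node to a list of neighbouring nodes.
    walk_length: maximum number of nodes per walk.
    """
    rng = random.Random(seed)
    unused = set(adjacency)
    starts = sorted(adjacency)
    rng.shuffle(starts)  # reproducible random order of start nodes

    subgraphs = []
    for start in starts:
        if start not in unused:
            continue  # already claimed by an earlier walk
        walk = [start]
        unused.discard(start)
        current = start
        while len(walk) < walk_length:
            # Only step onto nodes no walk has visited, so walks stay disjoint.
            candidates = [n for n in adjacency[current] if n in unused]
            if not candidates:
                break
            current = rng.choice(candidates)
            walk.append(current)
            unused.discard(current)
        subgraphs.append(walk)
    return subgraphs


# Toy example: a ring of 10 nodes.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
walks = disjoint_random_walk_subgraphs(ring, walk_length=3)
```

Because every node belongs to exactly one walk, a change to a single node can affect only one subgraph in this toy setting. The privacy analysis itself is the paper's contribution and is not reproduced here.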

artificial intelligence, machine learning, node, (16 more...)

2301.00738

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Monte Carlo EM for Deep Time Series Anomaly Detection

Aubet, François-Xavier, Zügner, Daniel, Gasthaus, Jan

arXiv.org Machine Learning, Dec-29-2021

Time series data are often corrupted by outliers or other kinds of anomalies. Identifying the anomalous points can be a goal on its own (anomaly detection), or a means to improving performance of other time series tasks (e.g. forecasting). Recent deep-learning-based approaches to anomaly detection and forecasting commonly assume that the proportion of anomalies in the training data is small enough to ignore, and treat the unlabeled data as coming from the nominal data distribution. We present a simple yet effective technique for augmenting existing time series models so that they explicitly account for anomalies in the training data. By augmenting the training data with a latent anomaly indicator variable whose distribution is inferred while training the underlying model using Monte Carlo EM, our method simultaneously infers anomalous points while improving model performance on nominal data. We demonstrate the effectiveness of the approach by combining it with a simple feed-forward forecasting model. We investigate how anomalies in the train set affect the training of forecasting models, which are commonly used for time series anomaly detection, and show that our method improves the training of the model.
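A latent anomaly indicator inferred with Monte Carlo EM can be sketched on a toy problem. The example below is an assumption-laden illustration, not the paper's method: it replaces the deep forecasting model with a single Gaussian, uses a fixed flat density for anomalies, and all names (`monte_carlo_em`, `anomaly_density`, `anomaly_prior`) are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def monte_carlo_em(data, anomaly_density, anomaly_prior=0.1,
                   iterations=20, samples=20, seed=0):
    """Fit a Gaussian to the nominal points while sampling anomaly indicators."""
    rng = random.Random(seed)
    # Initialise from all data, i.e. treating every point as nominal.
    mu = sum(data) / len(data)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))

    for _ in range(iterations):
        # Monte Carlo E-step: sample each indicator from its posterior and
        # count how often the point was drawn as nominal.
        nominal_counts = []
        for x in data:
            p_nominal = (1.0 - anomaly_prior) * normal_pdf(x, mu, sigma)
            p_anomaly = anomaly_prior * anomaly_density
            post_anomaly = p_anomaly / (p_nominal + p_anomaly)
            draws = sum(rng.random() >= post_anomaly for _ in range(samples))
            nominal_counts.append(draws)

        total = sum(nominal_counts)
        if total == 0:
            break  # every draw was flagged anomalous; keep the last estimate

        # M-step: refit the nominal model using the sampled indicators as weights.
        mu = sum(c * x for c, x in zip(nominal_counts, data)) / total
        var = sum(c * (x - mu) ** 2 for c, x in zip(nominal_counts, data)) / total
        sigma = max(math.sqrt(var), 1e-6)
    return mu, sigma


# Toy data: 50 nominal points spread over [-1, 1] plus 5 anomalies at 20.
nominal = [-1.0 + 2.0 * i / 49 for i in range(50)]
data = nominal + [20.0] * 5
naive_mean = sum(data) / len(data)
mu, sigma = monte_carlo_em(data, anomaly_density=0.01)
```

In this toy setting the naive mean is pulled towards the anomalies, while the estimate from the sampled indicators stays close to the centre of the nominal points.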

artificial intelligence, data mining, machine learning, (16 more...)

2112.14436

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Robustness of Graph Neural Networks at Scale

Geisler, Simon, Schmidt, Tobias, Şirin, Hakan, Zügner, Daniel, Bojchevski, Aleksandar, Günnemann, Stephan

arXiv.org Machine Learning, Oct-26-2021

Graph Neural Networks (GNNs) are increasingly important given their popularity and the diversity of applications. Yet, existing studies of their vulnerability to adversarial attacks rely on relatively small graphs. We address this gap and study how to attack and defend GNNs at scale. We propose two sparsity-aware first-order optimization attacks that maintain an efficient representation despite optimizing over a number of parameters which is quadratic in the number of nodes. We show that common surrogate losses are not well-suited for global attacks on GNNs. Our alternatives can double the attack strength. Moreover, to improve GNNs' reliability we design a robust aggregation function, Soft Median, resulting in an effective defense at all scales. We evaluate our attacks and defense with standard GNNs on graphs more than 100 times larger compared to previous work.

artificial intelligence, machine learning, optimization problem, (19 more...)

2110.14038

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.67)
Government > Military (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Natural Posterior Network: Deep Bayesian Predictive Uncertainty for Exponential Family Distributions

Charpentier, Bertrand, Borchert, Oliver, Zügner, Daniel, Geisler, Simon, Günnemann, Stephan

arXiv.org Machine Learning, May-10-2021

Uncertainty awareness is crucial to develop reliable machine learning models. In this work, we propose the Natural Posterior Network (NatPN) for fast and high-quality uncertainty estimation for any task where the target distribution belongs to the exponential family. Thus, NatPN finds application for both classification and general regression settings. Unlike many previous approaches, NatPN does not require out-of-distribution (OOD) data at training time. Instead, it leverages Normalizing Flows to fit a single density on a learned low-dimensional and task-dependent latent space. For any input sample, NatPN uses the predicted likelihood to perform a Bayesian update over the target distribution. Theoretically, NatPN assigns high uncertainty far away from training data. Empirically, our extensive experiments on calibration and OOD detection show that NatPN delivers highly competitive performance for classification, regression and count prediction tasks.

deep learning, natpn, neural network, (19 more...)

2105.04471

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Reliable Graph Neural Networks via Robust Aggregation

Geisler, Simon, Zügner, Daniel, Günnemann, Stephan

arXiv.org Machine Learning, Oct-29-2020

Perturbations targeting the graph structure have proven to be extremely effective in reducing the performance of Graph Neural Networks (GNNs), and traditional defenses such as adversarial training do not seem to be able to improve robustness. This work is motivated by the observation that adversarially injected edges can effectively be viewed as additional samples to a node's neighborhood aggregation function, which results in distorted aggregations accumulating over the layers. Conventional GNN aggregation functions, such as a sum or mean, can be distorted arbitrarily by a single outlier. We propose a robust aggregation function motivated by the field of robust statistics. Our approach exhibits the largest possible breakdown point of 0.5, which means that the bias of the aggregation is bounded as long as the fraction of adversarial edges of a node is less than 50%. Our novel aggregation function, Soft Medoid, is a fully differentiable generalization of the Medoid and therefore lends itself well to end-to-end deep learning. Equipping a GNN with our aggregation improves the robustness with respect to structure perturbations on Cora ML by a factor of 3 (and 5.5 on Citeseer) and by a factor of 8 for low-degree nodes.
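The contrast between a mean and a medoid-style aggregation can be shown with a minimal sketch. It assumes an unweighted neighbourhood and a softmax over negative summed distances with a temperature parameter; the paper's actual Soft Medoid may differ in details such as edge weighting, so the function below should be read as an illustration only. The names `mean_aggregate` and `soft_medoid` are hypothetical.

```python
import math


def mean_aggregate(points):
    n = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]


def soft_medoid(points, temperature=1.0):
    """Weighted average that favours points close to all other points."""
    # Total Euclidean distance from each point to every point in the set.
    totals = [sum(math.dist(p, q) for q in points) for p in points]
    # Softmax over negative totals; subtracting the minimum keeps exp stable.
    smallest = min(totals)
    exps = [math.exp(-(t - smallest) / temperature) for t in totals]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) for d in range(dim)]


# Four neighbour embeddings in the unit square plus one adversarial outlier.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1000.0, 1000.0]]
plain = mean_aggregate(points)
robust = soft_medoid(points)
```

Here the single outlier drags the mean far outside the unit square, whereas the softmax weights assign it a vanishing share, so the aggregate stays among the clean points. As the temperature shrinks the weights concentrate on the point with the smallest total distance, which is the medoid.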

deep learning, neural network, robustness, (18 more...)

2010.15651

Country: Europe > Germany (0.14)

Genre: Research Report (1.00)

Industry:

Government (0.68)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Evaluating Robustness of Predictive Uncertainty Estimation: Are Dirichlet-based Models Reliable?

Kopetzki, Anna-Kathrin, Charpentier, Bertrand, Zügner, Daniel, Giri, Sandhya, Günnemann, Stephan

arXiv.org Machine Learning, Oct-28-2020

Robustness to adversarial perturbations and accurate uncertainty estimation are crucial for reliable application of deep learning in real world settings. Dirichlet-based uncertainty (DBU) models are a family of models that predict the parameters of a Dirichlet distribution (instead of a categorical one) and promise to signal when not to trust their predictions. Untrustworthy predictions are obtained on unknown or ambiguous samples and marked with a high uncertainty by the models. In this work, we show that DBU models with standard training are not robust w.r.t. three important tasks in the field of uncertainty estimation. In particular, we evaluate how useful the uncertainty estimates are to (1) indicate correctly classified samples, and (2) to detect adversarial examples that try to fool classification. We further evaluate the reliability of DBU models on the task of (3) distinguishing between in-distribution (ID) and out-of-distribution (OOD) data. To this end, we present the first study of certifiable robustness for DBU models. Furthermore, we propose novel uncertainty attacks that fool models into assigning high confidence to OOD data and low confidence to ID data, respectively. Based on our results, we explore the first approaches to make DBU models more robust. We use adversarial training procedures based on label attacks, uncertainty attacks, or random noise and demonstrate how they affect robustness of DBU models on ID data and OOD data.

deep learning, neural network, ood data, (18 more...)

2010.14986

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (0.46)
Transportation (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Posterior Network: Uncertainty Estimation without OOD Samples via Density-Based Pseudo-Counts

Charpentier, Bertrand, Zügner, Daniel, Günnemann, Stephan

arXiv.org Machine Learning, Oct-22-2020

Accurate estimation of aleatoric and epistemic uncertainty is crucial to build safe and reliable systems. Traditional approaches, such as dropout and ensemble methods, estimate uncertainty by sampling probability predictions from different submodels, which leads to slow uncertainty estimation at inference time. Recent works address this drawback by directly predicting parameters of prior distributions over the probability predictions with a neural network. While this approach has demonstrated accurate uncertainty estimation, it requires defining arbitrary target parameters for in-distribution data and makes the unrealistic assumption that out-of-distribution (OOD) data is known at training time. In this work we propose the Posterior Network (PostNet), which uses Normalizing Flows to predict an individual closed-form posterior distribution over predicted probabilities for any input sample. The posterior distributions learned by PostNet accurately reflect uncertainty for in- and out-of-distribution data, without requiring access to OOD data at training time. PostNet achieves state-of-the-art results in OOD detection and in uncertainty calibration under dataset shifts.

dataset, deep learning, neural network, (18 more...)

2006.09239

Country:

North America > United States (0.14)
North America > Canada (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Oktoberfest Food Dataset

Ziller, Alexander, Hansjakob, Julius, Rusinov, Vitalii, Zügner, Daniel, Vogel, Peter, Günnemann, Stephan

arXiv.org Machine Learning, Nov-22-2019

We release a realistic, diverse, and challenging dataset for object detection on images. The data was recorded at a beer tent in Germany and consists of 15 different categories of food and drink items. We created more than 2,500 object annotations by hand for 1,110 images captured by a video camera above the checkout. We further make available the remaining 600GB of (unlabeled) data containing days of footage. Additionally, we provide our trained models as a benchmark. Possible applications include automated checkout systems which could significantly speed up the process.

artificial intelligence, neural network, object detection, (17 more...)

1912.05007

Country: Europe > Germany (0.37)

Genre: Research Report (0.84)

Industry: Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.43)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)
