Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds
Will, Justus C., Jenney, Andrea M., Lamb, Kara D., Pritchard, Michael S., Kaul, Colleen, Ma, Po-Lun, Pressel, Kyle, Shpund, Jacob, van Lier-Walqui, Marcus, Mandt, Stephan
Thorough analysis of local droplet-level interactions is crucial to better understand the microphysical processes in clouds and their effect on the global climate. High-accuracy simulations of relevant droplet size distributions from Large Eddy Simulations (LES) of bin microphysics challenge current analysis techniques due to their high dimensionality involving three spatial dimensions, time, and a continuous range of droplet sizes. Utilizing the compact latent representations from Variational Autoencoders (VAEs), we produce novel and intuitive visualizations for the organization of droplet sizes and their evolution over time beyond what is possible with clustering techniques. This greatly improves interpretation and allows us to examine aerosol-cloud interactions by contrasting simulations with different aerosol concentrations. We find that the evolution of the droplet spectrum is similar across aerosol levels but occurs at different paces. This similarity suggests that precipitation initiation processes are alike despite variations in onset times.
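For intuition, here is a minimal sketch of the kind of latent-space visualization described above, assuming each simulation grid cell provides a binned droplet-size spectrum treated as a flat vector. The architecture sizes, data shapes, and the two-dimensional latent are illustrative assumptions, not the authors' configuration.

```python
# Minimal VAE sketch for embedding binned droplet-size spectra into a 2-D
# latent space for visualization. Illustrative only: architecture, data
# shapes, and training details are assumptions, not the paper's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectrumVAE(nn.Module):
    def __init__(self, n_bins=33, latent_dim=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_bins, 128), nn.ReLU(),
                                 nn.Linear(128, 2 * latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_bins))

    def forward(self, x):
        mu, log_var = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)  # reparameterization
        return self.dec(z), mu, log_var

def elbo_loss(x, recon, mu, log_var):
    # Gaussian reconstruction term plus KL divergence to the standard normal prior.
    rec = F.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return rec + kl

# Toy usage: random "spectra" stand in for LES grid cells. After training,
# the 2-D posterior means `mu` can be scatter-plotted and colored by time or
# aerosol concentration to visualize the evolution of the droplet spectrum.
model = SpectrumVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(256, 33)
for _ in range(100):
    recon, mu, log_var = model(x)
    loss = elbo_loss(x, recon, mu, log_var)
    opt.zero_grad(); loss.backward(); opt.step()
```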
Estimating the Rate-Distortion Function by Wasserstein Gradient Descent
Yang, Yibo, Eckstein, Stephan, Nutz, Marcel, Mandt, Stephan
In the theory of lossy compression, the rate-distortion (R-D) function $R(D)$ describes how much a data source can be compressed (in bit-rate) at any given level of fidelity (distortion). Obtaining $R(D)$ for a given data source establishes the fundamental performance limit for all compression algorithms. We propose a new method to estimate $R(D)$ from the perspective of optimal transport. Unlike the classic Blahut--Arimoto algorithm, which fixes the support of the reproduction distribution in advance, our Wasserstein gradient descent algorithm learns the support of the optimal reproduction distribution by moving particles. We prove its local convergence and analyze the sample complexity of our R-D estimator based on a connection to entropic optimal transport. Experimentally, we obtain comparable or tighter bounds than state-of-the-art neural network methods on low-rate sources while requiring considerably less tuning and computational effort. We also highlight a connection to maximum-likelihood deconvolution and introduce a new class of sources that can be used as test cases with known solutions to the R-D problem.
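To make the particle-based idea concrete, the sketch below runs gradient descent on particle locations for the rate-distortion Lagrangian of a toy Gaussian source. The uniform particle weights, squared-error distortion, optimizer choice, and sample sizes are assumptions for illustration; this is not the authors' implementation.

```python
# Particle-based sketch of estimating one point on the rate-distortion curve.
# For a fixed Lagrange multiplier lam, minimizing
#   F = E_x[ -log (1/m) sum_j exp(-lam * d(x, y_j)) ]
# over particle locations y_j yields a distortion D and a rate R = F - lam * D
# that upper-bounds R(D).
import torch

torch.manual_seed(0)
n, m, lam = 2000, 64, 2.0                    # samples, particles, Lagrange multiplier
x = torch.randn(n, 1)                        # samples from a standard Gaussian source
y = torch.randn(m, 1, requires_grad=True)    # reproduction particles
opt = torch.optim.Adam([y], lr=1e-2)

def lagrangian(x, y):
    d = (x[:, None, :] - y[None, :, :]).pow(2).sum(-1)   # pairwise squared-error distortion
    return -torch.logsumexp(-lam * d, dim=1).mean() + torch.log(torch.tensor(float(m))), d

for step in range(2000):
    loss, _ = lagrangian(x, y)
    opt.zero_grad(); loss.backward(); opt.step()          # moves the particles

with torch.no_grad():
    F_val, d = lagrangian(x, y)
    q = torch.softmax(-lam * d, dim=1)        # induced conditional Q(j | x_i)
    D = (q * d).sum(1).mean()                 # distortion of the induced coupling
    R = F_val - lam * D                       # rate (in nats), upper-bounds R(D)
    print(f"D = {D.item():.3f}, R = {R.item():.3f} nats")
```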
A Complete Recipe for Diffusion Generative Models
Pandey, Kushagra, Mandt, Stephan
Score-based Generative Models (SGMs) have demonstrated exceptional synthesis outcomes across various tasks. However, the current design landscape of the forward diffusion process remains largely unexplored and often relies on physical heuristics or simplifying assumptions. Utilizing insights from the development of scalable Bayesian posterior samplers, we present a complete recipe for formulating forward processes in SGMs, ensuring convergence to the desired target distribution. Our approach reveals that several existing SGMs can be seen as specific manifestations of our framework. Building upon this method, we introduce Phase Space Langevin Diffusion (PSLD), which relies on score-based modeling within an augmented space enriched by auxiliary variables akin to physical phase space. Empirical results show the superior sample quality and improved speed-quality trade-off of PSLD compared to various competing approaches on established image synthesis benchmarks. Remarkably, PSLD achieves sample quality comparable to state-of-the-art SGMs (FID: 2.10 for unconditional CIFAR-10 generation). Lastly, we demonstrate the applicability of PSLD in conditional synthesis using pre-trained score networks, offering an appealing alternative as an SGM backbone for future advancements. Code and model checkpoints can be accessed at \url{https://github.com/mandt-lab/PSLD}.
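The sketch below illustrates the general form of forward process behind the recipe: a linear SDE dz = -(D + Q) z dt + sqrt(2 D) dW with D positive semi-definite and Q skew-symmetric converges to a standard Gaussian in an augmented (data, momentum) space. The particular D, Q, and integration settings here are illustrative assumptions, not the PSLD parameterization from the paper.

```python
# Sketch of a "complete recipe" forward SDE with a phase-space-like augmentation:
# with H(z) = ||z||^2 / 2, any drift -(D + Q) z (D PSD, Q skew-symmetric) plus
# noise sqrt(2 D) dW has N(0, I) as its stationary distribution.
import numpy as np

rng = np.random.default_rng(0)
gamma, beta = 1.0, 4.0                         # friction and coupling (assumed values)
D = np.array([[0.0, 0.0], [0.0, gamma]])       # PSD diffusion: noise enters via momentum only
Q = np.array([[0.0, -beta], [beta, 0.0]])      # skew-symmetric coupling of data and momentum
A = D + Q
sqrt2D = np.sqrt(2.0 * np.diag(D))

dt, T = 1e-2, 10.0
z = np.stack([rng.uniform(-1, 1, 10000), np.zeros(10000)], axis=1)  # "data" x, momentum m = 0

for _ in range(int(T / dt)):                   # Euler--Maruyama discretization
    drift = -z @ A.T
    noise = rng.standard_normal(z.shape) * sqrt2D
    z = z + drift * dt + np.sqrt(dt) * noise

# The forward marginal approaches N(0, I) regardless of the initial data distribution.
print("mean:", z.mean(0), "\ncov:\n", np.cov(z.T))
```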
Efficient Integrators for Diffusion Generative Models
Pandey, Kushagra, Rudolph, Maja, Mandt, Stephan
Diffusion models suffer from slow sample generation at inference time. Therefore, developing a principled framework for fast deterministic/stochastic sampling for a broader class of diffusion models is a promising direction. We propose two complementary frameworks for accelerating sample generation in pre-trained models: Conjugate Integrators and Splitting Integrators. Conjugate integrators generalize DDIM, mapping the reverse diffusion dynamics to a more amenable space for sampling. In contrast, splitting-based integrators, commonly used in molecular dynamics, reduce the numerical simulation error by cleverly alternating between numerical updates involving the data and auxiliary variables. After extensively studying these methods empirically and theoretically, we present a hybrid method that leads to the best-reported performance for diffusion models in augmented spaces. Applied to Phase Space Langevin Diffusion [Pandey & Mandt, 2023] on CIFAR-10, our deterministic and stochastic samplers achieve FID scores of 2.11 and 2.36 in only 100 network function evaluations (NFE) as compared to 2.57 and 2.63 for the best-performing baselines, respectively. Our code and model checkpoints will be made publicly available at \url{https://github.com/mandt-lab/PSLD}.
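As a point of reference, the snippet below writes out the plain deterministic DDIM update that conjugate integrators generalize; the noise schedule and the placeholder denoiser are assumptions for illustration, and this is not the paper's hybrid sampler.

```python
# The deterministic DDIM update (VP/DDPM noise schedule). `eps_model` is a
# placeholder for a pre-trained noise-prediction network -- an assumption here.
import torch

def ddim_step(x_t, t, t_prev, alpha_bar, eps_model):
    """One deterministic DDIM step from timestep t to t_prev (t_prev < t)."""
    a_t, a_prev = alpha_bar[t], alpha_bar[t_prev]
    eps = eps_model(x_t, t)                                          # predicted noise
    x0_pred = (x_t - torch.sqrt(1 - a_t) * eps) / torch.sqrt(a_t)    # predicted clean sample
    return torch.sqrt(a_prev) * x0_pred + torch.sqrt(1 - a_prev) * eps

# Toy usage with a dummy denoiser and a linear-beta schedule.
T = 1000
betas = torch.linspace(1e-4, 2e-2, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)
eps_model = lambda x, t: torch.zeros_like(x)                         # stand-in network
x = torch.randn(4, 3, 32, 32)
ts = list(range(T - 1, 0, -100)) + [0]                               # e.g. 999, 899, ..., 99, 0
for t, t_prev in zip(ts[:-1], ts[1:]):
    x = ddim_step(x, t, t_prev, alpha_bar, eps_model)
```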
An Introduction to Neural Data Compression
Yang, Yibo, Mandt, Stephan, Theis, Lucas
Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression algorithms to be learned end-to-end from data using powerful generative models such as normalizing flows, variational autoencoders, diffusion probabilistic models, and generative adversarial networks. The present article aims to introduce this field of research to a broader machine learning audience by reviewing the necessary background in information theory (e.g., entropy coding, rate-distortion theory) and computer vision (e.g., image quality assessment, perceptual metrics), and providing a curated guide through the essential ideas and methods in the literature thus far.
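The generic training objective at the heart of learned lossy compression is a rate-distortion trade-off. The sketch below shows that objective with a tiny autoencoder, an additive-uniform-noise proxy for quantization, and a fixed Gaussian entropy model; all three are illustrative simplifications rather than any specific published codec.

```python
# Sketch of the rate-distortion objective used to train neural lossy codecs:
# minimize rate + lambda * distortion end-to-end. The autoencoder, the
# uniform-noise quantization proxy, and the Gaussian entropy model are
# illustrative stand-ins, not a particular model from the literature.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 32))
dec = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 784))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
lam = 0.01                                       # rate-distortion trade-off weight

def rate_nats(y_noisy, scale=1.0):
    # Rate estimate: -log of the probability mass a unit-width bin receives
    # under a zero-mean Gaussian entropy model (stand-in for a learned prior).
    normal = torch.distributions.Normal(0.0, scale)
    p = normal.cdf(y_noisy + 0.5) - normal.cdf(y_noisy - 0.5)
    return -torch.log(p.clamp_min(1e-9)).sum(dim=-1)

x = torch.rand(64, 784)                          # toy "images"
for _ in range(200):
    y = enc(x)
    y_noisy = y + torch.rand_like(y) - 0.5       # differentiable proxy for rounding
    x_hat = dec(y_noisy)
    distortion = (x - x_hat).pow(2).sum(dim=-1)
    loss = (rate_nats(y_noisy) + lam * distortion).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```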
Insights from Generative Modeling for Neural Video Compression
Yang, Ruihan, Yang, Yibo, Marino, Joseph, Mandt, Stephan
While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images. In a similar spirit, we view recently proposed neural video coding algorithms through the lens of deep autoregressive and latent variable modeling. We present these codecs as instances of a generalized stochastic temporal autoregressive transform, and propose new avenues for further improvements inspired by normalizing flows and structured priors. We propose several architectures that yield state-of-the-art video compression performance on high-resolution video and discuss their tradeoffs and ablations. In particular, we propose (i) improved temporal autoregressive transforms, (ii) improved entropy models with structured and temporal dependencies, and (iii) variable bitrate versions of our algorithms. Since our improvements are compatible with a large class of existing models, we provide further evidence that the generative modeling viewpoint can advance the neural video coding field.
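The sketch below gives a schematic, shift-and-scale form of a stochastic temporal autoregressive transform: each frame is expressed as a residual latent relative to a prediction from the previous frame, and the transform is exactly invertible before quantization. The tiny networks, vector-valued "frames", and single-frame conditioning are simplifications, not the architectures proposed in the paper.

```python
# Schematic stochastic temporal autoregressive transform for video coding.
import torch
import torch.nn as nn

class TemporalARTransform(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.h_mu = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.h_sigma = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                     nn.Linear(dim, dim), nn.Softplus())

    def encode(self, x_t, x_prev, w_t):
        # w_t is an additional transmitted latent (e.g., motion information).
        ctx = torch.cat([x_prev, w_t], dim=-1)
        return (x_t - self.h_mu(ctx)) / self.h_sigma(ctx)   # residual latent to entropy-code

    def decode(self, v_t, x_prev, w_t):
        ctx = torch.cat([x_prev, w_t], dim=-1)
        return self.h_mu(ctx) + self.h_sigma(ctx) * v_t     # reconstructed frame

# Toy round trip: with exact (unquantized) latents the transform inverts exactly.
model = TemporalARTransform()
x_prev, x_t, w_t = torch.randn(8, 64), torch.randn(8, 64), torch.randn(8, 64)
v_t = model.encode(x_t, x_prev, w_t)
assert torch.allclose(model.decode(v_t, x_prev, w_t), x_t, atol=1e-5)
```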
Deep Anomaly Detection under Labeling Budget Constraints
Li, Aodong, Qiu, Chen, Kloft, Marius, Smyth, Padhraic, Mandt, Stephan, Rudolph, Maja
Selecting informative data points for expert feedback can significantly improve the performance of anomaly detection (AD) in various contexts, such as medical diagnostics or fraud detection. In this paper, we determine a set of theoretical conditions under which anomaly scores generalize from labeled queries to unlabeled data. Motivated by these results, we propose a data labeling strategy with optimal data coverage under labeling budget constraints. In addition, we propose a new learning framework for semi-supervised AD. Extensive experiments on image, tabular, and video data sets show that our approach results in state-of-the-art semi-supervised AD performance under labeling budget constraints.
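For illustration only, the sketch below shows one plausible coverage-oriented querying strategy under a labeling budget: cluster the highest-scoring points and ask the expert to label one representative per cluster. This is a generic stand-in chosen for clarity, not the specific strategy or theoretical coverage criterion analyzed in the paper.

```python
# Illustrative coverage-oriented labeling strategy under a budget (a plausible
# stand-in, not the paper's exact method).
import numpy as np
from sklearn.cluster import KMeans

def select_queries(x, anomaly_scores, budget, top_fraction=0.2, seed=0):
    """Pick `budget` indices to send to the expert for labeling."""
    n_top = max(budget, int(top_fraction * len(x)))
    top = np.argsort(anomaly_scores)[-n_top:]              # most anomalous candidates
    km = KMeans(n_clusters=budget, n_init=10, random_state=seed).fit(x[top])
    queries = []
    for c in range(budget):
        # Query the candidate closest to each cluster center for diverse coverage.
        members = top[km.labels_ == c]
        dists = np.linalg.norm(x[members] - km.cluster_centers_[c], axis=1)
        queries.append(members[np.argmin(dists)])
    return np.array(queries)

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 16))
scores = np.linalg.norm(x, axis=1)                         # toy anomaly score
print(select_queries(x, scores, budget=10))
```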
Understanding Pathologies of Deep Heteroskedastic Regression
Wong-Toi, Eliot, Boyd, Alex, Fortuin, Vincent, Mandt, Stephan
Several recent studies have reported negative results when using heteroskedastic neural regression models to model real-world data. In particular, for overparameterized models, the mean and variance networks are powerful enough to either fit every single data point (while shrinking the predicted variances to zero), or to learn a constant prediction with an output variance exactly matching every predicted residual (i.e., explaining the targets as pure noise). This paper studies these difficulties from the perspective of statistical physics. We show that the observed instabilities are not specific to any neural network architecture but are already present in a field theory of an overparameterized conditional Gaussian likelihood model. Under mild assumptions, we derive a nonparametric free energy that can be solved numerically. The resulting solutions show excellent qualitative agreement with empirical model fits on real-world data and, in particular, prove the existence of phase transitions, i.e., abrupt, qualitative differences in the behaviors of the regressors upon varying the regularization strengths on the two networks. Our work thus provides a theoretical explanation for why heteroskedastic regression models must be regularized carefully. Moreover, the insights from our theory suggest a scheme for optimizing this regularization which is quadratically more efficient than the naive approach.
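The setup under study is easy to write down: a mean network and a variance network trained jointly with the Gaussian negative log-likelihood, each carrying its own regularization strength. The sketch below shows this setup on toy data; the network sizes, data, and weight-decay values are placeholders, not the regularization scheme derived in the paper.

```python
# Minimal deep heteroskedastic regression: separate mean and log-variance
# networks trained with the Gaussian NLL, each with its own weight decay.
# The two pathologies described above arise when these strengths are set poorly.
import torch
import torch.nn as nn

mean_net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
logvar_net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam([
    {"params": mean_net.parameters(), "weight_decay": 1e-4},     # regularizes the mean net
    {"params": logvar_net.parameters(), "weight_decay": 1e-2},   # regularizes the variance net
], lr=1e-3)

x = torch.linspace(-3, 3, 256).unsqueeze(-1)
y = torch.sin(x) + torch.randn_like(x) * (0.1 + 0.2 * x.abs())   # heteroskedastic toy data

for _ in range(2000):
    mu, log_var = mean_net(x), logvar_net(x)
    # Gaussian NLL (up to a constant): 0.5 * [log sigma^2 + (y - mu)^2 / sigma^2]
    nll = 0.5 * (log_var + (y - mu).pow(2) * torch.exp(-log_var)).mean()
    opt.zero_grad(); nll.backward(); opt.step()
```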
SC2 Benchmark: Supervised Compression for Split Computing
Matsubara, Yoshitomo, Yang, Ruihan, Levorato, Marco, Mandt, Stephan
With the increasing demand for deep learning models on mobile devices, splitting neural network computation between the device and a more powerful edge server has become an attractive solution. However, existing split computing approaches often underperform compared to a naive baseline of remote computation on compressed data. Recent studies propose learning compressed representations that contain more relevant information for supervised downstream tasks, showing improved tradeoffs between compressed data size and supervised performance. However, existing evaluation metrics only provide an incomplete picture of split computing. This study introduces supervised compression for split computing (SC2) and proposes new evaluation criteria: minimizing computation on the mobile device, minimizing transmitted data size, and maximizing model accuracy. We conduct a comprehensive benchmark study using 10 baseline methods, three computer vision tasks, and over 180 trained models, and discuss various aspects of SC2. We also release sc2bench, a Python package for future research on SC2. Our proposed metrics and package will help researchers better understand the tradeoffs of supervised compression in split computing.
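The sketch below conveys the basic supervised-compression objective for split computing: a lightweight device-side encoder produces a bottleneck representation that is (entropy-)coded and transmitted, and a heavier server-side head performs the task, with training balancing task loss against transmitted data size. The architecture, uniform-noise quantization proxy, and Gaussian rate model are illustrative placeholders, not any particular SC2 baseline.

```python
# Schematic supervised-compression objective for split computing:
# task loss + beta * rate of the transmitted bottleneck.
import torch
import torch.nn as nn
import torch.nn.functional as F

device_encoder = nn.Sequential(nn.Linear(3072, 256), nn.ReLU(), nn.Linear(256, 64))  # runs on device
server_head = nn.Sequential(nn.Linear(64, 512), nn.ReLU(), nn.Linear(512, 10))       # runs on server
opt = torch.optim.Adam(list(device_encoder.parameters()) + list(server_head.parameters()), lr=1e-3)
beta = 0.05                                         # weight on the rate term

def rate_nats(z):
    # Stand-in entropy model: probability mass of a unit-width bin under N(0, 1).
    normal = torch.distributions.Normal(0.0, 1.0)
    p = normal.cdf(z + 0.5) - normal.cdf(z - 0.5)
    return -torch.log(p.clamp_min(1e-9)).sum(dim=-1)

x = torch.rand(32, 3072)                            # toy flattened images
labels = torch.randint(0, 10, (32,))
for _ in range(100):
    z = device_encoder(x)
    z_tilde = z + torch.rand_like(z) - 0.5          # quantization proxy (training time)
    logits = server_head(z_tilde)
    loss = F.cross_entropy(logits, labels) + beta * rate_nats(z_tilde).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```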
Deep Anomaly Detection on Tennessee Eastman Process Data
Hartung, Fabian, Franks, Billy Joe, Michels, Tobias, Wagner, Dennis, Liznerski, Philipp, Reithermann, Steffen, Fellenz, Sophie, Jirasek, Fabian, Rudolph, Maja, Neider, Daniel, Leitte, Heike, Song, Chen, Kloepper, Benjamin, Mandt, Stephan, Bortz, Michael, Burger, Jakob, Hasse, Hans, Kloft, Marius
This paper provides the first comprehensive evaluation and analysis of modern (deep-learning) unsupervised anomaly detection methods for chemical process data. We focus on the Tennessee Eastman process dataset, which has been a standard litmus test to benchmark anomaly detection methods for nearly three decades. Our extensive study will facilitate choosing appropriate anomaly detection methods in industrial applications.