AITopics

2312.04494

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Nuclear Medicine (0.48)
Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial IntelligenceOct-24-2023

"Understanding Robustness Lottery": A Geometric Visual Comparative Analysis of Neural Network Pruning Approaches

Li, Zhimin, Liu, Shusen, Yu, Xin, Bhavya, Kailkhura, Cao, Jie, Daniel, Diffenderfer James, Bremer, Peer-Timo, Pascucci, Valerio

Deep learning approaches have provided state-of-the-art performance in many applications by relying on large and overparameterized neural networks. However, such networks have been shown to be very brittle and are difficult to deploy on resource-limited platforms. Model pruning, i.e., reducing the size of the network, is a widely adopted strategy that can lead to a more robust and compact model. Many heuristics exist for model pruning, but empirical studies show that some heuristics improve performance whereas others can make models more brittle or have other side effects. This work aims to shed light on how different pruning methods alter the network's internal feature representation and the corresponding impact on model performance. To facilitate a comprehensive comparison and characterization of the high-dimensional model feature space, we introduce a visual geometric analysis of feature representations. We decomposed and evaluated a set of critical geometric concepts from the common adopted classification loss, and used them to design a visualization system to compare and highlight the impact of pruning on model performance and feature representation. The proposed tool provides an environment for in-depth comparison of pruning methods and a comprehensive understanding of how model response to common data corruption. By leveraging the proposed visualization, machine learning researchers can reveal the similarities between pruning methods and redundant in robustness evaluation benchmarks, obtain geometric insights about the differences between pruned models that achieve superior robustness performance, and identify samples that are robust or fragile to model pruning and common data corruption to model pruning and data corruption but also obtain insights and explanations on how some pruned models achieve superior robustness performance.

artificial intelligence, geometric visual comparative analysis, machine learning, (3 more...)

2206.07918

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

arXiv.org Artificial IntelligenceMay-2-2023

Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models

Olson, Matthew L., Liu, Shusen, Anirudh, Rushil, Thiagarajan, Jayaraman J., Bremer, Peer-Timo, Wong, Weng-Keen

Generative Adversarial Networks (GANs) are notoriously difficult to train especially for complex distributions and with limited data. This has driven the need for tools to audit trained networks in human intelligible format, for example, to identify biases or ensure fairness. Existing GAN audit tools are restricted to coarse-grained, model-data comparisons based on summary statistics such as FID or recall. In this paper, we propose an alternative approach that compares a newly developed GAN against a prior baseline. To this end, we introduce Cross-GAN Auditing (xGA) that, given an established "reference" GAN and a newly proposed "client" GAN, jointly identifies intelligible attributes that are either common across both GANs, novel to the client GAN, or missing from the client GAN. This provides both users and model developers an intuitive assessment of similarity and differences between GANs. We introduce novel metrics to evaluate attribute-based GAN auditing approaches and use these metrics to demonstrate quantitatively that xGA outperforms baseline approaches. We also include qualitative results that illustrate the common, novel and missing attributes identified by xGA from GANs trained on a variety of image datasets.

artificial intelligence, glasses, machine learning, (15 more...)

2303.10774

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Information Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceDec-1-2022

Single Model Uncertainty Estimation via Stochastic Data Centering

Thiagarajan, Jayaraman J., Anirudh, Rushil, Narayanaswamy, Vivek, Bremer, Peer-Timo

We are interested in estimating the uncertainties of deep neural networks, which play an important role in many scientific and engineering problems. In this paper, we present a striking new finding that an ensemble of neural networks with the same weight initialization, trained on datasets that are shifted by a constant bias gives rise to slightly inconsistent trained models, where the differences in predictions are a strong indicator of epistemic uncertainties. Using the neural tangent kernel (NTK), we demonstrate that this phenomena occurs in part because the NTK is not shift-invariant. Since this is achieved via a trivial input transformation, we show that this behavior can therefore be approximated by training a single neural network -- using a technique that we call $\Delta-$UQ -- that estimates uncertainty around prediction by marginalizing out the effect of the biases during inference. We show that $\Delta-$UQ's uncertainty estimates are superior to many of the current methods on a variety of benchmarks -- outlier rejection, calibration under distribution shift, and sequential design optimization of black box functions. Code for $\Delta-$UQ can be accessed at https://github.com/LLNL/DeltaUQ

anchor, artificial intelligence, machine learning, (18 more...)

2207.07235

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.88)

Industry: Energy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Artificial IntelligenceJul-13-2022

Identifying Orientation-specific Lipid-protein Fingerprints using Deep Learning

Aydin, Fikret, Georgouli, Konstantia, Dharuman, Gautham, Glosli, James N., Lightstone, Felice C., Ingólfsson, Helgi I., Bremer, Peer-Timo, Bhatia, Harsh

Improved understanding of the relation between the behavior of RAS and RAF proteins and the local lipid environment in the cell membrane is critical for getting insights into the mechanisms underlying cancer formation. In this work, we employ deep learning (DL) to learn this relationship by predicting protein orientational states of RAS and RAS-RAF protein complexes with respect to the lipid membrane based on the lipid densities around the protein domains from coarse-grained (CG) molecular dynamics (MD) simulations. Our DL model can predict six protein states with an overall accuracy of over 80%. The findings of this work offer new insights into how the proteins modulate the lipid environment, which in turn may assist designing novel therapies to regulate such interactions in the mechanisms associated with cancer development.

artificial intelligence, machine learning, orientational state, (14 more...)

2207.0663

Country: North America > United States (1.00)

Genre: Research Report (0.83)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

arXiv.org Machine LearningOct-26-2020

Meaningful uncertainties from deep neural network surrogates of large-scale numerical simulations

Anderson, Gemma J., Gaffney, Jim A., Spears, Brian K., Bremer, Peer-Timo, Anirudh, Rushil, Thiagarajan, Jayaraman J.

Large-scale numerical simulations are used across many scientific disciplines to facilitate experimental development and provide insights into underlying physical processes, but they come with a significant computational cost. Deep neural networks (DNNs) can serve as highly-accurate surrogate models, with the capacity to handle diverse datatypes, offering tremendous speed-ups for prediction and many other downstream tasks. An important use-case for these surrogates is the comparison between simulations and experiments; prediction uncertainty estimates are crucial for making such comparisons meaningful, yet standard DNNs do not provide them. In this work we define the fundamental requirements for a DNN to be useful for scientific applications, and demonstrate a general variational inference approach to equip predictions of scalar and image data from a DNN surrogate model trained on inertial confinement fusion simulations with calibrated Bayesian uncertainties. Critically, these uncertainties are interpretable, meaningful and preserve physics-correlations in the predicted quantities.

deep learning, neural network, prediction, (20 more...)

2010.13749

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.70)
Energy (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine LearningSep-30-2020

Accurate and Robust Feature Importance Estimation under Distribution Shifts

Thiagarajan, Jayaraman J., Narayanaswamy, Vivek, Anirudh, Rushil, Bremer, Peer-Timo, Spanias, Andreas

With increasing reliance on the outcomes of black-box models in critical applications, post-hoc explainability tools that do not require access to the model internals are often used to enable humans understand and trust these models. In particular, we focus on the class of methods that can reveal the influence of input features on the predicted outputs. Despite their wide-spread adoption, existing methods are known to suffer from one or more of the following challenges: computational complexities, large uncertainties and most importantly, inability to handle real-world domain shifts. In this paper, we propose PRoFILE, a novel feature importance estimation method that addresses all these challenges. Through the use of a loss estimator jointly trained with the predictive model and a causal objective, PRoFILE can accurately estimate the feature importance scores even under complex distribution shifts, without any additional re-training. To this end, we also develop learning strategies for training the loss estimator, namely contrastive and dropout calibration, and find that it can effectively detect distribution shifts. Using empirical studies on several benchmark image and non-image data, we show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.

artificial intelligence, explanation, neural network, (19 more...)

2009.14454

Country: North America > United States > Arizona (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Government > Regional Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Machine LearningMay-5-2020

Designing Accurate Emulators for Scientific Processes using Calibration-Driven Deep Models

Thiagarajan, Jayaraman J., Venkatesh, Bindya, Anirudh, Rushil, Bremer, Peer-Timo, Gaffney, Jim, Anderson, Gemma, Spears, Brian

Predictive models that accurately emulate complex scientific processes can achieve exponential speed-ups over numerical simulators or experiments, and at the same time provide surrogates for improving the subsequent analysis. Consequently, there is a recent surge in utilizing modern machine learning (ML) methods, such as deep neural networks, to build data-driven emulators. While the majority of existing efforts has focused on tailoring off-the-shelf ML solutions to better suit the scientific problem at hand, we study an often overlooked, yet important, problem of choosing loss functions to measure the discrepancy between observed data and the predictions from a model. Due to lack of better priors on the expected residual structure, in practice, simple choices such as the mean squared error and the mean absolute error are made. However, the inherent symmetric noise assumption made by these loss functions makes them inappropriate in cases where the data is heterogeneous or when the noise distribution is asymmetric. We propose Learn-by-Calibrating (LbC), a novel deep learning approach based on interval calibration for designing emulators in scientific applications, that are effective even with heterogeneous data and are robust to outliers. Using a large suite of use-cases, we show that LbC provides significant improvements in generalization error over widely-adopted loss function choices, achieves high-quality emulators even in small data regimes and more importantly, recovers the inherent noise structure without any explicit priors.

deep learning, loss function, upstream oil & gas, (17 more...)

2005.02328

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas > Upstream (1.00)
Energy > Power Industry (0.69)
Government > Regional Government > North America Government > United States Government (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

arXiv.org Machine LearningOct-3-2019

Exploring Generative Physics Models with Scientific Priors in Inertial Confinement Fusion

Anirudh, Rushil, Thiagarajan, Jayaraman J., Liu, Shusen, Bremer, Peer-Timo, Spears, Brian K.

There is significant interest in using modern neural networks for scientific applications due to their effectiveness in modeling highly complex, non-linear problems in a data-driven fashion. However, a common challenge is to verify the scientific plausibility or validity of outputs predicted by a neural network. This work advocates the use of known scientific constraints as a lens into evaluating, exploring, and understanding such predictions for the problem of inertial confinement fusion.

constraint, neural network, us government, (15 more...)

1910.01666

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Industry: Government > Regional Government > North America Government > United States Government (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Machine LearningSep-25-2019

Function Preserving Projection for Scalable Exploration of High-Dimensional Data

Liu, Shusen, Anirudh, Rushil, Thiagarajan, Jayaraman J., Bremer, Peer-Timo

We present function preserving projections (FPP), a scalable linear projection technique for discovering interpretable relationships in high-dimensional data. Conventional dimension reduction methods aim to maximally preserve the global and/or local geometric structure of a dataset. However, in practice one is often more interested in determining how one or multiple user-selected response function(s) can be explained by the data. To intuitively connect the responses to the data, FPP constructs 2D linear embeddings optimized to reveal interpretable yet potentially non-linear patterns of the response functions. More specifically, FPP is designed to (i) produce human-interpretable embeddings; (ii) capture non-linear relationships; (iii) allow the simultaneous use of multiple response functions; and (iv) scale to millions of samples. Using FPP on real-world datasets, one can obtain fundamentally new insights about high-dimensional relationships in large-scale data that could not be achieved using existing dimension reduction methods.

artificial intelligence, machine learning, projection, (19 more...)

1909.11804

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)