AITopics | Gaffney, Jim

Collaborating Authors

Gaffney, Jim

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Designing Accurate Emulators for Scientific Processes using Calibration-Driven Deep Models

Thiagarajan, Jayaraman J., Venkatesh, Bindya, Anirudh, Rushil, Bremer, Peer-Timo, Gaffney, Jim, Anderson, Gemma, Spears, Brian

arXiv.org Machine LearningMay-5-2020

Predictive models that accurately emulate complex scientific processes can achieve exponential speed-ups over numerical simulators or experiments, and at the same time provide surrogates for improving the subsequent analysis. Consequently, there is a recent surge in utilizing modern machine learning (ML) methods, such as deep neural networks, to build data-driven emulators. While the majority of existing efforts has focused on tailoring off-the-shelf ML solutions to better suit the scientific problem at hand, we study an often overlooked, yet important, problem of choosing loss functions to measure the discrepancy between observed data and the predictions from a model. Due to lack of better priors on the expected residual structure, in practice, simple choices such as the mean squared error and the mean absolute error are made. However, the inherent symmetric noise assumption made by these loss functions makes them inappropriate in cases where the data is heterogeneous or when the noise distribution is asymmetric. We propose Learn-by-Calibrating (LbC), a novel deep learning approach based on interval calibration for designing emulators in scientific applications, that are effective even with heterogeneous data and are robust to outliers. Using a large suite of use-cases, we show that LbC provides significant improvements in generalization error over widely-adopted loss function choices, achieves high-quality emulators even in small data regimes and more importantly, recovers the inherent noise structure without any explicit priors.

deep learning, loss function, upstream oil & gas, (17 more...)

arXiv.org Machine Learning

2005.02328

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas > Upstream (1.00)
Energy > Power Industry (0.69)
Government > Regional Government > North America Government > United States Government (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Scalable Topological Data Analysis and Visualization for Evaluating Data-Driven Models in Scientific Applications

Liu, Shusen, Wang, Di, Maljovec, Dan, Anirudh, Rushil, Thiagarajan, Jayaraman J., Jacobs, Sam Ade, Van Essen, Brian C., Hysom, David, Yeom, Jae-Seung, Gaffney, Jim, Peterson, Luc, Robinson, Peter B., Bhatia, Harsh, Pascucci, Valerio, Spears, Brian K., Bremer, Peer-Timo

arXiv.org Machine LearningJul-18-2019

With the rapid adoption of machine learning techniques for large-scale applications in science and engineering comes the convergence of two grand challenges in visualization. First, the utilization of black box models (e.g., deep neural networks) calls for advanced techniques in exploring and interpreting model behaviors. Second, the rapid growth in computing has produced enormous datasets that require techniques that can handle millions or more samples. Although some solutions to these interpretability challenges have been proposed, they typically do not scale beyond thousands of samples, nor do they provide the high-level intuition scientists are looking for. Here, we present the first scalable solution to explore and analyze high-dimensional functions often encountered in the scientific data analysis pipeline. By combining a new streaming neighborhood graph construction, the corresponding topology computation, and a novel data aggregation scheme, namely topology aware datacubes, we enable interactive exploration of both the topological and the geometric aspect of high-dimensional data. Following two use cases from high-energy-density (HED) physics and computational biology, we demonstrate how these capabilities have led to crucial new insights in both applications.

application, deep learning, neural network, (20 more...)

arXiv.org Machine Learning

1907.08325

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback