AITopics | Steinbach, Michael

Collaborating Authors

Steinbach, Michael

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Prescribed Fire Modeling using Knowledge-Guided Machine Learning for Land Management

Chatterjee, Somya Sharma, Lindsay, Kelly, Chatterjee, Neel, Patil, Rohan, De Callafon, Ilkay Altintas, Steinbach, Michael, Giron, Daniel, Nguyen, Mai H., Kumar, Vipin

arXiv.org Artificial IntelligenceOct-2-2023

In recent years, the increasing threat of devastating wildfires has underscored the need for effective prescribed fire management. Process-based computer simulations have traditionally been employed to plan prescribed fires for wildfire prevention. However, even simplified process models like QUIC-Fire are too compute-intensive to be used for real-time decision-making, especially when weather conditions change rapidly. Traditional ML methods used for fire modeling offer computational speedup but struggle with physically inconsistent predictions, biased predictions due to class imbalance, biased estimates for fire spread metrics (e.g., burned area, rate of spread), and generalizability in out-of-distribution wind conditions. This paper introduces a novel machine learning (ML) framework that enables rapid emulation of prescribed fires while addressing these concerns. By incorporating domain knowledge, the proposed method helps reduce physical inconsistencies in fuel density estimates in data-scarce scenarios. To overcome the majority class bias in predictions, we leverage pre-existing source domain data to augment training data and learn the spread of fire more effectively. Finally, we overcome the problem of biased estimation of fire spread metrics by incorporating a hierarchical modeling structure to capture the interdependence in fuel density and burned area. Notably, improvement in fire metric (e.g., burned area) estimates offered by our framework makes it useful for fire managers, who often rely on these fire metric estimates to make decisions about prescribed burn management. Furthermore, our framework exhibits better generalization capabilities than the other ML-based fire modeling methods across diverse wind conditions and ignition patterns.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2310.01593

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Food & Agriculture > Agriculture (0.82)
Information Technology (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Realization of Causal Representation Learning to Adjust Confounding Bias in Latent Space

Li, Jia, Li, Xiang, Jia, Xiaowei, Steinbach, Michael, Kumar, Vipin

arXiv.org Artificial IntelligenceSep-22-2023

Causal DAGs(Directed Acyclic Graphs) are usually considered in a 2D plane. Edges indicate causal effects' directions and imply their corresponding time-passings. Due to the natural restriction of statistical models, effect estimation is usually approximated by averaging the individuals' correlations, i.e., observational changes over a specific time. However, in the context of Machine Learning on large-scale questions with complex DAGs, such slight biases can snowball to distort global models - More importantly, it has practically impeded the development of AI, for instance, the weak generalizability of causal models. In this paper, we redefine causal DAG as \emph{do-DAG}, in which variables' values are no longer time-stamp-dependent, and timelines can be seen as axes. By geometric explanation of multi-dimensional do-DAG, we identify the \emph{Causal Representation Bias} and its necessary factors, differentiated from common confounding biases. Accordingly, a DL(Deep Learning)-based framework will be proposed as the general solution, along with a realization method and experiments to verify its feasibility.

artificial intelligence, causal effect, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.08573

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mini-Batch Learning Strategies for modeling long term temporal dependencies: A study in environmental applications

Xu, Shaoming, Khandelwal, Ankush, Li, Xiang, Jia, Xiaowei, Liu, Licheng, Willard, Jared, Ghosh, Rahul, Cutler, Kelly, Steinbach, Michael, Duffy, Christopher, Nieber, John, Kumar, Vipin

arXiv.org Artificial IntelligenceFeb-1-2023

In many environmental applications, recurrent neural networks (RNNs) are often used to model physical variables with long temporal dependencies. However, due to mini-batch training, temporal relationships between training segments within the batch (intra-batch) as well as between batches (inter-batch) are not considered, which can lead to limited performance. Stateful RNNs aim to address this issue by passing hidden states between batches. Since Stateful RNNs ignore intra-batch temporal dependency, there exists a trade-off between training stability and capturing temporal dependency. In this paper, we provide a quantitative comparison of different Stateful RNN modeling strategies, and propose two strategies to enforce both intra- and inter-batch temporal dependency. First, we extend Stateful RNNs by defining a batch as a temporally ordered set of training segments, which enables intra-batch sharing of temporal information. While this approach significantly improves the performance, it leads to much larger training times due to highly sequential training. To address this issue, we further propose a new strategy which augments a training segment with an initial value of the target variable from the timestep right before the starting of the training segment. In other words, we provide an initial value of the target variable as additional input so that the network can focus on learning changes relative to that initial value. By using this strategy, samples can be passed in any order (mini-batch training) which significantly reduces the training time while maintaining the performance. In demonstrating our approach in hydrological modeling, we observe that the most significant gains in predictive accuracy occur when these methods are applied to state variables whose values change more slowly, such as soil water and snowpack, rather than continuously moving flux variables such as streamflow.

artificial intelligence, machine learning, temporal dependency, (17 more...)

arXiv.org Artificial Intelligence

2210.08347

Country: North America > United States (0.68)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Mining Novel Multivariate Relationships in Time Series Data Using Correlation Networks

Agrawal, Saurabh, Steinbach, Michael, Boley, Daniel, Chatterjee, Snigdhansu, Atluri, Gowtham, Dang, Anh The, Liess, Stefan, Kumar, Vipin

arXiv.org Machine LearningApr-23-2019

In many domains, there is significant interest in capturing novel relationships between time series that represent activities recorded at different nodes of a highly complex system. In this paper, we introduce multipoles, a novel class of linear relationships between more than two time series. A multipole is a set of time series that have strong linear dependence among themselves, with the requirement that each time series makes a significant contribution to the linear dependence. We demonstrate that most interesting multipoles can be identified as cliques of negative correlations in a correlation network. Such cliques are typically rare in a real-world correlation network, which allows us to find almost all multipoles efficiently using a clique-enumeration approach. Using our proposed framework, we demonstrate the utility of multipoles in discovering new physical phenomena in two scientific domains: climate science and neuroscience. In particular, we discovered several multipole relationships that are reproducible in multiple other independent datasets and lead to novel domain insights.

health & medicine, multipole, neurology, (19 more...)

arXiv.org Machine Learning

doi: 10.1109/TKDE.2019.2911681

1810.0295

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Physics Guided RNNs for Modeling Dynamical Systems: A Case Study in Simulating Lake Temperature Profiles

Jia, Xiaowei, Willard, Jared, Karpatne, Anuj, Read, Jordan, Zward, Jacob, Steinbach, Michael, Kumar, Vipin

arXiv.org Artificial IntelligenceOct-30-2018

This paper proposes a physics-guided recurrent neural network model (PGRNN) that combines RNNs and physics-based models to leverage their complementary strengths and improve the modeling of physical processes. Specifically, we show that a PGRNN can improve prediction accuracy over that of physical models, while generating outputs consistent with physical laws, and achieving good generalizability. Standard RNNs, even when producing superior prediction accuracy, often produce physically inconsistent results and lack generalizability. We further enhance this approach by using a pre-training method that leverages the simulated data from a physics-based model to address the scarcity of observed data. The PGRNN has the flexibility to incorporate additional physical constraints and we incorporate a density-depth relationship. Both enhancements further improve PGRNN performance. Although we present and evaluate this methodology in the context of modeling the dynamics of temperature in lakes, it is applicable more widely to a range of scientific and engineering disciplines where mechanistic (also known as process-based) models are used, e.g., power engineering, climate science, materials science, computational chemistry, and biomedicine.

constraint, deep learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

1810.13075

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry: Energy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Theory-guided Data Science: A New Paradigm for Scientific Discovery from Data

Karpatne, Anuj, Atluri, Gowtham, Faghmous, James, Steinbach, Michael, Banerjee, Arindam, Ganguly, Auroop, Shekhar, Shashi, Samatova, Nagiza, Kumar, Vipin

arXiv.org Artificial IntelligenceNov-13-2017

Data science models, although successful in a number of commercial domains, have had limited applicability in scientific problems involving complex physical phenomena. Theory-guided data science (TGDS) is an emerging paradigm that aims to leverage the wealth of scientific knowledge for improving the effectiveness of data science models in enabling scientific discovery. The overarching vision of TGDS is to introduce scientific consistency as an essential component for learning generalizable models. Further, by producing scientifically interpretable models, TGDS aims to advance our scientific understanding by discovering novel domain insights. Indeed, the paradigm of TGDS has started to gain prominence in a number of scientific disciplines such as turbulence modeling, material discovery, quantum chemistry, bio-medical science, bio-marker discovery, climate science, and hydrology. In this paper, we formally conceptualize the paradigm of TGDS and present a taxonomy of research themes in TGDS. We describe several approaches for integrating domain knowledge in different research themes using illustrative examples from different disciplines. We also highlight some of the promising avenues of novel research for realizing the full potential of theory-guided data science.

knowledge, scientific discovery, upstream oil & gas, (31 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TKDE.2017.2720168

1612.08544

Country: North America > United States > Texas > Travis County > Austin (0.14)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
(3 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Add feedback