Collaborating Authors

 Jia, Xiaowei


Physics-Guided Foundation Model for Scientific Discovery: An Application to Aquatic Science

arXiv.org Artificial Intelligence

Physics-guided machine learning (PGML) has become a prevalent approach in studying scientific systems due to its ability to integrate scientific theories for enhancing machine learning (ML) models. However, most PGML approaches are tailored to isolated and relatively simple tasks, which limits their applicability to complex systems involving multiple interacting processes and numerous influencing features. In this paper, we propose a Physics-Guided Foundation Model (PGFM) that combines pre-trained ML models and physics-based models and leverages their complementary strengths to improve the modeling of multiple coupled processes. To effectively conduct pre-training, we construct a simulated environmental system that encompasses a wide range of influencing features and various simulated variables generated by physics-based models. The model is pre-trained in this system to adaptively select important feature interactions guided by multi-task objectives. We then fine-tune the model for each specific task using true observations, while maintaining consistency with established physical theories, such as the principles of mass and energy conservation. We demonstrate the effectiveness of this methodology in modeling water temperature and dissolved oxygen dynamics in real-world lakes. The proposed PGFM is also broadly applicable to a range of scientific fields where physics-based models are being used.
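
The fine-tuning step described above, fitting true observations while staying consistent with a conservation law, can be illustrated with a penalized loss. This is a minimal sketch and not the PGFM objective: the function names, the `weight` and `tol` parameters, and the specific balance "change in storage equals net flux" are all assumptions made for illustration.

```python
def mse(pred, obs):
    """Mean squared error over time steps that have an observation (None = missing)."""
    pairs = [(p, o) for p, o in zip(pred, obs) if o is not None]
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


def conservation_penalty(storage, net_flux, tol=0.0):
    """Average violation of storage[t+1] - storage[t] == net_flux[t].

    Hypothetical balance: `storage` could be lake heat content or oxygen mass,
    `net_flux` the modeled net input over each step. Violations below `tol`
    are not penalized.
    """
    residuals = [
        abs((storage[t + 1] - storage[t]) - net_flux[t])
        for t in range(len(net_flux))
    ]
    return sum(max(r - tol, 0.0) for r in residuals) / len(residuals)


def physics_guided_loss(pred, obs, storage, net_flux, weight=1.0, tol=0.0):
    """Supervised error plus a weighted conservation penalty (assumed form)."""
    return mse(pred, obs) + weight * conservation_penalty(storage, net_flux, tol)
```

Because the penalty needs no observations, it can be evaluated on every time step, including those where the target was never measured.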


Modeling Continuous Spatial-temporal Dynamics of Turbulent Flow with Test-time Refinement

arXiv.org Artificial Intelligence

The precise simulation of turbulent flows holds immense significance across various scientific and engineering domains, including climate science, freshwater science, and energy-efficient manufacturing. Within the realm of simulating turbulent flows, large eddy simulation (LES) has emerged as a prevalent alternative to direct numerical simulation (DNS), offering computational efficiency. However, LES cannot accurately capture the full spectrum of turbulent transport scales and is available only at a lower spatial resolution. Reconstructing high-fidelity DNS data from the lower-resolution LES data is essential for numerous applications, but it poses significant challenges to existing super-resolution techniques, primarily due to the complex spatio-temporal nature of turbulent flows. This paper proposes a novel flow reconstruction approach that leverages physical knowledge to model flow dynamics. Different from traditional super-resolution techniques, the proposed approach uses LES data only in the testing phase through a degradation-based refinement approach to enforce physical constraints and mitigate cumulative reconstruction errors over time. Furthermore, a feature sampling strategy is developed to enable flow data reconstruction across different resolutions. The results on two distinct sets of turbulent flow data indicate the effectiveness of the proposed method in reconstructing high-resolution DNS data, preserving the inherent physical attributes of flow transport, and achieving DNS reconstruction at different resolutions.
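
The idea of degradation-based refinement can be sketched in one dimension: degrade the high-resolution reconstruction to the coarse grid, compare it with the low-resolution data, and correct the reconstruction by the mismatch. The abstract does not specify the degradation operator or the update rule, so the block averaging and the additive correction below are assumptions, and `degrade` and `refine` are hypothetical names.

```python
def degrade(fine, factor):
    """Block-average a fine 1D field onto the coarse grid.

    Assumes len(fine) is a multiple of `factor`.
    """
    return [sum(fine[i:i + factor]) / factor for i in range(0, len(fine), factor)]


def refine(fine, coarse, factor, step=1.0):
    """Shift each fine block so its average moves toward the coarse value.

    With step=1.0 the degraded result matches `coarse` exactly; smaller steps
    apply only part of the correction.
    """
    degraded = degrade(fine, factor)
    out = list(fine)
    for j, (d, c) in enumerate(zip(degraded, coarse)):
        for i in range(j * factor, (j + 1) * factor):
            out[i] += step * (c - d)
    return out


reconstruction = [1.0, 3.0, 5.0, 7.0]   # stand-in for a predicted fine field
les_like = [2.5, 5.0]                   # stand-in for coarse data at test time
refined = refine(reconstruction, les_like, factor=2)
```

Applying such a correction at each time step keeps the reconstruction tied to the coarse data, which is one way drift could be limited over a long rollout.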


Physics-Guided Fair Graph Sampling for Water Temperature Prediction in River Networks

arXiv.org Machine Learning

This work introduces a novel method based on graph neural networks (GNNs) to predict stream water temperature and reduce model bias across locations of different income and education levels. Traditional physics-based models often have limited accuracy because they are necessarily approximations of reality. Recently, there has been an increasing interest in using GNNs to model complex water dynamics in stream networks. Despite their promise in improving accuracy, GNNs can introduce additional model bias through the aggregation process, where node features are updated by aggregating neighboring nodes. The bias can be especially pronounced when nodes with similar sensitive attributes are frequently connected. We introduce a new method that leverages physical knowledge to represent the node influence in GNNs, and then utilizes physics-based influence to refine the selection and weights over the neighbors. The objective is to facilitate equitable treatment over different sensitive groups in the graph aggregation, which helps reduce spatial bias over locations, especially for those in underprivileged groups. The results on the Delaware River Basin demonstrate the effectiveness of the proposed method in preserving equitable performance across locations in different sensitive groups.


Adaptive Process-Guided Learning: An Application in Predicting Lake DO Concentrations

arXiv.org Artificial Intelligence

This paper introduces a Process-Guided Learning (Pril) framework that integrates physical models with recurrent neural networks (RNNs) to enhance the prediction of dissolved oxygen (DO) concentrations in lakes, which is crucial for sustaining water quality and ecosystem health. Unlike traditional RNNs, which may deliver high accuracy but often lack physical consistency and broad applicability, the Pril method incorporates differential DO equations for each lake layer, modeling it as a first-order linear solution using a forward Euler scheme with a daily timestep. However, this method is sensitive to numerical instabilities. When drastic fluctuations occur, the numerical integration is neither mass-conservative nor stable. Especially during stratified conditions, exogenous fluxes into each layer cause significant within-day changes in DO concentrations. To address this challenge, we further propose an Adaptive Process-Guided Learning (April) model, which dynamically adjusts timesteps from daily to sub-daily intervals with the aim of mitigating the discrepancies caused by variations in entrainment fluxes. April uses a generator-discriminator architecture to identify days with significant DO fluctuations and employs a multi-step Euler scheme with sub-daily timesteps to effectively manage these variations. We have tested our methods on a wide range of lakes in the Midwestern USA, and demonstrated robust capability in predicting DO concentrations even with limited training data. While primarily focused on aquatic ecosystems, this approach is broadly applicable to diverse scientific and engineering disciplines that utilize process-based models, such as power engineering, climate science, and biomedicine.
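
The instability of a daily forward Euler step, and the effect of switching to sub-daily steps, can be seen on a toy first-order linear equation. This is a sketch under stated assumptions: the equation dC/dt = flux - k * C, the parameter names, and the threshold rule for picking days are all hypothetical. In particular, the threshold stands in for the generator-discriminator that April uses to identify such days.

```python
def euler_step(c, flux, k, dt):
    """One forward Euler step of the toy equation dC/dt = flux - k * C."""
    return c + dt * (flux - k * c)


def integrate_day(c, flux, k, n_sub=1):
    """Advance one day using n_sub equal Euler steps (n_sub=1 is a daily step)."""
    dt = 1.0 / n_sub
    for _ in range(n_sub):
        c = euler_step(c, flux, k, dt)
    return c


def adaptive_day(c, flux, k, threshold=0.5, n_sub=24):
    """Use sub-daily steps only when a daily step would change C sharply.

    Simple stand-in rule: switch when the projected daily change exceeds
    `threshold` times the current concentration.
    """
    projected_change = abs(flux - k * c)
    if projected_change > threshold * max(abs(c), 1e-9):
        return integrate_day(c, flux, k, n_sub)
    return integrate_day(c, flux, k, 1)


# A fast-decay day: the daily step overshoots to a negative concentration,
# while 24 sub-daily steps stay positive.
daily = integrate_day(10.0, 0.0, 2.5, n_sub=1)
hourly = integrate_day(10.0, 0.0, 2.5, n_sub=24)
```

For this toy equation a single Euler step turns the concentration negative whenever k * dt exceeds 1, so shortening the step on days with large rates restores a physically meaningful result, at the cost of extra steps only where they are needed.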


Hierarchical Conditional Multi-Task Learning for Streamflow Modeling

arXiv.org Artificial Intelligence

Streamflow, vital for water resource management, is governed by complex hydrological systems involving intermediate processes driven by meteorological forces. While deep learning models have achieved state-of-the-art results in streamflow prediction, their end-to-end single-task learning approach often fails to capture the causal relationships within these systems. To address this, we propose Hierarchical Conditional Multi-Task Learning (HCMTL), a hierarchical approach that jointly models soil water and snowpack processes based on their causal connections to streamflow. HCMTL utilizes task embeddings to connect network modules, enhancing flexibility and expressiveness while capturing unobserved processes beyond soil water and snowpack. It also incorporates the Conditional Mini-Batch strategy to improve long time series modeling. We compare HCMTL with five baselines on a global dataset. HCMTL's superior performance across hundreds of drainage basins over extended periods shows that integrating domain-specific causal knowledge into deep learning enhances both prediction accuracy and interpretability. This is essential for advancing our understanding of complex hydrological systems and supporting efficient water resource management to mitigate natural disasters like droughts and floods.


ExoTST: Exogenous-Aware Temporal Sequence Transformer for Time Series Prediction

arXiv.org Artificial Intelligence

Accurate long-term predictions are the foundations for many machine learning applications and decision-making processes. Traditional time series approaches for prediction often focus on either autoregressive modeling, which relies solely on past observations of the target ("endogenous variables"), or forward modeling, which considers only current covariate drivers ("exogenous variables"). However, effectively integrating past endogenous and past exogenous with current exogenous variables remains a significant challenge. In this paper, we propose ExoTST, a novel transformer-based framework that effectively incorporates current exogenous variables alongside past context for improved time series prediction. To integrate exogenous information efficiently, ExoTST leverages the strengths of attention mechanisms and introduces a novel cross-temporal modality fusion module. This module enables the model to jointly learn from both past and current exogenous series, treating them as distinct modalities. By considering these series separately, ExoTST provides robustness and flexibility in handling data uncertainties that arise from the inherent distribution shift between historical and current exogenous variables. Extensive experiments on real-world carbon flux datasets and time series benchmarks demonstrate ExoTST's superior performance compared to state-of-the-art baselines, with improvements of up to 10% in prediction accuracy. Moreover, ExoTST exhibits strong robustness against missing values and noise in exogenous drivers, maintaining consistent performance in real-world situations where these imperfections are common.


Physics-enhanced Neural Operator for Simulating Turbulent Transport

arXiv.org Artificial Intelligence

The precise simulation of turbulent flows is of immense importance in a variety of scientific and engineering fields, including climate science, freshwater science, and the development of energy-efficient manufacturing processes. Within the realm of turbulent flow simulation, direct numerical simulation (DNS) is widely considered to be the most reliable approach, but it is prohibitively expensive for long-term simulation at fine spatial scales. Given the pressing need for efficient simulation, there is an increasing interest in building machine learning models for turbulence, either by reconstructing DNS from alternative low-fidelity simulations or by predicting DNS based on the patterns learned from historical data. However, standard machine learning techniques remain limited in capturing complex spatio-temporal characteristics of turbulent flows, resulting in limited performance and generalizability. This paper presents a novel physics-enhanced neural operator (PENO) that incorporates physical knowledge of partial differential equations (PDEs) to accurately model flow dynamics. The model is further refined by a self-augmentation mechanism to reduce the accumulated error in long-term simulations. The proposed method is evaluated through its performance on two distinct sets of 3D turbulent flow data, showcasing the model's capability to reconstruct high-resolution DNS data, maintain the inherent physical properties of flow transport, and generate flow simulations across various resolutions. Additionally, experimental results on multiple 2D vorticity flow series, generated by different PDEs, highlight the transferability and generalizability of the proposed method. This confirms its applicability to a wide range of real-world scenarios in which extensive simulations are needed under diverse settings.


Knowledge-guided Machine Learning: Current Trends and Future Prospects

arXiv.org Artificial Intelligence

This is especially true in environmental sciences that are rapidly transitioning from being data-poor to data-rich, e.g., with the ever-increasing volumes of environmental data being collected by Earth observing satellites, in-situ sensors, and those generated by model simulations (e.g., climate model runs [113]). Similar to how recent developments in ML have transformed how we interact with the information on the Internet, it is befitting to ask how ML advances can enable Earth system scientists to transform a fundamental goal in science, which is to build better models of physical, biological, and environmental systems. The conventional approach for modeling relationships between input drivers and response variables is to use process-based models rooted in scientific equations. Despite their ability to leverage the mechanistic understanding of scientific phenomena, process-based models suffer from several shortcomings limiting their adoption in complex real-world settings, e.g., due to imperfections in model formulations (or modeling bias), incorrect choices of parameter values in equations, and high computational costs in running high-fidelity simulations. In response to these challenges, ML methods offer a promising alternative to capture statistical relationships between inputs and outputs directly from data. However, "black-box" ML models that rely solely on the supervision contained in data show limited generalizability in scientific problems, especially when applied to out-of-distribution data. One of the reasons for this lack of generalizability is the limited scale of data in scientific disciplines in contrast to mainstream applications of AI and ML where large-scale datasets in computer vision and natural language modeling have been instrumental in the success of state-of-the-art AI/ML models. Another fundamental deficiency in black-box ML models is their tendency to produce results that are inconsistent with existing scientific theories and their inability to provide a mechanistic understanding of discovered patterns and relationships from data, limiting their usefulness in science.


When are Foundation Models Effective? Understanding the Suitability for Pixel-Level Classification Using Multispectral Imagery

arXiv.org Artificial Intelligence

Foundation models, i.e., very large deep learning models, have demonstrated impressive performance in various language and vision tasks that is otherwise difficult to reach using smaller-size models. The major success of GPT-type language models is particularly exciting and raises expectations on the potential of foundation models in other domains including satellite remote sensing. In this context, great efforts have been made to build foundation models to test their capabilities in broader applications, and examples include Prithvi by NASA-IBM, Segment-Anything-Model, ViT, etc. This leads to an important question: Are foundation models always a suitable choice for different remote sensing tasks, and when are they not? This work aims to enhance the understanding of the status and suitability of foundation models for pixel-level classification using multispectral imagery at moderate resolution, through comparisons with traditional machine learning (ML) and regular-size deep learning models. Interestingly, the results reveal that in many scenarios traditional ML models still have similar or better performance compared to foundation models, especially for tasks where texture is less useful for classification. On the other hand, deep learning models did show more promising results for tasks where labels partially depend on texture (e.g., burn scar), while the difference in performance between foundation models and deep learning models is not obvious. The results conform with our analysis: The suitability of foundation models depends on the alignment between the self-supervised learning tasks and the real downstream tasks, and the typical masked autoencoder paradigm is not necessarily suitable for many remote sensing problems.


LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models

arXiv.org Artificial Intelligence

The modeling of environmental ecosystems plays a pivotal role in the sustainable management of our planet. Accurate prediction of key environmental variables over space and time can aid in informed policy and decision-making, thus improving people's livelihood. Recently, deep learning-based methods have shown promise in modeling the spatial-temporal relationships for predicting environmental variables. However, these approaches often fall short in handling incomplete features and distribution shifts, which are commonly observed in environmental data due to the substantial cost of data collection and malfunctions in measuring instruments. To address these issues, we propose LITE -- a multimodal large language model for environmental ecosystems modeling. Specifically, LITE unifies different environmental variables by transforming them into natural language descriptions and line graph images. Then, LITE utilizes unified encoders to capture spatial-temporal dynamics and correlations in different modalities. During this step, the incomplete features are imputed by a sparse Mixture-of-Experts framework, and the distribution shift is handled by incorporating multi-granularity information from past observations. Finally, guided by domain instructions, a language model is employed to fuse the multimodal representations for the prediction. Our experiments demonstrate that LITE significantly enhances performance in environmental spatial-temporal prediction across different domains compared to the best baseline, with a 41.25% reduction in prediction error. This justifies its effectiveness. Our data and code are available at https://github.com/hrlics/LITE.