Collaborating Authors

 Ren, Yihui


Generalizable Implicit Neural Representations via Parameterized Latent Dynamics for Baroclinic Ocean Forecasting

arXiv.org Artificial Intelligence

Published as a workshop paper at "Tackling Climate Change with Machine Learning", ICLR 2025.

Mesoscale ocean dynamics play a critical role in climate systems, governing heat transport, hurricane genesis, and drought patterns. However, simulating these processes at high resolution remains computationally prohibitive due to their nonlinear, multiscale nature and vast spatiotemporal domains. Implicit neural representations (INRs) reduce the computational costs as resolution-independent surrogates but fail in many-query scenarios (inverse modeling) requiring rapid evaluations across diverse parameters. We present PINROD, a novel framework combining dynamics-aware implicit neural representations with parametrized neural ordinary differential equations to address these limitations. Experiments on ocean mesoscale activity data show superior accuracy over existing baselines and improved computational efficiency compared to standard numerical simulations.


Efficient Compression of Sparse Accelerator Data Using Implicit Neural Representations and Importance Sampling

arXiv.org Artificial Intelligence

High-energy, large-scale particle colliders in nuclear and high-energy physics generate data at extraordinary rates, reaching up to $1$ terabyte and several petabytes per second, respectively. The development of real-time, high-throughput data compression algorithms capable of reducing this data to manageable sizes for permanent storage is of paramount importance. A unique characteristic of the tracking detector data is the extreme sparsity of particle trajectories in space, with an occupancy rate ranging from approximately $10^{-6}$ to $10\%$. Furthermore, for downstream tasks, a continuous representation of this data is often more useful than a voxel-based, discrete representation due to the inherently continuous nature of the signals involved. To address these challenges, we propose a novel approach using implicit neural representations for data learning and compression. We also introduce an importance sampling technique to accelerate the network training process. Our method is competitive with traditional compression algorithms, such as MGARD, SZ, and ZFP, while offering significant speed-ups and maintaining negligible accuracy loss through our importance sampling strategy.
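As a rough illustration of why importance sampling helps on very sparse data, the sketch below draws training coordinates from a toy voxel grid with probability that rises with signal magnitude. It is a minimal sketch under stated assumptions, not the authors' sampling scheme: the function name `importance_sample`, the `floor` parameter, and the grid dimensions are all hypothetical.

```python
import numpy as np

def importance_sample(values, n_samples, floor=0.05, rng=None):
    """Draw voxel coordinates with probability that rises with |value|.

    `floor` is a hypothetical knob: it keeps a nonzero chance of picking
    empty voxels so a model trained on the samples still sees zeros.
    """
    rng = np.random.default_rng() if rng is None else rng
    flat = np.abs(values).ravel().astype(float)
    peak = flat.max()
    # Normalize magnitudes to [0, 1]; an all-zero grid falls back to uniform.
    weights = flat / peak if peak > 0 else np.zeros_like(flat)
    weights = weights + floor
    probs = weights / weights.sum()
    idx = rng.choice(flat.size, size=n_samples, replace=True, p=probs)
    coords = np.stack(np.unravel_index(idx, values.shape), axis=-1)
    return coords, values.ravel()[idx]

# Toy sparse grid (assumed shape): 40 occupied voxels out of 16^3.
rng = np.random.default_rng(0)
grid = np.zeros((16, 16, 16))
hit = rng.choice(grid.size, size=40, replace=False)
grid.flat[hit] = rng.uniform(1.0, 5.0, size=40)

coords, targets = importance_sample(grid, 2000, rng=rng)
occupancy = np.count_nonzero(grid) / grid.size
sampled_fraction = np.count_nonzero(targets) / targets.size
print(f"grid occupancy: {occupancy:.4f}, occupied share of samples: {sampled_fraction:.4f}")
```

Because occupied voxels are over-represented in the draw, a training loss computed on these samples is biased toward them; whether and how to correct for that (for example with per-sample weights) is a design choice this sketch leaves open.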


Variable Rate Neural Compression for Sparse Detector Data

arXiv.org Artificial Intelligence

High-energy large-scale particle colliders generate data at extraordinary rates. Developing real-time high-throughput data compression algorithms to reduce data volume and meet the bandwidth requirement for storage has become increasingly critical. Deep learning is a promising technology that can address this challenging topic. At the newly constructed sPHENIX experiment at the Relativistic Heavy Ion Collider, a Time Projection Chamber (TPC) serves as the main tracking detector, which records three-dimensional particle trajectories in a volume of a gas-filled cylinder. In terms of occupancy, the resulting data flow can be very sparse, reaching $10^{-3}$ for proton-proton collisions. Such sparsity presents a challenge to conventional learning-free lossy compression algorithms, such as SZ, ZFP, and MGARD. In contrast, emerging deep learning-based models, particularly those utilizing convolutional neural networks for compression, have outperformed these conventional methods in terms of compression ratios and reconstruction accuracy. However, research on the efficacy of these deep learning models in handling sparse datasets, like those produced in particle colliders, remains limited. Furthermore, most deep learning models do not adapt their processing speeds to data sparsity, which affects efficiency. To address this issue, we propose a novel approach for TPC data compression via key-point identification facilitated by sparse convolution. Our proposed algorithm, BCAE-VS, achieves a $75\%$ improvement in reconstruction accuracy with a $10\%$ increase in compression ratio over the previous state-of-the-art model. Additionally, BCAE-VS achieves these results with a model size over two orders of magnitude smaller. Lastly, we have experimentally verified that as sparsity increases, so does the model's throughput.


Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems

arXiv.org Artificial Intelligence

Data-driven modeling for dynamic systems has gained widespread attention in recent years. Its inverse formulation, parameter estimation, aims to infer the inherent model parameters from observations. However, parameter degeneracy, where different combinations of parameters yield the same observable output, poses a critical barrier to accurately and uniquely identifying model parameters. In the context of the WECC composite load model (CLM) in power systems, utility practitioners have observed that CLM parameters carefully selected for one fault event may not perform satisfactorily in another fault. Here, we introduce a joint conditional diffusion model-based inverse problem solver (JCDI) that incorporates a joint conditioning architecture with simultaneous inputs of multi-event observations to improve parameter generalizability. Simulation studies on the WECC CLM show that the proposed JCDI effectively reduces uncertainties of degenerate parameters, decreasing the parameter estimation error by 42.1% compared to a single-event learning scheme. This enables the model to achieve high accuracy in predicting power trajectories under different fault events, including electronic load tripping and motor stalling, outperforming standard deep reinforcement learning and supervised learning approaches. We anticipate this work will contribute to mitigating parameter degeneracy in system dynamics, providing a general parameter estimation framework across various scientific domains.
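Parameter degeneracy, and why observing more than one event can help, can be seen in a deliberately tiny toy model. The sketch below is an assumption-laden illustration only: the functions `event_one` and `event_two` and the parameters `a` and `b` are hypothetical and have nothing to do with the actual WECC CLM equations or the JCDI architecture.

```python
import numpy as np

def event_one(a, b, t):
    # Hypothetical response in which only the product a*b is observable.
    return a * b * t

def event_two(a, b, t):
    # Hypothetical second event in which a and b enter separately.
    return a * t + b

t = np.linspace(0.0, 1.0, 50)
params_x = (2.0, 3.0)
params_y = (1.0, 6.0)  # different parameters, same product a*b = 6

# Under the first event alone, the two parameter sets are indistinguishable.
same_on_event_one = np.allclose(event_one(*params_x, t), event_one(*params_y, t))
# Adding the second event's observations separates them.
same_on_event_two = np.allclose(event_two(*params_x, t), event_two(*params_y, t))

print("identical on event one:", same_on_event_one)
print("identical on event two:", same_on_event_two)
```

In this toy setting, conditioning on both events at once removes the ambiguity that either parameter set would leave when fitted to the first event alone.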


Studying the Impact of Latent Representations in Implicit Neural Networks for Scientific Continuous Field Reconstruction

arXiv.org Artificial Intelligence

Learning a continuous and reliable representation of physical fields from sparse sampling is challenging and it affects diverse scientific disciplines. In a recent work, we present a novel model called MMGN (Multiplicative and Modulated Gabor Network) with implicit neural networks. In this work, we design additional studies leveraging explainability methods to complement the previous experiments and further enhance the understanding of latent representations generated by the model. The adopted methods are general enough to be leveraged for any latent space inspection. Preliminary results demonstrate the contextual information incorporated in the latent representations and their impact on the model performance. As a work in progress, we will continue to verify our findings and develop novel explainability approaches.


Continuous Field Reconstruction from Sparse Observations with Implicit Neural Networks

arXiv.org Artificial Intelligence

Reliably reconstructing physical fields from sparse sensor data is a challenge that frequently arises in many scientific domains. In practice, the process generating the data often is not understood to sufficient accuracy. Therefore, there is a growing interest in using the deep neural network route to address the problem. This work presents a novel approach that learns a continuous representation of the physical field using implicit neural representations (INRs). Specifically, after factorizing spatiotemporal variability into spatial and temporal components using the separation of variables technique, the method learns relevant basis functions from sparsely sampled irregular data points to develop a continuous representation of the data. In experimental evaluations, the proposed model outperforms recent INR methods, offering superior reconstruction quality on simulation data from a state-of-the-art climate model and a second dataset that comprises ultra-high resolution satellite-based sea surface temperature fields.

Achieving accurate and comprehensive representation of complex physical fields is pivotal for tasks spanning system monitoring and control, analysis, and design. However, in a multitude of applications, encompassing geophysics (Reichstein et al., 2019), astronomy (Gabbard et al., 2022), biochemistry (Zhong et al., 2021), fluid mechanics (Deng et al., 2023), and others, using a sparse sensor network proves to be the most practical and effective solution. In meteorology and oceanography, variables such as atmospheric pressure, temperature, salinity/humidity, and wind/current velocity must be reconstructed from sparsely sampled observations. Currently, two distinct approaches are used to reconstruct full fields from sparse observations. Traditional physics model-based approaches are based on partial differential equations (PDEs). These approaches draw upon theoretical techniques to derive PDEs rooted in conservation laws and fundamental physical principles (Hughes, 2012). Yet, in complex systems such as weather (Brunton et al., 2016) and epidemiology (Massucci et al., 2016), deriving comprehensive models that are both sufficiently accurate and computationally efficient remains elusive.
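For readers unfamiliar with the separation of variables technique mentioned in the abstract, the generic form of such a factorization can be written as below. This is the textbook form, given only as an assumed illustration; the symbols $u$, $a_k$, $\phi_k$, and $K$ are not taken from the listing, and the paper's actual parameterization may differ.

```latex
u(\mathbf{x}, t) \;\approx\; \sum_{k=1}^{K} a_k(t)\, \phi_k(\mathbf{x})
```

Here $u$ is the physical field, $\phi_k$ are spatial basis functions that can be evaluated at any location $\mathbf{x}$, and $a_k(t)$ are time-dependent coefficients, so the temporal and spatial variability are carried by separate factors.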


Unpaired Image Translation to Mitigate Domain Shift in Liquid Argon Time Projection Chamber Detector Responses

arXiv.org Artificial Intelligence

Deep learning algorithms often are trained and deployed on different datasets. Any systematic difference between the training and a test dataset may degrade the algorithm performance--what is known as the domain shift problem. This issue is prevalent in many scientific domains where algorithms are trained on simulated data but applied to real-world datasets. Typically, the domain shift problem is solved through various domain adaptation methods. However, these methods are often tailored for a specific downstream task and may not easily generalize to different tasks. This work explores the feasibility of using an alternative way to solve the domain shift problem that is not specific to any downstream algorithm. The proposed approach relies on modern Unpaired Image-to-Image translation techniques, designed to find translations between different image domains in a fully unsupervised fashion. In this study, the approach is applied to a domain shift problem commonly encountered in Liquid Argon Time Projection Chamber (LArTPC) detector research when seeking a way to translate samples between two differently distributed detector datasets deterministically. This translation allows for mapping real-world data into the simulated data domain where the downstream algorithms can be run with much less domain-shift-related degradation. Conversely, using the translation from the simulated data in a real-world domain can increase the realism of the simulated dataset and reduce the magnitude of any systematic uncertainties. We adapted several UI2I translation algorithms to work on scientific data and demonstrated the viability of these techniques for solving the domain shift problem with LArTPC detector data. To facilitate further development of domain adaptation techniques for scientific datasets, the "Simple Liquid-Argon Track Samples" dataset used in this study also is published.


Fast 2D Bicephalous Convolutional Autoencoder for Compressing 3D Time Projection Chamber Data

arXiv.org Machine Learning

High-energy large-scale particle colliders produce data at high speed, on the order of 1 terabyte per second in nuclear physics and petabytes per second in high-energy physics. Developing real-time data compression algorithms to reduce such data at high throughput to fit permanent storage has drawn increasing attention. Specifically, at the newly constructed sPHENIX experiment at the Relativistic Heavy Ion Collider (RHIC), a time projection chamber is used as the main tracking detector, which records particle trajectories in a volume of a three-dimensional (3D) cylinder. The resulting data are usually very sparse with occupancy around 10.8%. Such sparsity presents a challenge to conventional learning-free lossy compression algorithms, such as SZ, ZFP, and MGARD. The 3D convolutional neural network (CNN)-based approach, Bicephalous Convolutional Autoencoder (BCAE), outperforms traditional methods both in compression rate and reconstruction accuracy. BCAE can also utilize the computation power of graphical processing units suitable for deployment in a modern heterogeneous high-performance computing environment. This work introduces two BCAE variants: BCAE++ and BCAE-2D. BCAE++ achieves a 15% better compression ratio and a 77% better reconstruction accuracy measured in mean absolute error compared with BCAE. BCAE-2D treats the radial direction as the channel dimension of an image, resulting in a 3x speedup in compression throughput. In addition, we demonstrate that an unbalanced autoencoder with a larger decoder can improve reconstruction accuracy without significantly sacrificing throughput. Lastly, we observe that both BCAE++ and BCAE-2D can benefit more from using half-precision mode in throughput (76-79% increase) without loss in reconstruction accuracy. The source code and links to data and pretrained models can be found at https://github.com/BNL-DAQ-LDRD/NeuralCompression_v2.
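The idea of treating the radial direction as an image channel dimension can be sketched as a change of array layout. The example below is only an illustration under assumptions: the grid shape, the random toy data, and the variable names are hypothetical and are not taken from the linked repository.

```python
import numpy as np

# Hypothetical toy volume with axes (radial layers, azimuthal bins, z bins).
n_radial, n_azimuth, n_z = 16, 48, 64
rng = np.random.default_rng(0)
volume = rng.random((n_radial, n_azimuth, n_z))
volume[volume < 0.9] = 0.0  # zero out most voxels to mimic sparse data

# Layout a 3D CNN would consume: (batch, channels=1, depth, height, width).
batch_3d = volume[np.newaxis, np.newaxis, ...]

# Layout a 2D CNN would consume when the radial axis is used as the channel
# axis: (batch, channels=n_radial, height, width).
batch_2d = volume[np.newaxis, ...]

occupancy = np.count_nonzero(volume) / volume.size
print("3D input shape:", batch_3d.shape)
print("2D input shape:", batch_2d.shape)
print(f"toy occupancy: {occupancy:.3f}")
```

The data are identical in both layouts; what changes is that 2D convolutions slide only over the azimuthal and z axes and mix radial layers through their channel weights, which is generally cheaper than sliding a 3D kernel over all three axes.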


DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

arXiv.org Artificial Intelligence

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present the DeepSpeed4Science initiative (deepspeed4science.ai), which aims to build unique capabilities through AI system technology innovations to help domain experts unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.


Transferable Graph Neural Fingerprint Models for Quick Response to Future Bio-Threats

arXiv.org Artificial Intelligence

Fast screening of drug molecules based on the ligand binding affinity is an important step in the drug discovery pipeline. The graph neural fingerprint is a promising method for developing molecular docking surrogates with high throughput and great fidelity. In this study, we built a COVID-19 drug docking dataset of about 300,000 drug candidates on 23 coronavirus protein targets. With this dataset, we trained graph neural fingerprint docking models for high-throughput virtual COVID-19 drug screening. The graph neural fingerprint models yield high prediction accuracy on docking scores with the mean squared error lower than $0.21$ kcal/mol for most of the docking targets, showing significant improvement over conventional circular fingerprint methods. To make the neural fingerprints transferable for unknown targets, we also propose a transferable graph neural fingerprint method trained on multiple targets. With comparable accuracy to target-specific graph neural fingerprint models, the transferable model exhibits superb training and data efficiency. We highlight that the impact of this study extends beyond the COVID-19 dataset, as our approach for fast virtual ligand screening can be easily adapted and integrated into a general machine learning-accelerated pipeline to battle future bio-threats.