cinn
CIKAN: Constraint Informed Kolmogorov-Arnold Networks for Autonomous Spacecraft Rendezvous using Time Shift Governor
Kim, Taehyeun, Girard, Anouck, Kolmanovsky, Ilya
The paper considers a Constrained-Informed Neural Network (CINN) approximation for the Time Shift Governor (TSG), which is an add-on scheme to the nominal closed-loop system used to enforce constraints by time-shifting the reference trajectory in spacecraft rendezvous applications. We incorporate Kolmogorov-Arnold Networks (KANs), an emerging architecture in the AI community, as a fundamental component of CINN and propose a Constrained-Informed Kolmogorov-Arnold Network (CIKAN)-based approximation for TSG. We demonstrate the effectiveness of the CIKAN-based TSG through simulations of constrained spacecraft rendezvous missions on highly elliptic orbits and present comparisons between CIKANs, MLP-based CINNs, and the conventional TSG.
Blood Glucose Control Via Pre-trained Counterfactual Invertible Neural Networks
Jiang, Jingchi, Shen, Rujia, Wang, Boran, Guan, Yi
Type 1 diabetes mellitus (T1D) is characterized by insulin deficiency and blood glucose (BG) control issues. The state-of-the-art solution for continuous BG control is reinforcement learning (RL), where an agent can dynamically adjust exogenous insulin doses in time to maintain BG levels within the target range. However, due to the lack of action guidance, the agent often needs to learn from randomized trials to understand misleading correlations between exogenous insulin doses and BG levels, which can lead to instability and unsafety. To address these challenges, we propose an introspective RL based on Counterfactual Invertible Neural Networks (CINN). We use the pre-trained CINN as a frozen introspective block of the RL agent, which integrates forward prediction and counterfactual inference to guide the policy updates, promoting more stable and safer BG control. Constructed based on interpretable causal order, CINN employs bidirectional encoders with affine coupling layers to ensure invertibility while using orthogonal weight normalization to enhance the trainability, thereby ensuring the bidirectional differentiability of network parameters. We experimentally validate the accuracy and generalization ability of the pre-trained CINN in BG prediction and counterfactual inference for action. Furthermore, our experimental results highlight the effectiveness of pre-trained CINN in guiding RL policy updates for more accurate and safer BG control.
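The invertibility that affine coupling layers provide can be illustrated with a minimal sketch. This is not the authors' CINN: the conditioner `_scale_and_shift`, its tanh/sin form, and the even-length input are assumptions chosen only to keep the example self-contained. The point it shows is that the second half of the input is scaled and shifted by functions of the first half, so the inverse never has to invert those functions.

```python
import math


def _scale_and_shift(x1):
    """Hypothetical conditioner: any function of x1 works, because it is
    only ever evaluated, never inverted."""
    log_scale = [math.tanh(0.5 * v) for v in x1]  # bounded for numerical stability
    shift = [math.sin(v) + 0.1 for v in x1]
    return log_scale, shift


def coupling_forward(x):
    """Keep the first half unchanged; affinely transform the second half."""
    assert len(x) % 2 == 0, "this sketch assumes an even-length input"
    h = len(x) // 2
    x1, x2 = x[:h], x[h:]
    s, t = _scale_and_shift(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    return x1 + y2


def coupling_inverse(y):
    """Exact inverse: recompute scale and shift from the untouched half."""
    assert len(y) % 2 == 0, "this sketch assumes an even-length input"
    h = len(y) // 2
    y1, y2 = y[:h], y[h:]
    s, t = _scale_and_shift(y1)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2


x = [0.3, -1.2, 2.0, 0.5]
y = coupling_forward(x)
x_recovered = coupling_inverse(y)
```

Stacking several such layers while alternating which half is left unchanged is the usual way to let every coordinate be transformed; that stacking is omitted here.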
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > China > Heilongjiang Province > Harbin (0.05)
- North America > Greenland (0.04)
- Research Report > Promising Solution (0.48)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.34)
Enhancing the Performance of Neural Networks Through Causal Discovery and Integration of Domain Knowledge
Zhang, Xiaoge, Wang, Xiao-Lin, Fan, Fenglei, Cheung, Yiu-Ming, Bose, Indranil
In this paper, we develop a generic methodology to encode hierarchical causality structure among observed variables into a neural network in order to improve its predictive performance. The proposed methodology, called causality-informed neural network (CINN), leverages three coherent steps to systematically map the structural causal knowledge into the layer-to-layer design of the neural network while strictly preserving the orientation of every causal relationship. In the first step, CINN discovers causal relationships from observational data via directed acyclic graph (DAG) learning, where causal discovery is recast as a continuous optimization problem to avoid its combinatorial nature. In the second step, the discovered hierarchical causality structure among observed variables is systematically encoded into the neural network through a dedicated architecture and customized loss function. By categorizing variables in the causal DAG as root, intermediate, and leaf nodes, the hierarchical causal DAG is translated into CINN with a one-to-one correspondence between nodes in the causal DAG and units in the CINN while maintaining the relative order among these nodes. Regarding the loss function, both intermediate and leaf nodes in the DAG are treated as target outputs during CINN training so as to drive co-learning of causal relationships among different types of nodes. As multiple loss components emerge in CINN, we leverage the projection of conflicting gradients to mitigate gradient interference among the multiple learning tasks. Computational experiments across a broad spectrum of UCI data sets demonstrate substantial advantages of CINN in predictive performance over other state-of-the-art methods. In addition, an ablation study underscores the value of integrating structural and quantitative causal knowledge in enhancing the neural network's predictive performance incrementally.
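The "projection of conflicting gradients" mentioned in the abstract can be sketched as follows. This is a simplified illustration, not the paper's implementation: gradients are plain Python lists, the function names are hypothetical, and tasks are visited in a fixed order (published variants of gradient surgery typically randomize it). When two task gradients have a negative dot product, the conflicting component is projected out before the gradients are summed.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicting(grads):
    """Return one adjusted gradient per task.

    For task i, every other task gradient g_j with <g_i, g_j> < 0 has its
    direction removed from g_i, so the adjusted g_i no longer opposes g_j.
    """
    adjusted = []
    for i, g in enumerate(grads):
        g = list(g)
        for j, other in enumerate(grads):
            if i == j:
                continue
            d = dot(g, other)
            if d < 0:  # conflict: remove the component of g along `other`
                scale = d / dot(other, other)
                g = [gi - scale * oi for gi, oi in zip(g, other)]
        adjusted.append(g)
    return adjusted


def combined_update(grads):
    """Sum the adjusted task gradients into a single update direction."""
    return [sum(column) for column in zip(*project_conflicting(grads))]


# Two toy task gradients that partly oppose each other.
g_intermediate = [1.0, 0.0]
g_leaf = [-1.0, 1.0]
update = combined_update([g_intermediate, g_leaf])
```

In this toy case the two adjusted gradients become `[0.5, 0.5]` and `[0.0, 1.0]`, so neither opposes the other task's original gradient.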
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- South America > Paraguay > Asunción > Asunción (0.04)
- Workflow (1.00)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Government (0.92)
- Banking & Finance (0.67)
- Information Technology (0.67)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.67)
Improving Generative Model-based Unfolding with Schrödinger Bridges
Diefenbacher, Sascha, Liu, Guan-Horng, Mikuni, Vinicius, Nachman, Benjamin, Nie, Weili
Machine learning-based unfolding has enabled unbinned and high-dimensional differential cross section measurements. Two main approaches have emerged in this research area: one based on discriminative models and one based on generative models. The main advantage of discriminative models is that they learn a small correction to a starting simulation, while generative models scale better to regions of phase space with little data. We propose to use Schrödinger Bridges and diffusion models to create SBUnfold, an unfolding approach that combines the strengths of both discriminative and generative models. The key feature of SBUnfold is that its generative model maps one set of events into another without having to go through a known probability density, as is the case for normalizing flows and standard diffusion models. We show that SBUnfold achieves excellent performance compared to state-of-the-art methods on a synthetic Z+jets dataset.
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
Application-driven Validation of Posteriors in Inverse Problems
Adler, Tim J., Nölke, Jan-Hinrich, Reinke, Annika, Tizabi, Minu Dietlinde, Gruber, Sebastian, Trofimova, Dasha, Ardizzone, Lynton, Jaeger, Paul F., Buettner, Florian, Köthe, Ullrich, Maier-Hein, Lena
Current deep learning-based solutions for image analysis tasks are commonly incapable of handling problems to which multiple different plausible solutions exist. In response, posterior-based methods such as conditional Diffusion Models and Invertible Neural Networks have emerged; however, their translation is hampered by a lack of research on adequate validation. In other words, the way progress is measured often does not reflect the needs of the driving practical application. Closing this gap in the literature, we present the first systematic framework for the application-driven validation of posterior-based methods in inverse problems. As a methodological novelty, it adopts key principles from the field of object detection validation, which has a long history of addressing the question of how to locate and match multiple object instances in an image. Treating modes as instances enables us to perform mode-centric validation, using well-interpretable metrics from the application perspective. We demonstrate the value of our framework through instantiations for a synthetic toy example and two medical vision use cases: pose estimation in surgery and imaging-based quantification of functional tissue parameters for diagnostics. Our framework offers key advantages over common approaches to posterior validation in all three examples and could thus revolutionize performance assessment in inverse problems.
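The idea of treating posterior modes as instances and matching them, as object detection validation does, can be sketched in a few lines. This is an illustration under stated assumptions rather than the framework itself: modes are scalar locations, matching is greedy by distance, and `max_dist` is a hypothetical localization threshold playing the role that an overlap criterion plays in object detection.

```python
def match_modes(predicted, reference, max_dist):
    """Greedy one-to-one matching of predicted modes to reference modes.

    Candidate pairs within `max_dist` are visited from closest to farthest;
    each mode may be matched at most once. Returns the matched index pairs
    together with mode-level precision and recall.
    """
    candidates = sorted(
        (abs(p - r), i, j)
        for i, p in enumerate(predicted)
        for j, r in enumerate(reference)
        if abs(p - r) <= max_dist
    )
    used_pred, used_ref, matches = set(), set(), []
    for _, i, j in candidates:
        if i not in used_pred and j not in used_ref:
            used_pred.add(i)
            used_ref.add(j)
            matches.append((i, j))
    true_positives = len(matches)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return matches, precision, recall


# Three predicted modes, two reference modes: one prediction is spurious.
matches, precision, recall = match_modes(
    predicted=[0.1, 2.05, 5.0], reference=[0.0, 2.0], max_dist=0.5
)
```

An optimal assignment (for example the Hungarian algorithm) could replace the greedy loop; greedy matching is used here only to keep the sketch dependency-free.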
- Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
- Health & Medicine > Therapeutic Area > Oncology (0.69)
- Health & Medicine > Diagnostic Medicine > Imaging (0.68)
Unsupervised Domain Transfer with Conditional Invertible Neural Networks
Dreher, Kris K., Ayala, Leonardo, Schellenberg, Melanie, Hübner, Marco, Nölke, Jan-Hinrich, Adler, Tim J., Seidlitz, Silvia, Sellner, Jan, Studier-Fischer, Alexander, Gröhl, Janek, Nickel, Felix, Köthe, Ullrich, Seitel, Alexander, Maier-Hein, Lena
Synthetic medical image generation has evolved as a key technique for neural network training and validation. A core challenge, however, remains in the domain gap between simulations and real data. While deep learning-based domain transfer using Cycle Generative Adversarial Networks and similar architectures has led to substantial progress in the field, there are use cases in which state-of-the-art approaches still fail to generate training images that produce convincing results on relevant downstream tasks. Here, we address this issue with a domain transfer approach based on conditional invertible neural networks (cINNs). As a particular advantage, our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood training. To showcase our method's generic applicability, we apply it to two spectral imaging modalities at different scales, namely hyperspectral imaging (pixel-level) and photoacoustic tomography (image-level). According to comprehensive experiments, our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks (binary and multi-class).
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.05)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Creating Probabilistic Forecasts from Arbitrary Deterministic Forecasts using Conditional Invertible Neural Networks
Phipps, Kaleb, Heidrich, Benedikt, Turowski, Marian, Wittig, Moritz, Mikut, Ralf, Hagenmeyer, Veit
In various applications, probabilistic forecasts are required to quantify the inherent uncertainty associated with the forecast. However, numerous modern forecasting methods are still designed to create deterministic forecasts. Transforming these deterministic forecasts into probabilistic forecasts is often challenging and based on numerous assumptions that may not hold in real-world situations. Therefore, the present article proposes a novel approach for creating probabilistic forecasts from arbitrary deterministic forecasts. In order to implement this approach, we use a conditional Invertible Neural Network (cINN). More specifically, we apply a cINN to learn the underlying distribution of the data and then combine the uncertainty from this distribution with an arbitrary deterministic forecast to generate accurate probabilistic forecasts. Our approach enables the simple creation of probabilistic forecasts without complicated statistical loss functions or further assumptions. Besides showing the mathematical validity of our approach, we empirically show that our approach noticeably outperforms traditional methods for including uncertainty in deterministic forecasts and generally outperforms state-of-the-art probabilistic forecasting benchmarks.
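One way such a combination could work is sketched below. The mechanism is an assumption, since the abstract does not spell it out: the point forecast is mapped into a latent space by an invertible map, perturbed with Gaussian noise there, and mapped back, and quantiles of the resulting samples form the probabilistic forecast. The elementwise log/exp pair stands in for a trained cINN, and `sigma`, `n_samples`, and the quantile levels are hypothetical parameters.

```python
import math
import random


def to_latent(y):
    """Stand-in for the forward pass of a trained invertible network."""
    return [math.log(v) for v in y]


def from_latent(z):
    """Exact inverse of `to_latent`."""
    return [math.exp(v) for v in z]


def probabilistic_forecast(point_forecast, sigma=0.1, n_samples=500,
                           quantiles=(0.1, 0.5, 0.9), seed=0):
    """Turn a positive-valued point forecast into per-horizon quantiles."""
    rng = random.Random(seed)
    z0 = to_latent(point_forecast)
    # Sample around the forecast in latent space, then map back.
    samples = [
        from_latent([z + rng.gauss(0.0, sigma) for z in z0])
        for _ in range(n_samples)
    ]
    bands = {}
    for q in quantiles:
        idx = min(n_samples - 1, int(q * n_samples))
        bands[q] = [
            sorted(s[k] for s in samples)[idx]
            for k in range(len(point_forecast))
        ]
    return bands


bands = probabilistic_forecast([10.0, 12.0, 9.0])
lower, median, upper = bands[0.1], bands[0.5], bands[0.9]
```

Because the noise is added in latent space, the spread of the resulting intervals in data space is shaped by the invertible map rather than by an explicit distributional assumption on the forecast itself.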
- North America > United States (0.14)
- Europe > Poland (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
Characteristics-Informed Neural Networks for Forward and Inverse Hyperbolic Problems
We propose characteristics-informed neural networks (CINN), a simple and efficient machine learning approach for solving forward and inverse problems involving hyperbolic PDEs. Like physics-informed neural networks (PINN), CINN is a meshless machine learning solver with universal approximation capabilities. Unlike PINN, which enforces a PDE softly via a multi-part loss function, CINN encodes the characteristics of the PDE in a general-purpose deep neural network by adding a characteristic layer. This neural network is trained with the usual MSE data-fitting regression loss and does not require residual losses on collocation points. This leads to faster training and can avoid well-known pathologies of gradient descent optimization of multi-part PINN loss functions. This paper focuses on linear transport phenomena, in which case it is shown that, if the characteristic ODEs can be solved exactly, then the output of a CINN is an exact solution of the PDE, even at initialization, preventing the occurrence of non-physical solutions. In addition, a CINN can also be trained with soft penalty constraints that enforce, for example, periodic or Neumann boundary conditions, without losing the property that the output satisfies the PDE automatically. We also propose an architecture that extends the CINN approach to linear hyperbolic systems of PDEs. All CINN architectures proposed here can be trained end-to-end from sample data using standard deep learning software. Experiments with the simple advection equation, a stiff periodic advection equation, and an acoustics problem where data from one field is used to predict the other, unseen field, indicate that CINN is able to improve on the accuracy of the baseline PINN, in some cases by a considerable margin, while also being significantly faster to train and avoiding non-physical solutions. An extension to nonlinear PDEs is also briefly discussed.
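The claim that the output satisfies the PDE even at initialization can be checked on the simplest case, the constant-speed advection equation u_t + a u_x = 0, whose characteristics are the lines x - a t = const. The sketch below is an illustration under assumptions, not the paper's architecture: the speed `A`, the three-unit tanh "network", and its arbitrary untrained weights are all hypothetical. A characteristic layer maps (x, t) to x - a t, and any function of that single variable solves the PDE.

```python
import math

A = 1.5  # assumed constant advection speed

# Arbitrary, untrained (weight, slope, bias) triples for a tiny tanh network.
WEIGHTS = [(0.7, 1.3, -0.2), (-1.1, 0.4, 0.9), (0.5, -2.0, 0.3)]


def characteristic_layer(x, t, a=A):
    """Map (x, t) to the characteristic variable xi = x - a * t."""
    return x - a * t


def cinn(x, t):
    """Network that sees (x, t) only through the characteristic layer."""
    xi = characteristic_layer(x, t)
    return sum(w * math.tanh(b * xi + c) for w, b, c in WEIGHTS)


def generic_net(x, t):
    """A network without the characteristic layer, for contrast."""
    return math.tanh(x + t)


def pde_residual(u, x, t, a=A, h=1e-5):
    """Central-difference estimate of u_t + a * u_x at the point (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + a * u_x


points = [(0.0, 0.0), (0.4, 1.2), (-1.0, 0.3)]
cinn_residuals = [pde_residual(cinn, x, t) for x, t in points]
generic_residual = pde_residual(generic_net, 0.0, 0.0)
```

The residuals of `cinn` vanish up to finite-difference error regardless of the weights, while `generic_net` leaves a residual of about 2.5 at the origin; this is why only a data-fitting loss is needed to select the correct function of the characteristic variable.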
- Europe (0.28)
- North America > United States (0.28)