dde
PBPK-iPINNs: Inverse Physics-Informed Neural Networks for Physiologically Based Pharmacokinetic Brain Models
Wickramasinghe, Charuka D., Weerasinghe, Krishanthi C., Ranaweera, Pradeep K.
Physics-Informed Neural Networks (PINNs) combine machine learning with differential equations to solve direct and inverse problems, ensuring predictions follow physical laws. Physiologically based pharmacokinetic (PBPK) modeling advances beyond classical compartmental approaches by using a mechanistic, physiology-focused framework. A PBPK model is based on a system of ODEs, with each equation representing the mass balance of a drug in a compartment, such as an organ or tissue. These ODEs include parameters that reflect physiological, biochemical, and drug-specific characteristics to simulate how the drug moves through the body. In this paper, we introduce PBPK-iPINN, a method to estimate drug-specific or patient-specific parameters and drug concentration profiles in PBPK brain compartment models using inverse PINNs. We demonstrate that, for the inverse problem to converge to the correct solution, the loss function components (data loss, initial conditions loss, and residual loss) must be appropriately weighted, and the hyperparameters (including number of layers, number of neurons, activation functions, learning rate, optimizer, and collocation points) must be carefully tuned. The performance of the PBPK-iPINN approach is then compared with established traditional numerical and statistical methods.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Michigan > Lenawee County > Adrian (0.04)
- North America > United States > Michigan > Wayne County > Detroit (0.04)
- Europe > Switzerland (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
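The weighted loss described in the PBPK-iPINN abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: a one-compartment elimination model dC/dt = -k C stands in for the paper's brain model, a closed-form candidate function stands in for the neural network, and the names `composite_loss`, `w_data`, `w_ic`, and `w_res` are hypothetical.

```python
import numpy as np

def composite_loss(k, c_fn, dc_dt_fn, t_data, c_data, t_colloc, c0,
                   w_data=1.0, w_ic=1.0, w_res=1.0):
    """Weighted sum of data, initial-condition, and ODE-residual losses.

    Assumed toy model (not the paper's): dC/dt = -k * C, with C(0) = c0.
    c_fn and dc_dt_fn stand in for a network and its time derivative.
    """
    data_loss = np.mean((c_fn(t_data) - c_data) ** 2)     # fit to measurements
    ic_loss = (c_fn(np.array([0.0]))[0] - c0) ** 2        # match C(0)
    residual = dc_dt_fn(t_colloc) + k * c_fn(t_colloc)    # dC/dt + k*C should be 0
    res_loss = np.mean(residual ** 2)
    return w_data * data_loss + w_ic * ic_loss + w_res * res_loss

# Synthetic example: the candidate function is the exact solution for k_true.
k_true, c0 = 0.5, 2.0
c_fn = lambda t: c0 * np.exp(-k_true * t)
dc_dt_fn = lambda t: -k_true * c0 * np.exp(-k_true * t)
t_data = np.linspace(0.0, 4.0, 9)
c_data = c_fn(t_data)
t_colloc = np.linspace(0.0, 4.0, 41)

loss_true = composite_loss(k_true, c_fn, dc_dt_fn, t_data, c_data, t_colloc, c0)
loss_wrong = composite_loss(0.8, c_fn, dc_dt_fn, t_data, c_data, t_colloc, c0)
```

In an actual inverse PINN, `k` and the network weights would be optimized jointly against this kind of loss; the sketch only shows how the three terms and their weights combine, and that a wrong rate constant is penalized through the residual term.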
Differentiation-Based Extraction of Proprietary Data from Fine-Tuned LLMs
Li, Zongjie, Wu, Daoyuan, Wang, Shuai, Su, Zhendong
The increasing demand for domain-specific and human-aligned Large Language Models (LLMs) has led to the widespread adoption of Supervised Fine-Tuning (SFT) techniques. SFT datasets often comprise valuable instruction-response pairs, making them attractive targets for extraction. This paper studies this critical research problem for the first time. We start by formally defining and formulating the problem, then explore various attack goals, types, and variants based on the unique properties of SFT data in real-world scenarios. Based on our analysis of direct extraction behaviors, we develop a novel extraction method specifically designed for SFT models, called Differentiated Data Extraction (DDE), which exploits the confidence levels of fine-tuned models and their behavioral differences from pre-trained base models. Through extensive experiments across multiple domains and scenarios, we demonstrate the feasibility of SFT data extraction using DDE. Our results show that DDE consistently outperforms existing extraction baselines in all attack settings. To counter this new attack, we propose a defense mechanism that mitigates DDE attacks with minimal impact on model performance. Overall, our research reveals hidden data leak risks in fine-tuned LLMs and provides insights for developing more secure models.
- Asia > China > Hong Kong (0.76)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > District of Columbia > Washington (0.05)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (0.93)
The Influence of the Memory Capacity of Neural DDEs on the Universal Approximation Property
Kuehn, Christian, Kuntz, Sara-Viola
Neural Ordinary Differential Equations (Neural ODEs), which are the continuous-time analog of Residual Neural Networks (ResNets), have gained significant attention in recent years. Similarly, Neural Delay Differential Equations (Neural DDEs) can be interpreted as an infinite depth limit of Densely Connected Residual Neural Networks (DenseResNets). In contrast to traditional ResNet architectures, DenseResNets are feed-forward networks that allow for shortcut connections across all layers. These additional connections introduce memory in the network architecture, as is typical in many modern architectures. In this work, we explore how the memory capacity in neural DDEs influences the universal approximation property. The key parameter for studying the memory capacity is the product $Kτ$ of the Lipschitz constant and the delay of the DDE. In the case of non-augmented architectures, where the network width is not larger than the input and output dimensions, neural ODEs and classical feed-forward neural networks cannot have the universal approximation property. We show that if the memory capacity $Kτ$ is sufficiently small, the dynamics of the neural DDE can be approximated by a neural ODE. Consequently, non-augmented neural DDEs with a small memory capacity also lack the universal approximation property. In contrast, if the memory capacity $Kτ$ is sufficiently large, we can establish the universal approximation property of neural DDEs for continuous functions. If the neural DDE architecture is augmented, we can expand the parameter regions in which universal approximation is possible. Overall, our results show that the infinite-dimensional phase space of DDEs with positive delay $τ>0$ is not by itself sufficient to guarantee universal approximation: there is no direct jump transition as the memory capacity $Kτ$ increases, and universal approximation holds only beyond a certain memory threshold.
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Deep Discrete Encoders: Identifiable Deep Generative Models for Rich Data with Discrete Latent Layers
In the era of generative AI, deep generative models (DGMs) with latent representations have gained tremendous popularity. Despite their impressive empirical performance, the statistical properties of these models remain underexplored. DGMs are often overparametrized, non-identifiable, and uninterpretable black boxes, raising serious concerns when deploying them in high-stakes applications. Motivated by this, we propose an interpretable deep generative modeling framework for rich data types with discrete latent layers, called Deep Discrete Encoders (DDEs). A DDE is a directed graphical model with multiple binary latent layers. Theoretically, we propose transparent identifiability conditions for DDEs, which imply progressively smaller sizes of the latent layers as they go deeper. Identifiability ensures consistent parameter estimation and inspires an interpretable design of the deep architecture. Computationally, we propose a scalable estimation pipeline of a layerwise nonlinear spectral initialization followed by a penalized stochastic approximation EM algorithm. This procedure can efficiently estimate models with exponentially many latent components. Extensive simulation studies validate our theoretical results and demonstrate the proposed algorithms' excellent performance. We apply DDEs to three diverse real datasets for hierarchical topic modeling, image representation learning, and response time modeling in educational testing, and obtain interpretable findings.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Education (0.67)
- Transportation (0.66)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Geologists raise concerns over possible censorship and bias in Chinese chatbot
Geologists have raised concerns about potential Chinese censorship and bias in a chatbot being developed with the backing of the International Union of Geological Sciences (IUGS), one of the world's largest scientific organisations and a Unesco partner. The GeoGPT chatbot is aimed at geoscientists and researchers, particularly in the global south, to help them develop their understanding of earth sciences by drawing on swaths of data and research on billions of years of the planet's history. It is an initiative from Deep-time Digital Earth (DDE), a largely Chinese-funded programme founded in 2019 to enhance international scientific cooperation and help countries to realise the UN's sustainable development goals. Part of the underlying AI for GeoGPT is Qwen, a large language model built by the Chinese tech company Alibaba. Responding to the article, DDE representatives Michael Stephenson, Hans Thybo, Chengshan Wang and Ishwaran Natarajan said the chatbot also used Meta's Llama, another large language model, and that during testing they had not noticed any state censorship, which they said was "unlikely" given that the system was "based entirely in geoscience information".
- Africa > Ghana (0.06)
- North America > United States (0.05)
- Asia > China > Beijing > Beijing (0.05)
- Law > Civil Rights & Constitutional Law (0.84)
- Government > Regional Government > Asia Government > China Government (0.30)
Learning the Delay Using Neural Delay Differential Equations
Oprea, Maria, Walth, Mark, Stephany, Robert, Nothaft, Gabriella Torres, Rodriguez-Gonzalez, Arnaldo, Clark, William
The intersection of machine learning and dynamical systems has generated considerable interest recently. Neural Ordinary Differential Equations (NODEs) represent a rich overlap between these fields. In this paper, we develop a continuous time neural network approach based on Delay Differential Equations (DDEs). Our model uses the adjoint sensitivity method to learn the model parameters and delay directly from data. Our approach is inspired by that of NODEs and extends earlier neural DDE models, which have assumed that the value of the delay is known a priori. We perform a sensitivity analysis on our proposed approach and demonstrate its ability to learn DDE parameters from benchmark systems. We conclude our discussion with potential future directions and applications.
- North America > United States > New York > Tompkins County > Ithaca (0.05)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
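To make the idea of learning a delay from data concrete, here is a minimal sketch under stated assumptions. It uses a scalar linear DDE x'(t) = -a x(t - τ) with constant history, a forward-Euler integrator, and a plain grid search over candidate delays. The grid search is a deliberately simple stand-in and is not the adjoint sensitivity method the abstract describes; the names `simulate_dde` and `fit_delay` are hypothetical.

```python
import numpy as np

def simulate_dde(a, tau, x0=1.0, dt=0.01, t_end=5.0):
    """Forward-Euler solution of x'(t) = -a * x(t - tau), with x(t) = x0 for t <= 0."""
    n_steps = int(round(t_end / dt))
    lag = int(round(tau / dt))              # delay expressed in whole time steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        # Before the delay has elapsed, read from the constant history.
        delayed = x[i - lag] if i >= lag else x0
        x[i + 1] = x[i] - dt * a * delayed
    return x

def fit_delay(observed, a, candidate_taus):
    """Pick the candidate delay whose simulated trajectory best matches the data."""
    errors = [np.mean((simulate_dde(a, tau) - observed) ** 2)
              for tau in candidate_taus]
    return candidate_taus[int(np.argmin(errors))]

# Synthetic data generated with a known delay, then recovered by search.
observed = simulate_dde(a=1.0, tau=0.5)
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
best_tau = fit_delay(observed, a=1.0, candidate_taus=candidates)
```

A gradient-based approach would instead differentiate the trajectory mismatch with respect to τ and the other parameters; the sketch only shows that the delay is identifiable from trajectory data in this toy setting.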
Data-Driven Encoding: A New Numerical Method for Computation of the Koopman Operator
Ng, Jerry, Asada, Haruhiko Harry
This paper presents a data-driven method for constructing a Koopman linear model based on the Direct Encoding (DE) formula. The prevailing methods, Dynamic Mode Decomposition (DMD) and its extensions, are based on least squares estimates that can be shown to be biased towards data that are densely populated. The DE formula, consisting of inner products of a nonlinear state transition function with observable functions, does not incur this biased estimation problem and thus serves as a desirable alternative to DMD. However, the original DE formula requires knowledge of the nonlinear state equation, which is not available in many practical applications. In this paper, the DE formula is extended to a data-driven method, Data-Driven Encoding (DDE) of the Koopman operator, in which the inner products are calculated from data taken from a nonlinear dynamic system. An effective algorithm is presented for the computation of the inner products, and their convergence to true values is proven. Numerical experiments verify the effectiveness of DDE compared to Extended DMD. The experiments demonstrate robustness to data distribution and the convergent properties of DDE, guaranteeing accuracy improvements with additional sample points. Furthermore, DDE is applied to deep learning of the Koopman operator to further improve prediction accuracy.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
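For context, the least-squares baseline that the Koopman abstract contrasts with can be sketched as follows. This is the standard Extended DMD estimate, not the DDE method itself; the function name `edmd_matrix`, the linear test system, and the identity observables are assumptions chosen so the result can be checked exactly.

```python
import numpy as np

def edmd_matrix(X, Y, observables):
    """Least-squares Koopman matrix K minimizing ||g(Y) - K g(X)||_F.

    X, Y: arrays of shape (n_states, n_samples) holding snapshot pairs,
    where each column of Y is the successor of the same column of X.
    observables: maps a state array to an (n_observables, n_samples) array.
    """
    GX = observables(X)
    GY = observables(Y)
    return GY @ np.linalg.pinv(GX)   # K = g(Y) g(X)^+

# Assumed toy system: linear dynamics with identity observables,
# so the least-squares estimate should reproduce A_true.
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 50))
Y = A_true @ X
K = edmd_matrix(X, Y, observables=lambda Z: Z)
```

Because every sample enters the least-squares fit with equal weight, regions where samples cluster dominate the estimate, which is the bias toward densely populated data that the abstract refers to.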
Diagnosis of Deep Discrete-Event Systems
Lamperti, Gianfranco (University of Brescia) | Zanella, Marina (University of Brescia) | Zhao, Xiangfu (Yantai University)
An abduction-based diagnosis technique for a class of discrete-event systems (DESs), called deep DESs (DDESs), is presented. A DDES has a tree structure, where each node is a network of communicating automata, called an active unit (AU). The interaction of components within an AU gives rise to emergent events. An emergent event occurs when specific components collectively perform a sequence of transitions matching a given regular language. Any event emerging in an AU triggers the transition of a component in its parent AU. We say that the DDES has a deep behavior, in the sense that the behavior of an AU is governed not only by the events exchanged by the components within the AU but also by the events emerging from child AUs. Deep behavior characterizes not only living beings, including humans, but also artifacts, such as robots that operate in contexts at varying abstraction levels. Surprisingly, experimental results indicate that the hierarchical complexity of the system translates into a decreased computational complexity of the diagnosis task. Hence, the diagnosis technique is shown to be (formally) correct as well as (empirically) efficient.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Switzerland (0.04)
- Asia > China > Shandong Province > Yantai (0.04)
Learning Generative Models using Denoising Density Estimators
Bigdeli, Siavash A., Lin, Geng, Portenier, Tiziano, Dunbar, L. Andrea, Zwicker, Matthias
Learning generative probabilistic models that can estimate the continuous density given a set of samples, and that can sample from that density, is one of the fundamental challenges in unsupervised machine learning. In this paper we introduce a new approach to obtain such models based on what we call denoising density estimators (DDEs). A DDE is a scalar function, parameterized by a neural network, that is efficiently trained to represent a kernel density estimator of the data. Leveraging DDEs, our main contribution is to develop a novel approach to obtain generative models that sample from given densities. We prove that our algorithms to obtain both DDEs and generative models are guaranteed to converge to the correct solutions. Advantages of our approach include that we do not require specific network architectures like in normalizing flows, ordinary differential equation solvers as in continuous normalizing flows, nor do we require adversarial training as in generative adversarial networks (GANs). Finally, we provide experimental results that demonstrate practical applications of our technique.
- Europe > Switzerland > Zürich > Zürich (0.14)
- South America > Paraguay > Asunción > Asunción (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
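The kernel density estimator that a denoising density estimator is trained to represent can be written down directly for small data. The sketch below is a plain one-dimensional Gaussian KDE, assuming a fixed bandwidth `sigma`; it does not involve a neural network or the paper's training procedure, and `gaussian_kde` is a hypothetical helper name.

```python
import numpy as np

def gaussian_kde(samples, sigma):
    """Return a function evaluating a 1-D Gaussian kernel density estimate."""
    samples = np.asarray(samples, dtype=float)

    def density(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        # Pairwise standardized distances: shape (n_query, n_samples).
        diffs = (x[:, None] - samples[None, :]) / sigma
        kernels = np.exp(-0.5 * diffs ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        return kernels.mean(axis=1)   # average of one Gaussian bump per sample

    return density

samples = [-1.0, 0.0, 0.5, 2.0]
density = gaussian_kde(samples, sigma=0.5)

# Sanity check: a density should integrate to approximately one.
grid = np.linspace(-10.0, 10.0, 4001)
total_mass = float(np.sum(density(grid)) * (grid[1] - grid[0]))
```

Evaluating this estimator costs one kernel per training sample per query, which is one motivation for representing it with a parameterized scalar function instead.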
On Automating the Doctrine of Double Effect
Govindarajulu, Naveen Sundar, Bringsjord, Selmer
The doctrine of double effect ($\mathcal{DDE}$) is a long-studied ethical principle that governs when actions that have both positive and negative effects are to be allowed. The goal in this paper is to automate $\mathcal{DDE}$. We briefly present $\mathcal{DDE}$, and use a first-order modal logic, the deontic cognitive event calculus, as our framework to formalize the doctrine. We present formalizations of increasingly stronger versions of the principle, including what is known as the doctrine of triple effect. We then use our framework to successfully simulate scenarios that have been used to test for the presence of the principle in human subjects. Our framework can be used in two different modes: One can use it to build $\mathcal{DDE}$-compliant autonomous systems from scratch, or one can use it to verify that a given AI system is $\mathcal{DDE}$-compliant, by applying a $\mathcal{DDE}$ layer on an existing system or model. For the latter mode, the underlying AI system can be built using any architecture (planners, deep neural networks, Bayesian networks, knowledge-representation systems, or a hybrid); as long as the system exposes a few parameters in its model, such verification is possible. The role of the $\mathcal{DDE}$ layer here is akin to a (dynamic or static) software verifier that examines existing software modules. Finally, we end by presenting initial work on how one can apply our $\mathcal{DDE}$ layer to the STRIPS-style planning model, and to a modified POMDP model. This is preliminary work to illustrate the feasibility of the second mode, and we hope that our initial sketches can be useful for other researchers in incorporating $\mathcal{DDE}$ in their own frameworks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)