Kawahara, Yoshinobu
Learning Stochastic Nonlinear Dynamics with Embedded Latent Transfer Operators
Ke, Naichang, Tanaka, Ryogo, Kawahara, Yoshinobu
We consider an operator-based latent Markov representation of a stochastic nonlinear dynamical system, where the stochastic evolution of the latent state embedded in a reproducing kernel Hilbert space is described by the corresponding transfer operator, and develop a spectral method to learn this representation based on the theory of stochastic realization. The embedding itself may be learned simultaneously, using reproducing kernels constructed, for example, with feed-forward neural networks. We also address generalizations of sequential state estimation (Kalman filtering) in stochastic nonlinear systems and of operator-based eigenmode decomposition of dynamics for this representation. Several examples with synthetic and real-world data illustrate the empirical characteristics of our methods and investigate the performance of our model in sequential state estimation and mode decomposition.
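As a minimal, hedged illustration of the basic ingredient (not the paper's full method, which learns the embedding jointly and builds on stochastic realization theory), the sketch below estimates the action of a transfer operator on RKHS embeddings from snapshot pairs via kernel ridge regression; the RBF kernel, its bandwidth gamma, and the regularizer reg are illustrative assumptions.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gram matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_embedded_predictor(X, Y, gamma=1.0, reg=1e-4):
    # Kernel ridge regression estimate of the conditional mean E[Y | x]: the
    # finite-sample form of applying the estimated transfer operator to the
    # kernel embedding of the current state.
    W = np.linalg.solve(rbf_kernel(X, X, gamma) + reg * np.eye(len(X)), Y)
    return lambda x: rbf_kernel(np.atleast_2d(x), X, gamma) @ W

# Toy demo on a noisy logistic map x_{t+1} = 3.7 x_t (1 - x_t) + noise.
rng = np.random.default_rng(0)
x = np.empty(500)
x[0] = 0.3
for t in range(499):
    x[t + 1] = 3.7 * x[t] * (1 - x[t]) + 0.01 * rng.standard_normal()
predict = fit_embedded_predictor(x[:-1, None], x[1:, None], gamma=20.0)
print(predict(0.5))  # close to 3.7 * 0.5 * 0.5 = 0.925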
Change-Point Detection in Industrial Data Streams based on Online Dynamic Mode Decomposition with Control
Wadinger, Marek, Kvasnica, Michal, Kawahara, Yoshinobu
We propose a novel change-point detection method based on online Dynamic Mode Decomposition with control (ODMDwC). Leveraging ODMDwC's ability to find and track a linear approximation of a nonlinear system while incorporating control effects, the proposed method dynamically adapts to the system's changing behavior caused by aging and seasonality. This approach enables the detection of changes in spatial, temporal, and spectral patterns, providing a robust solution that preserves the correspondence between the score and the extent of change in the system dynamics. We formulate a truncated version of ODMDwC and utilize higher-order time-delay embeddings to mitigate noise and extract broad-band features. Our method addresses the challenges faced in industrial settings, where safety-critical systems generate non-uniform data streams while requiring timely and accurate change-point detection to protect profits and lives. Our results demonstrate that this method yields intuitive and improved detection results compared to the singular-value-decomposition-based method. We validate our approach on synthetic and real-world data, showing its competitiveness with other approaches on benchmark datasets of complex systems. We also provide guidelines for hyperparameter selection that enhance the method's practical applicability.
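A hedged sketch of the batch building blocks (the paper's ODMDwC updates the decomposition online rather than refitting): time-delay embedding, a truncated least-squares fit of X' ≈ A X + B U, and a residual-based change score. The rank r and the residual score are illustrative assumptions.

import numpy as np

def delay_embed(x, h):
    # Stack h consecutive snapshots per column (higher-order time-delay
    # embedding) to mitigate noise and expose broad-band features.
    d, n = x.shape
    return np.vstack([x[:, i:n - h + 1 + i] for i in range(h)])

def dmdc(X, Xp, U, r):
    # Truncated least-squares fit of X' ≈ A X + B U via a rank-r SVD of the
    # stacked data matrix [X; U] (batch DMD with control).
    Omega = np.vstack([X, U])
    Q, s, Vt = np.linalg.svd(Omega, full_matrices=False)
    Q, s, Vt = Q[:, :r], s[:r], Vt[:r]
    G = Xp @ Vt.T @ np.diag(1.0 / s) @ Q.T   # G = [A, B]
    return G[:, :X.shape[0]], G[:, X.shape[0]:]

def change_score(A, B, X, Xp, U):
    # Normalized one-step prediction residual as a change-point score.
    return np.linalg.norm(Xp - A @ X - B @ U) / np.linalg.norm(Xp)

Refitting (A, B) on a sliding window and monitoring change_score gives a simple offline surrogate for the online procedure described above.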
Koopman Spectrum Nonlinear Regulators and Efficient Online Learning
Ohnishi, Motoya, Ishikawa, Isao, Lowrey, Kendall, Ikeda, Masahiro, Kakade, Sham, Kawahara, Yoshinobu
Most modern reinforcement learning algorithms optimize a cumulative single-step cost along a trajectory. The optimized motions are often 'unnatural', representing, for example, behaviors with sudden accelerations that waste energy and lack predictability. In this work, we present a novel paradigm for controlling nonlinear systems via minimization of the Koopman spectrum cost: a cost defined over the Koopman operator of the controlled dynamics. This induces a broader class of dynamical behaviors that evolve over stable manifolds, such as nonlinear oscillators, closed loops, and smooth movements. We demonstrate that some characterizations of dynamics that are not possible with a cumulative cost become feasible in this paradigm, which generalizes classical eigenstructure and pole assignments to nonlinear decision making. Moreover, we present a sample-efficient online learning algorithm for our problem that enjoys a sub-linear regret bound under some structural assumptions.
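The spectrum cost is defined abstractly over the Koopman operator; as one concrete, hedged instance (an assumption, not the paper's exact cost), the sketch below estimates a linear operator from a rollout by least squares and penalizes eigenvalue magnitudes above a target, discouraging diverging behaviors.

import numpy as np

def koopman_matrix(X):
    # Least-squares one-step operator for a trajectory matrix X of shape
    # (features, time): X[:, 1:] ≈ K X[:, :-1].
    return X[:, 1:] @ np.linalg.pinv(X[:, :-1])

def spectrum_cost(X, target=1.0):
    # Penalize eigenvalue magnitudes above `target`, one illustrative way to
    # shape the spectrum of the controlled dynamics toward stability.
    eig = np.linalg.eigvals(koopman_matrix(X))
    return float(np.sum(np.maximum(np.abs(eig) - target, 0.0)))

# A stable rotation incurs (numerically) zero cost; an expanding one pays 0.2.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
X = np.stack([np.linalg.matrix_power(R, t) @ np.array([1.0, 0.0])
              for t in range(50)], axis=1)
print(spectrum_cost(X), spectrum_cost(X * 1.1 ** np.arange(50)))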
The Disappearance of Timestep Embedding in Modern Time-Dependent Neural Networks
Kim, Bum Jun, Kawahara, Yoshinobu, Kim, Sang Woo
Dynamical systems are often time-varying, and modeling them requires a function that evolves with respect to time. Recent studies, such as those on neural ordinary differential equations, have proposed time-dependent neural networks, i.e., neural networks whose behavior varies with time. However, we claim that the architectural choices made in building a time-dependent neural network significantly affect its time-awareness, and that these choices currently lack sufficient validation. In this study, we conduct an in-depth analysis of the architecture of modern time-dependent neural networks. We report a vulnerability of vanishing timestep embedding, which disables the time-awareness of a time-dependent neural network. Furthermore, we find that this vulnerability can also be observed in diffusion models, because they employ a similar architecture that incorporates a timestep embedding to discriminate between different timesteps during the diffusion process. Our analysis provides a detailed description of this phenomenon as well as several solutions to address its root cause. Through experiments on neural ordinary differential equations and diffusion models, we observe that preserving time-awareness via the proposed solutions boosts performance, which implies that current implementations lack sufficient time-dependency.
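One way a timestep embedding can vanish is sketched below, under the assumption that the embedding enters a block as a per-channel bias followed by per-channel normalization (a common UNet-style pattern); whether this covers every failure mode analyzed in the paper is not implied.

import numpy as np

rng = np.random.default_rng(0)
h = rng.standard_normal((8, 4, 4))        # (channels, height, width) features

def block(h, t_bias):
    # Timestep embedding injected as a per-channel bias (constant over space)
    # followed by per-channel normalization.
    z = h + t_bias[:, None, None]
    mu = z.mean(axis=(1, 2), keepdims=True)   # the channel mean absorbs t_bias
    sd = z.std(axis=(1, 2), keepdims=True)
    return (z - mu) / sd                      # ...so the output ignores t

out_a = block(h, t_bias=np.zeros(8))
out_b = block(h, t_bias=100.0 * np.ones(8))   # a very different "timestep"
print(np.abs(out_a - out_b).max())            # ~1e-13: the embedding vanished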
Koopman operators with intrinsic observables in rigged reproducing kernel Hilbert spaces
Ishikawa, Isao, Hashimoto, Yuka, Ikeda, Masahiro, Kawahara, Yoshinobu
This paper presents a novel approach for estimating the Koopman operator defined on a reproducing kernel Hilbert space (RKHS) and its spectra. We propose an estimation method, which we call Jet Dynamic Mode Decomposition (JetDMD), that leverages the intrinsic structure of the RKHS and the geometric notion of jets to enhance the estimation of the Koopman operator. This method refines the traditional Extended Dynamic Mode Decomposition (EDMD) in accuracy, especially in the numerical estimation of eigenvalues. The paper proves JetDMD's superiority through explicit error bounds and convergence rates for special positive definite kernels, offering a solid theoretical foundation for its performance. We also delve into the spectral analysis of the Koopman operator, proposing the notion of an extended Koopman operator within the framework of rigged Hilbert spaces. This notion leads to a deeper understanding of estimated Koopman eigenfunctions and allows them to be captured outside the original function space. Through the theory of rigged Hilbert spaces, our study provides a principled methodology for analyzing the estimated spectrum and eigenfunctions of Koopman operators, and enables eigendecomposition within a rigged RKHS. We also propose a new, effective method for reconstructing the dynamical system from temporally sampled trajectory data, with solid theoretical guarantees. We conduct several numerical simulations using the van der Pol oscillator, the Duffing oscillator, the H\'enon map, and the Lorenz attractor, and illustrate the performance of JetDMD with clear numerical computations of eigenvalues and accurate predictions of the dynamical systems.
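For orientation, here is a hedged sketch of the EDMD baseline that JetDMD refines (JetDMD's jet-based dictionary and error analysis are not reproduced here): fit the Koopman matrix on a finite dictionary by least squares and read off its eigenvalues. The monomial dictionary and the toy map are assumptions for illustration.

import numpy as np

def edmd_eigenvalues(X, Y, psi):
    # psi maps an (n, d) batch of states to (n, m) dictionary evaluations;
    # the Koopman matrix K solves psi(Y) ≈ psi(X) K in least squares.
    K = np.linalg.lstsq(psi(X), psi(Y), rcond=None)[0]
    return np.linalg.eigvals(K)

# Monomial dictionary on the linear toy map x' = 0.9 x, whose Koopman
# eigenvalues on span{x, x^2, x^3} are exactly 0.9, 0.81, and 0.729.
psi = lambda X: np.hstack([X ** k for k in range(1, 4)])
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (200, 1))
print(np.sort(np.abs(edmd_eigenvalues(X, 0.9 * X, psi))))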
Glocal Hypergradient Estimation with Koopman Operator
Hataya, Ryuichiro, Kawahara, Yoshinobu
Gradient-based hyperparameter optimization methods update hyperparameters using hypergradients, i.e., gradients of a meta criterion with respect to the hyperparameters. Previous research has used two distinct update strategies: optimizing hyperparameters using global hypergradients obtained after completing model training, or using local hypergradients derived after every few model updates. While global hypergradients offer reliability, their computational cost is significant; conversely, local hypergradients provide speed but are often suboptimal. In this paper, we propose glocal hypergradient estimation, blending "global" quality with "local" efficiency. To this end, we use Koopman operator theory to linearize the dynamics of hypergradients, so that the global hypergradients can be efficiently approximated using only a trajectory of local hypergradients. Consequently, we can optimize hyperparameters greedily using the estimated global hypergradients, achieving both reliability and efficiency simultaneously. Through numerical experiments on hyperparameter optimization, including the optimization of optimizers, we demonstrate the effectiveness of glocal hypergradient estimation.
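A hedged sketch of the linearize-and-extrapolate idea (the paper's exact estimator is not reproduced): fit affine dynamics to the trajectory of local hypergradients as a DMD-style surrogate for the Koopman linearization, and take the fixed point of the fitted dynamics as an estimate of the converged, global hypergradient. Stability of the fitted dynamics is an assumption.

import numpy as np

def glocal_estimate(G):
    # G: (T, p) trajectory of local hypergradients. Fit g_{t+1} ≈ A g_t + b
    # by least squares and return the fixed point g* = (I - A)^{-1} b.
    X = np.hstack([G[:-1], np.ones((len(G) - 1, 1))])
    W = np.linalg.lstsq(X, G[1:], rcond=None)[0]
    A, b = W[:-1].T, W[-1]
    return np.linalg.solve(np.eye(len(b)) - A, b)

# Demo: local hypergradients relaxing geometrically toward [1, -2].
A_true = np.diag([0.5, 0.8])
g_star = np.array([1.0, -2.0])
G = [np.array([5.0, 5.0])]
for _ in range(9):
    G.append(A_true @ G[-1] + (np.eye(2) - A_true) @ g_star)
print(glocal_estimate(np.array(G)))   # approx. [ 1. -2.]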
Adaptive action supervision in reinforcement learning from real-world multi-agent demonstrations
Fujii, Keisuke, Tsutsui, Kazushi, Scott, Atom, Nakahara, Hiroshi, Takeishi, Naoya, Kawahara, Yoshinobu
Modeling real-world biological multi-agent behavior is a fundamental problem in various scientific and engineering fields. Reinforcement learning (RL) is a powerful framework for generating flexible and diverse behaviors in cyberspace; however, when modeling real-world biological multi-agents, there is a domain gap between behaviors in the source (i.e., real-world data) and the target (i.e., the cyberspace used for RL), and the source environment parameters are usually unknown. In this paper, we propose a method for adaptive action supervision in RL from real-world demonstrations in multi-agent scenarios. We adopt an approach that combines RL and supervised learning, selecting demonstration actions during RL based on the minimum dynamic-time-warping distance so as to utilize information about the unknown source dynamics. This approach can be easily applied to many existing neural network architectures and provides an RL model that balances reproducibility, as imitation, with the generalization ability to obtain rewards in cyberspace. In experiments using chase-and-escape and football tasks, with differing dynamics between the unknown source and target environments, we show that our approach achieves a balance between reproducibility and generalization ability compared with the baselines. In particular, we used the tracking data of professional football players as expert demonstrations in football, and show successful performance despite a larger gap between source and target behaviors than in the chase-and-escape task.
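A minimal sketch of the selection step (the surrounding RL and supervised losses are omitted): a textbook dynamic-time-warping distance and a function that picks the demonstration closest to the agent's rollout, whose actions can then supervise the RL update. The toy trajectories below are assumptions for illustration.

import numpy as np

def dtw(a, b):
    # Classic O(len(a) * len(b)) dynamic-time-warping distance between two
    # trajectories given as (T, d) arrays.
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[-1, -1]

def nearest_demonstration(rollout, demos):
    # Index of the demonstration closest to the rollout under DTW.
    return int(np.argmin([dtw(rollout, demo) for demo in demos]))

rng = np.random.default_rng(0)
rollout = np.cumsum(rng.standard_normal((30, 2)), axis=0)
demos = [rollout[::2] + 0.1 * rng.standard_normal((15, 2)),  # resampled copy
         np.cumsum(rng.standard_normal((40, 2)), axis=0)]    # unrelated path
print(nearest_demonstration(rollout, demos))                 # expected: 0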
Fast, accurate, and interpretable decoding of electrocorticographic signals using dynamic mode decomposition
Fukuma, Ryohei, Majima, Kei, Kawahara, Yoshinobu, Yamashita, Okito, Shiraishi, Yoshiyuki, Kishima, Haruhiko, Yanagisawa, Takufumi
Dynamic mode (DM) decomposition decomposes spatiotemporal signals into basic oscillatory components (DMs). DMs can improve the accuracy of neural decoding when used with the nonlinear Grassmann kernel, compared to conventional power features. However, such kernel-based machine learning algorithms have three limitations: a large computational time that prevents real-time application, incompatibility with non-kernel algorithms, and low interpretability. Here, we propose a mapping function corresponding to the Grassmann kernel that explicitly transforms DMs into spatial DM (sDM) features, which can be used in any machine learning algorithm. Using electrocorticographic signals recorded during various movement and visual perception tasks, the sDM features were shown to improve decoding accuracy and reduce computation time compared to conventional methods. Furthermore, the components of the sDM features informative for decoding showed characteristics similar to the high-$\gamma$ power of the signals, but with higher trial-to-trial reproducibility. The proposed sDM features enable fast, accurate, and interpretable neural decoding.
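A hedged sketch of the pipeline's two stages: exact DMD on a multichannel epoch, followed by an explicit feature map for the projection-type Grassmann kernel (the inner product of subspace projectors). The paper's sDM mapping corresponds to the specific Grassmann kernel used there, which may differ in detail from this standard construction.

import numpy as np

def dmd_modes(X, r):
    # Exact DMD: fit X[:, 1:] ≈ A X[:, :-1] with a rank-r truncated SVD and
    # return the spatial modes (columns of Phi) and their eigenvalues.
    U, s, Vt = np.linalg.svd(X[:, :-1], full_matrices=False)
    U, s, Vt = U[:, :r], s[:r], Vt[:r]
    A_tilde = U.conj().T @ X[:, 1:] @ Vt.conj().T @ np.diag(1.0 / s)
    lam, W = np.linalg.eig(A_tilde)
    Phi = X[:, 1:] @ Vt.conj().T @ np.diag(1.0 / s) @ W
    return Phi, lam

def grassmann_features(X, r):
    # Explicit feature map for the projection Grassmann kernel
    # k(X1, X2) = ||Q1^H Q2||_F^2 = <vec(P1), vec(P2)>, where Q spans the DMD
    # modes and P = Q Q^H projects onto that subspace. The flattened vector
    # can be fed to any (non-kernel) classifier.
    Phi, _ = dmd_modes(X, r)
    Q, _ = np.linalg.qr(Phi)
    P = Q @ Q.conj().T
    return np.concatenate([P.real.ravel(), P.imag.ravel()])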
Many-body Approximation for Non-negative Tensors
Ghalamkari, Kazu, Sugiyama, Mahito, Kawahara, Yoshinobu
We present an alternative approach to decomposing non-negative tensors, called many-body approximation. Traditional decomposition methods assume low-rankness in the representation, resulting in difficulties in global optimization and target-rank selection. We avoid these problems through energy-based modeling of tensors, where a tensor and its modes correspond to a probability distribution and its random variables, respectively. Our model can be globally optimized in terms of KL divergence minimization by taking into account the interactions between variables (that is, modes), which can be tuned more intuitively than ranks. Furthermore, we visualize interactions between modes as tensor networks and reveal a nontrivial relationship between many-body approximation and low-rank approximation. We demonstrate the effectiveness of our approach in tensor completion and approximation.
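The simplest instance has a closed form and is sketched below: with all interactions between modes dropped (a "one-body" model), the KL-optimal approximation of a normalized non-negative tensor is the outer product of its mode-wise marginals. Models with pairwise or higher interactions, as in the paper, require iterative optimization instead.

import numpy as np

def one_body_approximation(T):
    # Treat the normalized tensor as a joint distribution and approximate it
    # by the product of its marginals; this minimizes KL(P || Q) over all
    # fully factorized Q.
    total = T.sum()
    P = T / total
    axes = range(P.ndim)
    marginals = [P.sum(axis=tuple(a for a in axes if a != k)) for k in axes]
    Q = marginals[0]
    for m in marginals[1:]:
        Q = np.multiply.outer(Q, m)
    return total * Q  # rescale back to the input's total mass

rng = np.random.default_rng(0)
T = rng.random((3, 4, 5))
T1 = one_body_approximation(T)
print(T.sum(), T1.sum())  # total mass is preserved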
Stable Invariant Models via Koopman Spectra
Konishi, Takuya, Kawahara, Yoshinobu
Weight-tied models have attracted attention in the modern development of neural networks. The deep equilibrium model (DEQ) represents infinitely deep neural networks with weight tying, and recent studies have shown the potential of this type of approach. DEQs need to iteratively solve root-finding problems during training and are built on the assumption that the underlying dynamics determined by the model converge to a fixed point. In this paper, we present the stable invariant model (SIM), a new class of deep models that, in principle, approximates DEQs under stability and extends the dynamics to more general ones converging to an invariant set (not restricted to a fixed point). The key ingredient in deriving SIMs is a representation of the dynamics with the spectra of the Koopman and Perron--Frobenius operators. This perspective approximately reveals the stable dynamics underlying DEQs and then leads to two variants of SIMs. We also propose an implementation of SIMs that can be trained in the same way as feedforward models. We illustrate the empirical performance of SIMs through experiments and demonstrate that SIMs achieve comparable or superior performance to DEQs on several learning tasks.
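For context, the sketch below shows the fixed-point problem a DEQ solves in every forward pass, with a contraction enforced by construction so the iteration provably converges; SIMs replace this convergence-to-a-fixed-point assumption with a Koopman-spectral representation. The layer sizes and the tanh layer are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16))
W *= 0.9 / np.linalg.norm(W, 2)   # largest singular value < 1: a contraction
U = rng.standard_normal((16, 8))
b = rng.standard_normal(16)

def deq_forward(x, tol=1e-9, max_iter=500):
    # Weight-tied layer iterated to its fixed point z* = tanh(W z* + U x + b),
    # the root-finding problem underlying a DEQ's forward pass.
    z = np.zeros(16)
    for _ in range(max_iter):
        z_next = np.tanh(W @ z + U @ x + b)
        if np.linalg.norm(z_next - z) < tol:
            break
        z = z_next
    return z

print(deq_forward(rng.standard_normal(8))[:4])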