Collaborating Authors

 Okamoto, Yuji


Training Physical Neural Networks for Analog In-Memory Computing

arXiv.org Artificial Intelligence

Deep learning is a state-of-the-art methodology in numerous domains, including image recognition, natural language processing, and data generation [1]. The discovery of scaling laws in deep learning models [2, 3] has motivated the development of increasingly larger models, commonly referred to as foundation models [4, 5, 6]. Recent studies have shown that reasoning tasks can be improved through iterative computations during the inference phase [7]. While computational power continues to be a major driver of artificial intelligence (AI) advancements, the associated costs remain a significant barrier to broader adoption across diverse industries [8, 9]. This issue is especially critical in edge AI systems, where energy consumption is constrained by limited battery capacity, making more efficient computation paramount [10]. One promising strategy for enhancing energy efficiency is to fabricate dedicated hardware. Since matrix-vector multiplication is the computational core of deep learning, parallelizing it greatly enhances computational efficiency [11]. Moreover, in data-driven applications such as deep learning, a substantial portion of power consumption is due to data movement between the processor and memory, commonly referred to as the von Neumann bottleneck [12].
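To make the motivation concrete, here is a minimal sketch (not the paper's method; the noise model and all names are illustrative assumptions) of the matrix-vector multiplication at the heart of a dense layer, contrasted with a crude stand-in for an analog in-memory crossbar whose stored weights carry programming noise:

```python
# Minimal sketch: MVM as the core deep-learning operation, with a crude
# model of analog in-memory non-ideality (additive weight noise).
# All names and the noise model are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def digital_mvm(W, x):
    # Ideal digital matrix-vector product, as executed on a CPU/GPU.
    return W @ x

def analog_mvm(W, x, noise_std=0.02):
    # In-memory computing performs the same MVM in place on the memory
    # array; here we only mimic its imperfection with weight noise.
    W_analog = W + rng.normal(0.0, noise_std, W.shape)
    return W_analog @ x

W = rng.normal(size=(4, 8))
x = rng.normal(size=8)
print(digital_mvm(W, x))
print(analog_mvm(W, x))  # close to the ideal result, but perturbed
```

Training methods for physical neural networks aim to make the learned weights robust to exactly this kind of device-level deviation.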


Learning Deep Dissipative Dynamics

arXiv.org Artificial Intelligence

This study addresses the problem of strictly guaranteeing the "dissipativity" of a dynamical system represented by neural networks learned from time-series data. Dissipativity is a crucial property of dynamical systems that generalizes stability and input-output stability and is known to hold across various systems, including robotics, biological systems, and molecular dynamics. By analytically deriving the general solution to the nonlinear Kalman-Yakubovich-Popov (KYP) lemma, which gives the necessary and sufficient condition for dissipativity, we propose a differentiable projection that transforms any dynamics represented by neural networks into dissipative ones, together with a learning method for the transformed dynamics. Exploiting the generality of dissipativity, our method strictly guarantees stability, input-output stability, and energy conservation of trained dynamical systems. Finally, we demonstrate the robustness of our method against out-of-domain inputs through applications to robotic arms and fluid dynamics. Code is available at https://github.com/kojima-r/DeepDissipativeModel
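The following is a minimal sketch of the general idea of a differentiable projection, shown for the special case of plain stability (the dissipation inequality with zero supply rate, dV/dt <= 0) rather than the paper's full KYP-based construction; the fixed quadratic Lyapunov candidate and all module names are assumptions:

```python
# Sketch (NOT the authors' exact method): project a learned vector field f
# so that a Lyapunov function V cannot increase along trajectories, i.e.
# grad V(x) . f(x) <= 0. The projection is differentiable, so the
# transformed dynamics can still be trained end-to-end.
import torch
import torch.nn as nn

class ProjectedStableDynamics(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def V(self, x):
        # Simple positive-definite candidate; richer learned certificates
        # are possible, this is only for illustration.
        return 0.5 * (x ** 2).sum(dim=-1)

    def forward(self, x):
        x = x.detach().requires_grad_(True)
        f = self.f(x)
        gradV = torch.autograd.grad(self.V(x).sum(), x, create_graph=True)[0]
        # Amount by which the raw field violates dV/dt <= 0:
        viol = torch.relu((gradV * f).sum(dim=-1, keepdim=True))
        # Differentiable projection: remove the violating component of f
        # along grad V, leaving compliant fields untouched.
        return f - viol * gradV / ((gradV ** 2).sum(dim=-1, keepdim=True) + 1e-8)
```

Whenever the raw network output would increase V, the projection subtracts exactly the offending component, so the inequality holds by construction for every parameter setting, not just after training converges.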


A New Deep State-Space Analysis Framework for Patient Latent State Estimation and Classification from EHR Time Series Data

arXiv.org Artificial Intelligence

Many diseases, including cancer and chronic conditions, require extended treatment periods and long-term strategies. Machine learning and AI research focusing on electronic health records (EHRs) has emerged to address this need. Effective treatment strategies require more than capturing sequential changes in patient test values: they call for an explainable, clinically interpretable model that captures the patient's internal state over time. In this study, we propose the "deep state-space analysis framework," which applies time-series unsupervised learning to EHRs with a deep state-space model. The framework enables learning, visualizing, and clustering temporal changes in patient latent states related to disease progression. We evaluated the framework on time-series laboratory data from 12,695 cancer patients. By estimating latent states, we discovered states related to prognosis; through visualization and cluster analysis, we identified the temporal transitions of patient status and the test items that characterize the state transitions of each anticancer drug. Our framework surpasses existing methods in capturing an interpretable latent space and can be expected to deepen our understanding of disease progression from EHRs, aiding treatment adjustments and prognostic determinations.
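As a rough illustration of what a deep state-space model for latent-state estimation looks like, here is a minimal variational sketch (architecture, dimensions, and loss weighting are assumptions for illustration, not the authors' implementation): an encoder infers a posterior over latent states from observed lab values, a transition network defines the latent prior, and a decoder reconstructs the observations.

```python
# Minimal deep state-space model sketch for time-series latent-state
# estimation. Illustrative only; not the framework's actual code.
import torch
import torch.nn as nn

class DeepStateSpaceModel(nn.Module):
    def __init__(self, obs_dim, latent_dim=8, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(obs_dim, hidden, batch_first=True)
        self.to_post = nn.Linear(hidden, 2 * latent_dim)     # q(z_t | x_{1:t})
        self.transition = nn.Sequential(                      # p(z_t | z_{t-1})
            nn.Linear(latent_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 2 * latent_dim))
        self.decoder = nn.Sequential(                         # p(x_t | z_t)
            nn.Linear(latent_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, obs_dim))

    def forward(self, x):                  # x: (batch, time, obs_dim)
        h, _ = self.encoder(x)
        mu_q, logvar_q = self.to_post(h).chunk(2, dim=-1)
        # Reparameterized sample of the latent trajectory.
        z = mu_q + torch.randn_like(mu_q) * (0.5 * logvar_q).exp()
        recon = self.decoder(z)
        # Prior for step t conditioned on z_{t-1} (zero state at t=0).
        z_prev = torch.cat([torch.zeros_like(z[:, :1]), z[:, :-1]], dim=1)
        mu_p, logvar_p = self.transition(z_prev).chunk(2, dim=-1)
        # ELBO terms: reconstruction error + KL(posterior || learned prior).
        rec_loss = ((recon - x) ** 2).mean()
        kl = 0.5 * (logvar_p - logvar_q - 1
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()).mean()
        return rec_loss + kl, z            # training loss, latent trajectories
```

The returned latent trajectories `z` are what such a framework would visualize and cluster to characterize disease progression; the clinical interpretability analyses described above operate on top of these estimated states.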