Collaborating Authors

 Matsubara, Takashi


Learning Hamiltonian Density Using DeepONet

arXiv.org Artificial Intelligence

In recent years, deep learning for modeling physical phenomena that can be described by partial differential equations (PDEs) has received significant attention. For example, for learning Hamiltonian mechanics, methods based on deep neural networks such as Hamiltonian Neural Networks (HNNs) and their variants have achieved progress. However, existing methods typically depend on the discretization of data, and the required differential operators often have to be determined in advance. Instead, in this work, we propose an operator learning approach for modeling wave equations. In particular, we present a method to compute the variational derivatives that are needed to formulate the equations using the automatic differentiation algorithm. The experiments demonstrated that the proposed method is able to learn the operator that defines the Hamiltonian density of waves from data with unspecified discretization, without determining the differential operators.


Poisson-Dirac Neural Networks for Modeling Coupled Dynamical Systems across Domains

arXiv.org Artificial Intelligence

Deep learning has achieved great success in modeling dynamical systems, providing data-driven simulators to predict complex phenomena, even without known governing equations. However, existing models have two major limitations: their narrow focus on mechanical systems and their tendency to treat systems as monolithic. These limitations reduce their applicability to dynamical systems in other domains, such as electrical and hydraulic systems, and to coupled systems. To address these limitations, we propose Poisson-Dirac Neural Networks (PoDiNNs), a novel framework based on the Dirac structure that unifies the port-Hamiltonian and Poisson formulations from geometric mechanics. This framework enables a unified representation of various dynamical systems across multiple domains as well as their interactions and degeneracies arising from couplings. Our experiments demonstrate that PoDiNNs offer improved accuracy and interpretability in modeling unknown coupled dynamical systems from data.


Deep Curvilinear Editing: Commutative and Nonlinear Image Manipulation for Pretrained Deep Generative Model

arXiv.org Artificial Intelligence

Semantic editing of images is the fundamental goal of computer vision. Although deep learning methods, such as generative adversarial networks (GANs), are capable of producing high-quality images, they often do not have an inherent way of editing generated images semantically. Recent studies have investigated a way of manipulating the latent variable to determine the images to be generated. However, methods that assume linear semantic arithmetic have certain limitations in terms of the quality of image editing, whereas methods that discover nonlinear semantic pathways provide non-commutative editing, which is inconsistent when applied in different orders. This study proposes a novel method called deep curvilinear editing (DeCurvEd) to determine semantic commuting vector fields on the latent space. We theoretically demonstrate that owing to commutativity, the editing of multiple attributes depends only on the quantities and not on the order. Furthermore, we experimentally demonstrate that compared to previous methods, the nonlinear and commutative nature of DeCurvEd facilitates the disentanglement of image attributes and provides higher-quality editing.
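The commutativity claim in the DeCurvEd abstract can be illustrated with a toy sketch. It assumes, purely for illustration, that edits are translations along the axes of a curvilinear coordinate system given by an invertible map; the map `to_curvilinear` and the two-dimensional latent space are hypothetical stand-ins, not the model from the paper.

```python
def to_curvilinear(z):
    """Hypothetical invertible map from latent space to curvilinear coordinates."""
    x, y = z
    return (x, y + x * x)


def from_curvilinear(u):
    """Exact inverse of to_curvilinear."""
    a, b = u
    return (a, b - a * a)


def edit(z, axis, amount):
    """Edit one attribute by translating along one curvilinear axis."""
    u = list(to_curvilinear(z))
    u[axis] += amount
    return from_curvilinear(tuple(u))


z0 = (0.5, -1.0)
a_then_b = edit(edit(z0, 0, 0.3), 1, 0.7)
b_then_a = edit(edit(z0, 1, 0.7), 0, 0.3)
print(a_then_b, b_then_a)  # equal up to floating-point rounding
```

In latent coordinates an edit along axis 0 traces a curved path, yet the two edits reach the same point in either order, because both are plain translations in the mapped coordinates.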


Good Lattice Training: Physics-Informed Neural Networks Accelerated by Number Theory

arXiv.org Artificial Intelligence

Physics-informed neural networks (PINNs) offer a novel and efficient approach to solving partial differential equations (PDEs). Their success lies in the physics-informed loss, which trains a neural network to satisfy a given PDE at specific points and to approximate the solution. However, the solutions to PDEs are inherently infinite-dimensional, and the distance between the output and the solution is defined by an integral over the domain. Therefore, the physics-informed loss only provides a finite approximation, and selecting appropriate collocation points becomes crucial to suppress the discretization errors, although this aspect has often been overlooked. In this paper, we propose a new technique called good lattice training (GLT) for PINNs, inspired by number theoretic methods for numerical analysis. GLT offers a set of collocation points that are effective even with a small number of points and for multi-dimensional spaces. Our experiments demonstrate that GLT requires 2 to 20 times fewer collocation points (resulting in lower computational cost) than uniformly random sampling or Latin hypercube sampling, while achieving competitive performance.
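As a rough illustration of lattice-based collocation points, the sketch below builds a two-dimensional rank-1 lattice with a Fibonacci generating vector, a classical construction from number-theoretic numerical analysis. The abstract does not state which lattice GLT uses, so the function name `fibonacci_lattice` and the choice of generating vector are assumptions made for this example.

```python
def fibonacci_lattice(k):
    """2-D rank-1 lattice with N = F_k points and generating vector (1, F_{k-1})."""
    a, b = 1, 1  # F_1, F_2
    for _ in range(k - 2):
        a, b = b, a + b
    n, z = b, a  # N = F_k, z = F_{k-1}
    # Point j is the fractional part of j * (1, z) / N.
    return [(j / n, (j * z % n) / n) for j in range(n)]


points = fibonacci_lattice(10)  # F_10 = 55 points in [0, 1)^2
print(len(points), points[1])
```

Such points could replace uniformly random samples when averaging a PDE residual over the unit square; each coordinate takes every value j/N exactly once, so both one-dimensional projections are evenly spaced.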


FINDE: Neural Differential Equations for Finding and Preserving Invariant Quantities

arXiv.org Artificial Intelligence

Many real-world dynamical systems are associated with first integrals. The discovery and understanding of first integrals are fundamental and important topics both in the natural sciences and in industrial applications. First integrals arise from the conservation laws of system energy, momentum, and mass, and from constraints on states; these are typically related to specific geometric structures of the governing equations. Existing neural networks designed to ensure such first integrals have shown excellent accuracy in modeling from data. However, these models incorporate the underlying structures, and in most situations where neural networks learn unknown systems, these structures are also unknown. This limitation needs to be overcome for scientific discovery and modeling of unknown systems. To this end, we propose first integral-preserving neural differential equation (FINDE). By leveraging the projection method and the discrete gradient method, FINDE finds and preserves first integrals from data, even in the absence of prior knowledge about underlying structures. Experimental results demonstrate that FINDE can predict future states of target systems over much longer time horizons and find various quantities consistent with well-known first integrals in a unified manner.

Modeling and predicting real-world systems are fundamental aspects of understanding the world in natural science and improving computer simulations in industry. Target systems include chemical dynamics for discovering new drugs (Raff et al., 2012), climate dynamics for climate change prediction and weather forecasting (Rasp et al., 2020; Trigo & Palutikof, 1999), and physical dynamics of vehicles and robots for optimal control (Nelles, 2001). In addition to image processing and natural language processing (Devlin et al., 2018; He et al., 2016), neural networks have been actively studied for modeling dynamical systems (Nelles, 2001). Their history dates back to at least the 1990s (see Chen et al. (1990); Clouse et al. (1997); Levin & Narendra (1995); Narendra & Parthasarathy (1990); Sjöberg et al. (1994); Wang & Lin (1998) for examples). Recently, two notable but distinct families have been proposed. Physics-informed neural networks (PINNs) directly solve partial differential equations (PDEs) given as symbolic equations (Raissi et al., 2019). Neural ordinary differential equations (NODEs) learn ordinary differential equations (ODEs) from observed data and solve them using numerical integrators (Chen et al., 2018). Our focus this time is on NODEs. Most real-world systems are associated with first integrals. First integrals arise from intrinsic geometric structures of systems and are sometimes more important than superficial dynamics in understanding systems (see Appendix A for details).
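The projection method named in the FINDE abstract can be sketched on a system whose first integral is known. The example below is only the classical projection step applied to a harmonic oscillator with a hand-written energy function; FINDE itself finds the first integral from data with a neural network, which this sketch does not attempt. All function names are hypothetical.

```python
def energy(q, p):
    """First integral of the harmonic oscillator dq/dt = p, dp/dt = -q."""
    return 0.5 * (q * q + p * p)


def euler_step(q, p, dt):
    """Explicit Euler step; on its own it does not conserve the energy."""
    return q + dt * p, p - dt * q


def project(q, p, target, iters=5):
    """Move the state along the energy gradient back onto the level set."""
    for _ in range(iters):
        gq, gp = q, p  # gradient of energy with respect to (q, p)
        lam = (energy(q, p) - target) / (gq * gq + gp * gp)
        q, p = q - lam * gq, p - lam * gp
    return q, p


def simulate(steps, dt, use_projection):
    q, p = 1.0, 0.0
    target = energy(q, p)
    for _ in range(steps):
        q, p = euler_step(q, p, dt)
        if use_projection:
            q, p = project(q, p, target)
    return energy(q, p)


drift_plain = abs(simulate(1000, 0.01, False) - 0.5)
drift_projected = abs(simulate(1000, 0.01, True) - 0.5)
print(drift_plain, drift_projected)
```

Without projection the energy grows by a factor of (1 + dt^2) per step, while the projected trajectory stays on the initial energy level up to rounding error.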


Automated Cancer Subtyping via Vector Quantization Mutual Information Maximization

arXiv.org Artificial Intelligence

Cancer subtyping is crucial for understanding the nature of tumors and providing suitable therapy. However, existing labelling methods are medically controversial, and have driven the process of subtyping away from teaching signals. Moreover, cancer genetic expression profiles are high-dimensional, scarce, and have complicated dependence, thereby posing a serious challenge to existing subtyping models for outputting sensible clustering. In this study, we propose a novel clustering method for exploiting genetic expression profiles and distinguishing subtypes in an unsupervised manner. The proposed method adaptively learns categorical correspondence from latent representations of expression profiles to the subtypes output by the model. By maximizing the problem-agnostic mutual information between input expression profiles and output subtypes, our method can automatically decide a suitable number of subtypes. Through experiments, we demonstrate that our proposed method can refine existing controversial labels, and, by further medical analysis, this refinement is proven to have a high correlation with cancer survival rates.


Exploring Uncertainty Measures for Image-Caption Embedding-and-Retrieval Task

arXiv.org Machine Learning

With the wide development of black-box machine learning algorithms, particularly deep neural networks (DNNs), the practical demand for reliability assessment is rapidly rising. On the basis of the concept that 'Bayesian deep learning knows what it does not know,' the uncertainty of DNN outputs has been investigated as a reliability measure for classification and regression tasks. However, in the image-caption retrieval task, well-known samples are not always easy-to-retrieve samples. This study investigates two aspects of image-caption embedding-and-retrieval systems. On one hand, we quantify feature uncertainty by considering image-caption embedding as a regression task, and use it for model averaging, which can improve the retrieval performance. On the other hand, we further quantify posterior uncertainty by considering the retrieval as a classification task, and use it as a reliability measure, which can greatly improve the retrieval performance by rejecting uncertain queries. The consistent performance of the two uncertainty measures is observed with different datasets (MS COCO and Flickr30k), different deep learning architectures (dropout and batch normalization), and different similarity functions.


Deep Generative Model using Unregularized Score for Anomaly Detection with Heterogeneous Complexity

arXiv.org Machine Learning

Accurate and automated detection of anomalous samples in a natural image dataset can be accomplished with a probabilistic model for end-to-end modeling of images. Such images have heterogeneous complexity, however, and a probabilistic model overlooks simply shaped objects with small anomalies. This is because the probabilistic model assigns undesirably lower likelihoods to complexly shaped objects that are nevertheless consistent with set standards. To overcome this difficulty, we propose an unregularized score for deep generative models (DGMs), which are generative models leveraging deep neural networks. We found that the regularization terms of the DGMs considerably influence the anomaly score depending on the complexity of the samples. By removing these terms, we obtain an unregularized score, which we evaluated on a toy dataset and real-world manufacturing datasets. Empirical results demonstrate that the unregularized score is robust to the inherent complexity of samples and can be used to better detect anomalies.


Deep Neural Generative Model of Functional MRI Images for Psychiatric Disorder Diagnosis

arXiv.org Machine Learning

Accurate diagnosis of psychiatric disorders plays a critical role in improving quality of life for patients and potentially supports the development of new treatments. Many studies have been conducted on machine learning techniques that seek brain imaging data for specific biomarkers of disorders. These studies have encountered the following dilemma: An end-to-end classification overfits to a small number of high-dimensional samples but unsupervised feature-extraction has the risk of extracting a signal of no interest. In addition, such studies often provided only diagnoses for patients without presenting the reasons for these diagnoses. This study proposed a deep neural generative model of resting-state functional magnetic resonance imaging (fMRI) data. The proposed model is conditioned by the assumption of the subject's state and estimates the posterior probability of the subject's state given the imaging data, using Bayes' rule. This study applied the proposed model to diagnose schizophrenia and bipolar disorders. Diagnosis accuracy was improved by a large margin over competitive approaches, namely a support vector machine, logistic regression, and multilayer perceptron with or without unsupervised feature-extractors in addition to a Gaussian mixture model. The proposed model visualizes brain regions largely related to the disorders, thus motivating further biological investigation.
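The Bayes' rule step described in the abstract above can be sketched as follows. The sketch assumes that a generative model has already produced one log-likelihood of the imaging data per candidate subject state; the numbers, the two-state setup, and the function name `posterior` are made up for illustration and are not taken from the paper.

```python
import math


def posterior(log_likelihoods, priors):
    """Posterior over states from per-state log-likelihoods and prior probabilities."""
    log_joint = [ll + math.log(pr) for ll, pr in zip(log_likelihoods, priors)]
    m = max(log_joint)  # subtract the maximum for numerical stability
    weights = [math.exp(v - m) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log-likelihoods under the states "control" and "patient".
probs = posterior([-1050.0, -1047.0], [0.5, 0.5])
print(probs)
```

Working in log space matters here because likelihoods of high-dimensional data underflow when exponentiated directly; only their differences enter the posterior.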