Maruhashi, Koji
A Latent Diffusion Model for Protein Structure Generation
Fu, Cong, Yan, Keqiang, Wang, Limei, Au, Wing Yee, McThrow, Michael, Komikado, Tao, Maruhashi, Koji, Uchino, Kanji, Qian, Xiaoning, Ji, Shuiwang
Proteins are complex biomolecules that perform a variety of crucial functions within living organisms. Designing and generating novel proteins can pave the way for many future synthetic biology applications, including drug discovery. However, it remains a challenging computational task due to the large modeling space of protein structures. In this study, we propose a latent diffusion model that can reduce the complexity of protein modeling while flexibly capturing the distribution of natural protein structures in a condensed latent space. Specifically, we propose an equivariant protein autoencoder that embeds proteins into a latent space, and we then use an equivariant diffusion model to learn the distribution of the latent protein representations. Experimental results demonstrate that our method can effectively generate novel protein backbone structures with high designability and efficiency.
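The encode-then-diffuse recipe the abstract describes can be sketched in a few lines. The following is a hedged toy sketch in numpy, not the paper's equivariant model: the "autoencoder" is a hypothetical random orthonormal projection, and the forward-noising process is the standard DDPM schedule of Ho et al. (2020) applied in the latent space.

```python
import numpy as np

# Toy latent-diffusion sketch (illustrative assumptions, not the paper's model):
# encode a stand-in "protein" vector into a latent space, then apply the
# standard DDPM forward-noising process q(z_t | z_0) in that latent space.

rng = np.random.default_rng(0)

# Hypothetical linear "autoencoder": a random orthonormal projection W.
d_in, d_latent = 30, 8
W, _ = np.linalg.qr(rng.normal(size=(d_in, d_latent)))  # orthonormal columns

def encode(x):
    return x @ W

def decode(z):
    return z @ W.T

# Linear noise schedule, as in Ho et al. (2020).
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def q_sample(z0, t):
    """Sample z_t ~ N(sqrt(abar_t) * z_0, (1 - abar_t) * I)."""
    eps = rng.normal(size=z0.shape)
    return np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x = rng.normal(size=d_in)   # stand-in for flattened backbone coordinates
z0 = encode(x)
zT = q_sample(z0, T - 1)    # close to pure Gaussian noise at the last step
print(z0.shape, zT.shape)
```

A generative model would then be trained to reverse this noising process in the latent space and the decoder would map sampled latents back to structures; the sketch only shows why the latent space, not the full structure space, is what the diffusion model has to cover.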
Learning Large Causal Structures from Inverse Covariance Matrix via Matrix Decomposition
Dong, Shuyu, Uemura, Kento, Fujii, Akito, Chang, Shuang, Koyanagi, Yusuke, Maruhashi, Koji, Sebag, Michèle
Learning causal structures from observational data is a fundamental yet highly complex problem when the number of variables is large. In this paper, we start from linear structural equation models (SEMs) and investigate ways of learning causal structures from the inverse covariance matrix. The proposed method, called $\mathcal{O}$-ICID (for {\it Independence-preserving} Decomposition from Oracle Inverse Covariance matrix), is based on continuous optimization of a type of matrix decomposition that preserves the nonzero patterns of the inverse covariance matrix. We show that $\mathcal{O}$-ICID provides an efficient way of identifying the true directed acyclic graph (DAG) when the noise variances are known. With weaker prior information, the proposed method gives directed graph solutions that serve as useful starting points for more refined causal discovery. The proposed method enjoys low complexity when the true DAG has bounded node degrees, as reflected by its time efficiency in experiments in comparison with state-of-the-art algorithms.
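The connection between linear SEMs and the inverse covariance matrix that such a decomposition can exploit is a closed-form identity, which a small numpy check makes concrete. This is an illustrative sketch of the identity only, not the $\mathcal{O}$-ICID algorithm: for $X = B^\top X + e$ with independent noises of variances $D$, the precision matrix factors as $\Theta = (I - B) D^{-1} (I - B)^\top$.

```python
import numpy as np

# Sketch of the identity behind inverse-covariance-based causal discovery
# (not the O-ICID optimization itself): for a linear SEM X = B^T X + e with
# independent noise variances D, Theta = (I - B) D^{-1} (I - B)^T.

rng = np.random.default_rng(1)
d = 5

# A small random DAG: strictly upper-triangular weighted adjacency matrix B.
B = np.triu(rng.normal(size=(d, d)), k=1) * (rng.random((d, d)) < 0.5)
D = np.diag(rng.uniform(0.5, 2.0, size=d))   # noise variances

I = np.eye(d)
Sigma = np.linalg.inv(I - B).T @ D @ np.linalg.inv(I - B)  # model covariance
Theta = np.linalg.inv(Sigma)                               # precision matrix

# The factorized form reproduces the inverse covariance exactly.
Theta_factored = (I - B) @ np.linalg.inv(D) @ (I - B).T
print(np.allclose(Theta, Theta_factored))
```

Because the right-hand side is written directly in terms of the DAG weights $B$ and the noise variances, decomposing $\Theta$ while preserving its support is a route back to the causal structure, which is the setting the abstract describes.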
Automated Data Augmentations for Graph Classification
Luo, Youzhi, McThrow, Michael, Au, Wing Yee, Komikado, Tao, Uchino, Kanji, Maruhashi, Koji, Ji, Shuiwang
Data augmentations are effective in improving the invariance of learning machines. We argue that the core challenge of data augmentations lies in designing data transformations that preserve labels. This is relatively straightforward for images, but much more challenging for graphs. In this work, we propose GraphAug, a novel automated data augmentation method that aims to compute label-invariant augmentations for graph classification. Instead of using uniform transformations as in existing studies, GraphAug uses an automated augmentation model to avoid compromising critical label-related information of the graph, thereby producing label-invariant augmentations in most cases. To ensure label invariance, we develop a training method based on reinforcement learning to maximize an estimated label-invariance probability. Experiments show that GraphAug outperforms previous graph augmentation methods on various graph classification tasks.
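The contrast the abstract draws between uniform and learned transformations can be illustrated with a deliberately simple stand-in. In this hedged sketch (not the paper's model), the "label" is carried by one hypothetical critical node, and an augmentation policy that assigns it a low removal probability preserves the label far more reliably than uniform node dropping.

```python
import random

# Illustrative sketch of uniform vs. label-aware node dropping
# (toy stand-in for GraphAug's learned augmentation model).

random.seed(0)
nodes = list(range(10))
critical = {3}   # hypothetical node carrying the label-relevant structure

def drop_nodes(nodes, p_drop):
    """Drop each node independently with probability p_drop(v)."""
    return {v for v in nodes if random.random() >= p_drop(v)}

def uniform(v):          # uniform transformation, as in prior studies
    return 0.3

def learned(v):          # policy that spares the critical node
    return 0.0 if v in critical else 0.3

trials = 2000
keep_uniform = sum(critical <= drop_nodes(nodes, uniform) for _ in range(trials))
keep_learned = sum(critical <= drop_nodes(nodes, learned) for _ in range(trials))
print(keep_uniform / trials, keep_learned / trials)
```

In GraphAug the removal probabilities are produced by a trained model and the label-invariance probability is estimated and maximized via reinforcement learning rather than hard-coded as here.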
Crowdsourcing Evaluation of Saliency-based XAI Methods
Lu, Xiaotian, Tolmachev, Arseny, Yamamoto, Tatsuya, Takeuchi, Koh, Okajima, Seiji, Takebayashi, Tomoyoshi, Maruhashi, Koji, Kashima, Hisashi
Understanding the reasons behind the predictions made by deep neural networks is critical for gaining human trust in many important applications, which is reflected in the increasing demand for explainable AI (XAI) in recent years. Saliency-based feature attribution methods, which highlight important parts of images that contribute to decisions by classifiers, are often used as XAI methods, especially in the field of computer vision. In order to compare various saliency-based XAI methods quantitatively, several automated evaluation schemes have been proposed; however, there is no guarantee that such automated evaluation metrics correctly evaluate explainability, and a high rating by an automated evaluation scheme does not necessarily mean high explainability for humans. In this study, instead of automated evaluation, we propose a new human-based evaluation scheme that uses crowdsourcing to evaluate XAI methods. Our method is inspired by a human computation game, "Peek-a-boom", and can efficiently compare different XAI methods by exploiting the power of crowds. We evaluate the saliency maps of various XAI methods on two datasets with automated and crowd-based evaluation schemes. Our experiments show that the results of our crowd-based evaluation scheme differ from those of automated evaluation schemes. In addition, we regard the crowd-based evaluation results as ground truths and provide a quantitative performance measure to compare different automated evaluation schemes. We also discuss the impact of crowd workers on the results and show that the varying ability of crowd workers does not significantly impact the results.
Bermuda Triangles: GNNs Fail to Detect Simple Topological Structures
Tolmachev, Arseny, Sakai, Akira, Todoriki, Masaru, Maruhashi, Koji
Most graph neural network architectures work by message-passing node vector embeddings over the adjacency matrix, and it is assumed that they capture graph topology by doing so. We design two synthetic tasks, focusing purely on topological problems - triangle detection and clique distance - on which graph neural networks perform surprisingly badly, failing to detect those "bermuda" triangles. Many tasks need to handle the graph representation of data in areas such as chemistry (Wale & Karypis, 2006), social networks (Fan et al., 2019), and transportation (Zhao et al., 2019). Furthermore, this is not limited to such graph tasks: images (Chen et al., 2019) and 3D polygons (Shi & Rajkumar, 2020) can also be converted to graph data formats. Because of these broad applications, graph deep learning is an important field in machine learning research. Graph neural networks (GNNs; Scarselli et al., 2008) are a common approach to performing machine learning with graphs. Most graph neural networks update the graph node vector embeddings using message passing. Node vector embeddings are usually initialized with data features and local graph features such as node degrees. Then, for the (n+1)-th stacked layer, the new node state is computed from the node vector representations of the previous layer (n).

Accuracy (%) on the two synthetic tasks:

Method         | Triangles | Clique
GCN            | 50.0      | 50.0
GCN D          | 75.7      | 83.2
GCN D ID       | 80.4      | 83.4
GIN            | 74.1      | 97.0
GIN D          | 75.0      | 99.4
GIN D ID       | 70.5      | 100.0
GAT            | 50.0      | 50.0
GAT D          | 88.5      | 99.9
GAT D ID       | 94.1      | 100.0
SVM WL         | 67.2      | 73.1
SVM Graphlets  | 99.6      | 60.3
FCNN           | 55.6      | 54.6
TF             | 100.0     | 70.0
TF AM          | 100.0     | 100.0
TF-IS AM       | 86.7      | 100.0
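What makes the poor GNN results striking is that triangle detection is trivial given direct access to powers of the adjacency matrix: $\mathrm{tr}(A^3)/6$ counts triangles exactly. A short numpy check (illustrative, not from the paper) makes the point; message-passing architectures see $A$ only through neighborhood aggregation, which is one reason this signal can be hard for them to extract.

```python
import numpy as np

# Closed-form triangle counting from the adjacency matrix: each triangle
# contributes six closed walks of length 3, so trace(A^3) / 6 is exact.

def count_triangles(A):
    """Count triangles in an undirected graph given its adjacency matrix."""
    return int(round(np.trace(np.linalg.matrix_power(A, 3)) / 6))

# A 4-cycle (bipartite, so no triangles) vs. a 4-clique (four triangles).
cycle4 = np.array([[0, 1, 0, 1],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [1, 0, 1, 0]])
k4 = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)

print(count_triangles(cycle4), count_triangles(k4))  # → 0 4
```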
Linear Tensor Projection Revealing Nonlinearity
Maruhashi, Koji, Park, Heewon, Yamaguchi, Rui, Miyano, Satoru
Dimensionality reduction is an effective method for learning from high-dimensional data, as it can provide a better understanding of decision boundaries in a human-readable low-dimensional subspace. Linear methods, such as principal component analysis and linear discriminant analysis, make it possible to capture the correlation between many variables; however, there is no guarantee that the correlations that are important in predicting data can be captured. Moreover, if the decision boundary is strongly nonlinear, capturing such correlations becomes even more difficult. This problem is exacerbated when the data are matrices or tensors that represent relationships between variables. We propose a learning method that searches for a subspace that maximizes the prediction accuracy while retaining as much of the original data information as possible, even if the prediction model in the subspace has strong nonlinearity. This makes it easier to interpret the mechanism of the group of variables behind the prediction problem that the user wants to know. We show the effectiveness of our method by applying it to various types of data including matrices and tensors.
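The failure mode the abstract points at - an unsupervised linear method missing the predictive direction - is easy to reproduce on toy data. In this hedged illustration (not the proposed method), the label depends nonlinearly on a low-variance axis, so PCA's variance-first ranking puts the nuisance axis ahead of the discriminative one.

```python
import numpy as np

# Toy illustration of why unsupervised projection can miss the predictive
# subspace: the discriminative axis carries less variance than a nuisance
# axis, and the decision boundary on it is nonlinear (two-sided).

rng = np.random.default_rng(3)
n = 500
signal = rng.normal(scale=1.0, size=n)     # low-variance, label-relevant axis
noise = rng.normal(scale=5.0, size=n)      # high-variance nuisance axis
y = (np.abs(signal) > 1.0).astype(int)     # nonlinear boundary on the signal
X = np.column_stack([signal, noise])

# PCA via SVD: the top principal direction aligns with the noise axis.
_, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
top = Vt[0]
print(abs(top[1]) > abs(top[0]))   # variance-first ranking picks the noise axis
```

A method that jointly optimizes prediction accuracy and retained information, as the abstract proposes, would instead favor the signal axis even though its variance is small.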
Learning Multi-Way Relations via Tensor Decomposition With Neural Networks
Maruhashi, Koji (Fujitsu Laboratories Ltd.) | Todoriki, Masaru (Fujitsu Laboratories Ltd.) | Ohwa, Takuya (Fujitsu Laboratories Ltd.) | Goto, Keisuke (Fujitsu Laboratories Ltd.) | Hasegawa, Yu (Fujitsu Laboratories Ltd.) | Inakoshi, Hiroya (Fujitsu Laboratories Ltd.) | Anai, Hirokazu (Fujitsu Laboratories Ltd.)
How can we classify multi-way data such as network traffic logs with multi-way relations between source IPs, destination IPs, and ports? Multi-way data can be represented as a tensor, and there have been several studies on classification of tensors to date. One critical issue in the classification of multi-way relations is how to extract important features for classification when objects in different multi-way data, i.e., in different tensors, are not necessarily in correspondence. In such situations, we aim to extract features that do not depend on how we allocate indices to an object such as a specific source IP; we are interested in only the structures of the multi-way relations. However, this issue has not been considered in previous studies on classification of multi-way data. We propose a novel method which can learn and classify multi-way data using neural networks. Our method leverages a novel type of tensor decomposition that utilizes a target core tensor expressing the important features whose indices are independent of those of the multi-way data. The target core tensor guides the tensor decomposition into more effective results and is optimized in a supervised manner. Our experiments on three different domains show that our method is highly accurate, especially on higher order data. It also enables us to interpret the classification results along with the matrices calculated with the novel tensor decomposition.
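The index-independence property the abstract emphasizes can be illustrated with a Tucker-style factorization. This is a hedged sketch of the idea only, not the paper's supervised decomposition: relabeling objects (permuting tensor indices along one mode) is absorbed entirely by the factor matrices, so a shared core tensor - the "structure" of the multi-way relation - is unchanged.

```python
import numpy as np

# Illustrative Tucker-style factorization X = G x1 U1 x2 U2 x3 U3, showing
# that permuting object indices only permutes a factor matrix, leaving the
# core tensor G (the index-independent structure) intact.

rng = np.random.default_rng(2)
G = rng.normal(size=(2, 2, 2))                     # shared core tensor
U = [rng.normal(size=(5, 2)) for _ in range(3)]    # per-mode factor matrices

def tucker(G, U):
    """Reconstruct the full tensor from core G and factor matrices U."""
    return np.einsum('abc,ia,jb,kc->ijk', G, U[0], U[1], U[2])

X = tucker(G, U)

# Relabel the objects along mode 0 (e.g. reassign source-IP indices).
perm = rng.permutation(5)
P = np.eye(5)[perm]                                # permutation matrix
X_perm = X[perm]                                   # permuted data tensor

# The same core G reproduces the permuted tensor with a permuted factor.
print(np.allclose(X_perm, tucker(G, [P @ U[0], U[1], U[2]])))
```

In the paper's setting the target core tensor plays the role of $G$: it is optimized in a supervised manner so that classification depends on the relational structure rather than on how indices happen to be allocated to objects.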