Asia
Implementation of a Practical Distributed Calculation System with Browsers and JavaScript, and Application to Distributed Deep Learning
Deep learning can achieve outstanding results in various fields. However, it requires such significant computational power that graphics processing units (GPUs) and/or numerous computers are often required for practical application. We have developed a new distributed calculation framework called "Sashimi" that allows any computer to be used as a distribution node simply by accessing a website. We have also developed a new JavaScript neural network framework called "Sukiyaki" that uses general purpose GPUs with web browsers. Sukiyaki performs 30 times faster than a conventional JavaScript library for deep convolutional neural networks (deep CNNs) learning. The combination of Sashimi and Sukiyaki, as well as new distribution algorithms, demonstrates the distributed deep learning of deep CNNs using only web browsers on various devices. The libraries that comprise the proposed methods are available under the MIT license at http://mil-tokyo.github.io/.
Nonparametric Nearest Neighbor Descent Clustering based on Delaunay Triangulation
Abstract: In our physically inspired in-tree (IT) based clustering algorithm and the series after it, there is only one free parameter involved in computing the potential value of each point. In this work, based on the Delaunay Triangulation or its dual Voronoi tessellation, we propose a nonparametric process to compute potential values from local information. This computation, though nonparametric, is relatively rough, and consequently, many local extreme points will be generated. However, unlike gradient-based methods, our IT-based methods are generally insensitive to those local extremes. This positively demonstrates the superiority of these parametric (previous) and nonparametric (in this work) IT-based methods.
1 Introduction
In (1), we proposed a physically inspired clustering algorithm, in which an in-tree (IT) structure was first constructed. This IT structure organizes the data points into clusters, with several undesired connections (edges) between them that need to be removed.
IT-map: an Effective Nonlinear Dimensionality Reduction Method for Interactive Clustering
In our previous works (1, 2), we have shown its potential in cluster analysis. Combinations of the IT structure with the Semi-Supervised learning concept (3), Rodriguez and Laio's "Decision Graph" (4), and Frey and Dueck's "Affinity Propagation" (AP) (5), have resulted in effective cluster analysis methods. For example, based on the IT structure, the application scope of AP was extended from spherical to nonspherical cluster detection (2). In this paper, we will show another potential of the IT structure: nonlinear dimensionality reduction, for which an effective combination is made with the "isometric mapping" (Isomap) proposed by Tenenbaum et al. (6). Isomap is a simple and effective dimensionality reduction method which extends the application scope of multidimensional scaling (MDS) from linear to nonlinear structure. It contains three steps: first construct the K-nearest-neighborhood (KNN) graph, then compute the graph distances (the shortest path distances in the neighborhood graph), and lastly compute the low-dimensional embedding by classical MDS. In effect, the constructed KNN graph for data points is unfolded in the low-dimensional Euclidean space, which is especially effective for preserving in the embedding the topological relationship of data points on manifolds. The crux of the success of Isomap is that it takes as the input for classical MDS the graph distances, instead of the straight-line Euclidean ones, for all pairs of data points.
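The three Isomap steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function name `isomap`, its parameters, and the use of Floyd-Warshall and power iteration are choices made here for brevity. It assumes the KNN graph is connected and that the leading eigenvalues of the centered matrix are positive.

```python
import math

def isomap(points, k, dim=2, iters=500):
    """Sketch of Isomap: KNN graph -> graph distances -> classical MDS."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    inf = float("inf")

    # Step 1: symmetric k-nearest-neighbor graph weighted by Euclidean distance.
    graph = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in others[:k]:
            graph[i][j] = graph[j][i] = dist[i][j]

    # Step 2: all-pairs shortest paths (Floyd-Warshall) give the graph distances.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via

    # Step 3: classical MDS on the graph distances.
    # Double-center the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[graph[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    B = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    # Leading eigenpairs of B by power iteration with deflation.
    coords = [[] for _ in range(n)]
    for d in range(dim):
        v = [math.sin(i + 1.0 + d) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i].append(scale * v[i])
        for i in range(n):  # deflate so the next pass finds the next eigenvector
            for j in range(n):
                B[i][j] -= lam * v[i] * v[j]
    return coords
```

For points sampled along a half circle, a one-dimensional embedding with `k=2` separates the two end points by roughly the arc length rather than by the straight-line chord, which is the "unfolding" effect the text refers to.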
Estimation of Mental Time by Analysis of Tenses During Conversation
Onoda, Keisuke (Chiba University) | Otake, Mihoko (Chiba University, Japan Science and Technology Agency)
The increase in dementia patients is one of the problems caused by the aging population, not only in Japan but in many developed countries. As a cognitive enhancement method for the prevention of dementia, the coimagination method has been proposed: a designed group conversation whose themes are selected from recent topics in order to train recent episodic memory functions, since these functions decline before the onset of dementia. It is known that people who do not use particular cognitive functions have a higher risk of losing those functions. However, participants in conversations supported by the coimagination method sometimes refer to past topics rather than recent ones. Effective intervention requires a method for analyzing whether the topics concern recent or past events, but no such method has been established. The purpose of this study is to propose a method for analyzing the temporal characteristics of topics during conversation. Mental time travel, or chronesthesia, is the ability to be aware of one's present, past, or future, which has evolved in humans in particular. In order to estimate the mental time of the speaker from topics during conversation supported by the coimagination method, we propose a mental time estimation method that analyzes tenses and senses. We applied the method to scripts of conversations supported by the coimagination method. The results suggest that it is possible to estimate mental time from analysis of the topics in conversation. The ratio of references to the past, present, and future was computed for each speaker. Individual differences in these tendencies were demonstrated as the ratios.
Interactive Restless Multi-armed Bandit Game and Swarm Intelligence Effect
Yoshida, Shunsuke, Hisakado, Masato, Mori, Shintaro
We obtain the conditions for the emergence of the swarm intelligence effect in an interactive game of restless multi-armed bandit (rMAB). A player competes with multiple agents. Each bandit has a payoff that changes with a probability $p_{c}$ per round. The agents and player choose one of three options: (1) Exploit (a good bandit), (2) Innovate (asocial learning for a good bandit among $n_{I}$ randomly chosen bandits), and (3) Observe (social learning for a good bandit). Each agent has two parameters $(c,p_{obs})$ to specify the decision: (i) $c$, the threshold value for Exploit, and (ii) $p_{obs}$, the probability for Observe in learning. The parameters $(c,p_{obs})$ are uniformly distributed. We determine the optimal strategies for the player using complete knowledge about the rMAB. We show whether or not social or asocial learning is more optimal in the $(p_{c},n_{I})$ space and define the swarm intelligence effect. We conduct a laboratory experiment (67 subjects) and observe the swarm intelligence effect only if $(p_{c},n_{I})$ are chosen so that social learning is far more optimal than asocial learning.
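One way to read the agent rule and the payoff dynamics described above is sketched below. The abstract does not spell out the details, so several points are assumptions made here for illustration: an agent Exploits when the payoff of its known bandit is at least $c$, otherwise it Observes with probability $p_{obs}$ and Innovates otherwise, and a changed payoff is redrawn uniformly from [0, 1). The function names are hypothetical.

```python
import random

def choose_action(known_payoff, c, p_obs, rng=random):
    """Pick one of the three options for a single agent in a single round.

    Assumption: 'Exploit' is chosen when the known payoff reaches the
    threshold c; otherwise the agent learns, socially with probability p_obs.
    """
    if known_payoff is not None and known_payoff >= c:
        return "exploit"
    return "observe" if rng.random() < p_obs else "innovate"

def update_payoffs(payoffs, p_c, rng=random):
    """Each bandit's payoff changes with probability p_c per round.

    Assumption: a changed payoff is redrawn uniformly from [0, 1).
    """
    return [rng.random() if rng.random() < p_c else s for s in payoffs]
```

With `p_obs=1.0` an agent below its threshold always uses social learning, and with `p_obs=0.0` it always uses asocial learning, which are the two extremes compared in the $(p_{c},n_{I})$ space.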
Directed Information Graphs
Quinn, Christopher J., Kiyavash, Negar, Coleman, Todd P.
We propose a graphical model for representing networks of stochastic processes, the minimal generative model graph. It is based on reduced factorizations of the joint distribution over time. We show that under appropriate conditions, it is unique and consistent with another type of graphical model, the directed information graph, which is based on a generalization of Granger causality. We demonstrate how directed information quantifies Granger causality in a particular sequential prediction setting. We also develop efficient methods to estimate the topological structure from data that obviate estimating the joint statistics. One algorithm assumes upper-bounds on the degrees and uses the minimal dimension statistics necessary. In the event that the upper-bounds are not valid, the resulting graph is nonetheless an optimal approximation. Another algorithm uses near-minimal dimension statistics when no bounds are known but the distribution satisfies a certain criterion. Analogous to how structure learning algorithms for undirected graphical models use mutual information estimates, these algorithms use directed information estimates. We characterize the sample-complexity of two plug-in directed information estimators and obtain confidence intervals. For the setting when point estimates are unreliable, we propose an algorithm that uses confidence intervals to identify the best approximation that is robust to estimation error. Lastly, we demonstrate the effectiveness of the proposed algorithms through analysis of both synthetic data and real data from the Twitter network. In the latter case, we identify which news sources influence users in the network by merely analyzing tweet times.
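For reference, the standard (Massey) definition of directed information between two length-$n$ sequences is given below. The abstract itself does not state a formula, so the notation here is an assumption, and the paper's exact indexing and its causally conditioned variant for networks of more than two processes may differ.

```latex
% Directed information from X^n to Y^n (Massey's definition).
% X^t denotes (X_1, ..., X_t); Y^{t-1} denotes (Y_1, ..., Y_{t-1}).
I(X^n \to Y^n) \;=\; \sum_{t=1}^{n} I\!\left(X^{t};\, Y_t \,\middle|\, Y^{t-1}\right)
```

Unlike mutual information, this quantity is not symmetric in $X$ and $Y$, which is what allows it to indicate a direction of influence.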
Principal Sensitivity Analysis
Koyamada, Sotetsu, Koyama, Masanori, Nakae, Ken, Ishii, Shin
We present a novel algorithm (Principal Sensitivity Analysis; PSA) to analyze the knowledge of the classifier obtained from supervised machine learning techniques. In particular, we define the principal sensitivity map (PSM) as the direction in the input space to which the trained classifier is most sensitive, and use the analogously defined k-th PSMs to define a basis for the input space. We train neural networks with artificial data and real data, and apply the algorithm to the obtained supervised classifiers. We then visualize the PSMs to demonstrate the PSA's ability to decompose the knowledge acquired by the trained classifiers.
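A rough sketch of how a "most sensitive direction" could be computed is shown below. It assumes, as one plausible reading of the definition above, that the PSM is the dominant eigenvector of the sample average of the gradient outer products of the classifier output. The names `toy_classifier`, `gradient`, and `principal_sensitivity_map` are hypothetical, and the toy logistic model stands in for a trained network.

```python
import math

def toy_classifier(x):
    """Stand-in for a trained classifier: a logistic unit with weights (3, 4, 0)."""
    z = 3.0 * x[0] + 4.0 * x[1] + 0.0 * x[2]
    return 1.0 / (1.0 + math.exp(-z))

def gradient(f, x, eps=1e-5):
    """Central-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad

def principal_sensitivity_map(f, samples, iters=200):
    """Dominant eigenvector of K = mean over samples of grad f(x) grad f(x)^T."""
    d = len(samples[0])
    K = [[0.0] * d for _ in range(d)]
    for x in samples:
        g = gradient(f, x)
        for i in range(d):
            for j in range(d):
                K[i][j] += g[i] * g[j] / len(samples)
    # Power iteration for the leading eigenvector of the symmetric matrix K.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(K[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            break
        v = [t / norm for t in w]
    return v
```

For the toy classifier every gradient is a positive multiple of (3, 4, 0), so the returned direction is that vector normalized to unit length; the third input, which the classifier ignores, gets a zero component.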
Quantum Structure in Cognition, Origins, Developments, Successes and Expectations
Aerts, Diederik, Sozzo, Sandro
We provide an overview of the results we have attained in the last decade on the identification of quantum structures in cognition and, more specifically, in the formalization and representation of natural concepts. We first discuss the quantum foundational reasons that led us to investigate the mechanisms of formation and combination of concepts in human reasoning, starting from the empirically observed deviations from classical logical and probabilistic structures. We then develop our quantum-theoretic perspective in Fock space, which allows successful modeling of various sets of cognitive experiments collected by different scientists, including ourselves. In addition, we formulate a unified explanatory hypothesis for the presence of quantum structures in cognitive processes, and discuss our recent discovery of further quantum aspects in concept combinations, namely, 'entanglement' and 'indistinguishability'. We finally illustrate perspectives for future research.
L_1-regularized Boltzmann machine learning using majorizer minimization
We propose an inference method to estimate sparse interactions and biases according to Boltzmann machine learning. The basis of this method is $L_1$ regularization, which is often used in compressed sensing, a technique for reconstructing sparse input signals from undersampled outputs. $L_1$ regularization impedes the simple application of the gradient method, which optimizes the cost function that leads to accurate estimations, owing to the cost function's lack of smoothness. In this study, we utilize the majorizer minimization method, which is a well-known technique implemented in optimization problems, to avoid the non-smoothness of the cost function. By using the majorizer minimization method, we elucidate essentially relevant biases and interactions from given data with seemingly strongly-correlated components.
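To illustrate how a majorize-minimize step sidesteps the non-smoothness of an $L_1$ term, the sketch below applies the idea to a plain least-squares cost instead of the Boltzmann machine likelihood, which is a deliberate simplification made here. The smooth part is bounded above by a quadratic with curvature `L`, and minimizing that bound plus the $L_1$ penalty gives a closed-form soft-thresholding update. The function names and the parameter `L` (assumed to be at least the largest eigenvalue of $A^{\top}A$) are illustrative.

```python
def soft_threshold(z, t):
    """Closed-form minimizer of 0.5*(x - z)**2 + t*abs(x) over x."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def mm_l1_least_squares(A, y, lam, L, iters=500):
    """Minimize 0.5*||A x - y||^2 + lam*||x||_1 by majorize-minimize steps.

    L must upper-bound the curvature of the smooth term so that the
    quadratic surrogate really majorizes it.
    """
    n = len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual and gradient of the smooth term at the current x.
        r = [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]
        grad = [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n)]
        # Minimizing the surrogate reduces to a soft-thresholded gradient step.
        x = [soft_threshold(xj - gj / L, lam / L) for xj, gj in zip(x, grad)]
    return x
```

Small coefficients are set exactly to zero by the thresholding, which is how the $L_1$ penalty produces sparse estimates.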
Minimax Optimal Rates of Estimation in High Dimensional Additive Models: Universal Phase Transition
Our results reveal an interesting phase transition behavior universal to this class of high dimensional problems. In the sparse regime when the components are sufficiently smooth or the dimensionality is sufficiently large, the optimal rates are identical to those for high dimensional linear regression, and therefore there is no additional cost to entertain a nonparametric model. Otherwise, in the so-called smooth regime, the rates coincide with the optimal rates for estimating a univariate function, and therefore they are immune to the "curse of dimensionality". Key words: Convergence rate, method of regularization, minimax optimality, phase transition, reproducing kernel Hilbert space, Sobolev space.
1 Introduction
With the recent advances in science and technology, high dimensional regression problems have become ubiquitous in a multitude of areas: genomics, medical imaging, and finance are a few well known examples. A considerable amount of research effort has been devoted to understanding the challenges brought about by high dimensionality, and to developing statistical methodology to counter them.