rwr
Universal Multilayer Network Exploration by Random Walk with Restart
Baptista, Anthony, Gonzalez, Aitor, Baudot, Anaïs
The amount and variety of data have been increasing drastically for several years. These data are often represented as networks, which are then explored with approaches arising from network theory. Recent years have witnessed the extension of network exploration methods to leverage more complex and richer network frameworks. Random walks, for instance, have been extended to explore multilayer networks. However, current random walk approaches are limited in the combination and heterogeneity of network layers they can handle. New analytical and numerical random walk methods are needed to cope with the increasing diversity and complexity of multilayer networks. We propose here MultiXrank, a Python package that enables Random Walk with Restart (RWR) on any kind of multilayer network with an optimized implementation. This package is supported by a universal mathematical formulation of the RWR. We evaluated MultiXrank with leave-one-out cross-validation and link prediction, and introduced protocols to measure the impact of the addition or removal of multilayer network data on prediction performance. We further measured the sensitivity of MultiXrank to input parameters by in-depth exploration of the parameter space. Finally, we illustrate the versatility of MultiXrank with different use-cases of unsupervised node prioritization and supervised classification in the context of human genetic diseases.
- North America > United States (0.14)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (2 more...)
- Information Technology > Information Management > Search (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Communications > Networks (1.00)
- (2 more...)
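As a rough illustration of the Random Walk with Restart idea named in the abstract above, the following is a minimal single-layer sketch using power iteration. It is not the MultiXrank API and does not model multilayer networks; the function name, the toy graph, and the restart probability of 0.7 are assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Power iteration for RWR on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours
               (assumed: every node has at least one neighbour).
    seeds:     set of nodes the walker jumps back to on restart.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker returns to a seed ...
        nxt = {n: restart * p0[n] for n in nodes}
        # ... otherwise it moves to a uniformly chosen neighbour.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy graph: a triangle (a, b, c) with a pendant node d on c.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
scores = random_walk_with_restart(graph, seeds={"a"})
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 4))
```

The resulting scores form a probability distribution over nodes; nodes closer to the seed receive more mass, which is the basis for node prioritization.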
Reward-Weighted Regression Converges to a Global Optimum
Štrupl, Miroslav, Faccio, Francesco, Ashley, Dylan R., Srivastava, Rupesh Kumar, Schmidhuber, Jürgen
Reward-Weighted Regression (RWR) belongs to a family of widely known iterative Reinforcement Learning algorithms based on the Expectation-Maximization framework. In this family, learning at each iteration consists of sampling a batch of trajectories using the current policy and fitting a new policy to maximize a return-weighted log-likelihood of actions. Although RWR is known to yield monotonic improvement of the policy under certain circumstances, whether and under which conditions RWR converges to the optimal policy have remained open questions. In this paper, we provide for the first time a proof that RWR converges to a global optimum when no function approximation is used.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- Europe > Switzerland (0.04)
- North America > United States > Washington > King County > Bellevue (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.84)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
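To make the iteration described in the abstract above concrete, here is a minimal sketch for the simplest possible case: a single-state problem (a bandit) with strictly positive rewards and a tabular policy. Under those assumptions, the policy that maximizes the reward-weighted log-likelihood has the closed form new_policy(a) ∝ policy(a) · reward(a), so no sampling is needed. The action names and reward values are hypothetical, and this is not the general setting treated in the paper.

```python
def rwr_step(policy, rewards):
    """One reward-weighted regression update for a tabular bandit policy.

    Assumes all rewards are strictly positive. The maximizer of
    sum_a policy(a) * reward(a) * log new_policy(a) is proportional
    to policy(a) * reward(a).
    """
    weighted = {a: policy[a] * rewards[a] for a in policy}
    total = sum(weighted.values())
    return {a: w / total for a, w in weighted.items()}


def expected_reward(policy, rewards):
    return sum(policy[a] * rewards[a] for a in policy)


# Hypothetical three-armed bandit with deterministic positive rewards.
rewards = {"left": 1.0, "middle": 2.0, "right": 4.0}
policy = {a: 1.0 / len(rewards) for a in rewards}  # start from a uniform policy

history = [expected_reward(policy, rewards)]
for _ in range(50):
    policy = rwr_step(policy, rewards)
    history.append(expected_reward(policy, rewards))

print(policy)
print(history[0], history[-1])
```

In this toy case the expected reward never decreases from one iteration to the next, and the policy concentrates on the best action.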
Hierarchical Policy Search via Return-Weighted Density Estimation
Osa, Takayuki (University of Tokyo / RIKEN) | Sugiyama, Masashi (RIKEN / University of Tokyo)
Learning an optimal policy from a multi-modal reward function is a challenging problem in reinforcement learning (RL). Hierarchical RL (HRL) tackles this problem by learning a hierarchical policy, where multiple option policies are in charge of different strategies corresponding to modes of a reward function and a gating policy selects the best option for a given context. Although HRL has been demonstrated to be promising, current state-of-the-art methods still cannot perform well in complex real-world problems due to the difficulty of identifying modes of the reward function. In this paper, we propose a novel method called hierarchical policy search via return-weighted density estimation (HPSDE), which can efficiently identify the modes through density estimation with return-weighted importance sampling. Our proposed method finds option policies corresponding to the modes of the return function and automatically determines the number and the location of option policies, which significantly reduces the burden of hyperparameter tuning. Through experiments, we demonstrate that the proposed HPSDE successfully learns option policies corresponding to modes of the return function and that it can be successfully applied to a motion planning problem of a redundant robotic manipulator.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > Japan > Honshū > Kantō > Chiba Prefecture > Chiba (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.92)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
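The idea of locating modes of a return function by return-weighted density estimation can be sketched in one dimension. The example below is not HPSDE itself: it swaps in a plain weighted Gaussian kernel density estimate, uses evenly spaced samples as a stand-in for draws from a uniform proposal, and relies on a hypothetical bimodal return function `toy_return`. All names and constants are assumptions for illustration.

```python
import math


def weighted_kde(samples, weights, bandwidth):
    """Return a density function: a Gaussian KDE with per-sample weights."""
    total = sum(weights)
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return sum(
            (w / total) * norm * math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            for s, w in zip(samples, weights)
        )

    return density


def toy_return(x):
    """Hypothetical return function with two modes, near -1.5 and 1.5."""
    return math.exp(-((x + 1.5) ** 2) / 0.1) + math.exp(-((x - 1.5) ** 2) / 0.1)


# Evenly spaced samples stand in for draws from a uniform proposal.
samples = [i / 10 for i in range(-30, 31)]
# Weight each sample by the return it obtained.
weights = [toy_return(x) for x in samples]

density = weighted_kde(samples, weights, bandwidth=0.2)
values = [density(x) for x in samples]

# Local maxima of the weighted density approximate the modes of the return.
modes = [
    samples[i]
    for i in range(1, len(samples) - 1)
    if values[i] > values[i - 1] and values[i] > values[i + 1]
]
print(modes)
```

Each detected mode could then seed one option policy, so the number of options follows from the data rather than from a hand-set hyperparameter.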
Hierarchical Policy Search via Return-Weighted Density Estimation
Osa, Takayuki, Sugiyama, Masashi
Learning an optimal policy from a multi-modal reward function is a challenging problem in reinforcement learning (RL). Hierarchical RL (HRL) tackles this problem by learning a hierarchical policy, where multiple option policies are in charge of different strategies corresponding to modes of a reward function and a gating policy selects the best option for a given context. Although HRL has been demonstrated to be promising, current state-of-the-art methods still cannot perform well in complex real-world problems due to the difficulty of identifying modes of the reward function. In this paper, we propose a novel method called hierarchical policy search via return-weighted density estimation (HPSDE), which can efficiently identify the modes through density estimation with return-weighted importance sampling. Our proposed method finds option policies corresponding to the modes of the return function and automatically determines the number and the location of option policies, which significantly reduces the burden of hyperparameter tuning. Through experiments, we demonstrate that the proposed HPSDE successfully learns option policies corresponding to modes of the return function and that it can be successfully applied to a challenging motion planning problem of a redundant robotic manipulator.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > Japan > Honshū > Kantō > Chiba Prefecture > Chiba (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.92)
Grammar-Based Random Walkers in Semantic Networks
Semantic networks qualify the meaning of an edge relating any two vertices. Determining which vertices are most "central" in a semantic network is difficult because one relationship type may be deemed subjectively more important than another. For this reason, research into semantic network metrics has focused primarily on context-based rankings (i.e., user-prescribed contexts). Moreover, many of the current semantic network metrics rank semantic associations (i.e., directed paths between two vertices) and not the vertices themselves. This article presents a framework for calculating semantically meaningful primary eigenvector-based metrics such as eigenvector centrality and PageRank in semantic networks using a modified version of the random walker model of Markov chain analysis. Random walkers, in the context of this article, are constrained by a grammar, where the grammar is a user-defined data structure that determines the meaning of the final vertex ranking. The ideas in this article are presented within the context of the Resource Description Framework (RDF) of the Semantic Web initiative.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States > Hawaii (0.04)
- (6 more...)
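A heavily simplified way to picture a constrained random walker on labelled edges is PageRank restricted to a set of permitted edge labels. The sketch below assumes the "grammar" is nothing more than that set, which is far less expressive than the data structure the abstract describes; the triples, labels, and damping factor of 0.85 are hypothetical.

```python
def constrained_pagerank(triples, allowed, damping=0.85, iterations=100):
    """PageRank where the walker may only follow edges with an allowed label.

    triples: iterable of (subject, predicate, object) edges.
    allowed: set of predicates the walker is permitted to traverse.
    """
    nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    out = {n: [] for n in nodes}
    for s, p, o in triples:
        if p in allowed:
            out[s].append(o)

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Vertices with no permitted outgoing edge spread their rank uniformly.
        dangling = sum(rank[v] for v in nodes if not out[v])
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out[v]:
                nxt[w] += damping * rank[v] / len(out[v])
        rank = nxt
    return rank


# Hypothetical RDF-like triples about papers (p*) and authors (a*).
triples = [
    ("p1", "cites", "p3"),
    ("p2", "cites", "p3"),
    ("p3", "cites", "p1"),
    ("a1", "wrote", "p2"),
    ("a2", "wrote", "p2"),
    ("a3", "wrote", "p2"),
]

by_citation = constrained_pagerank(triples, allowed={"cites"})
by_authorship = constrained_pagerank(triples, allowed={"wrote"})
print(max(by_citation, key=by_citation.get))
print(max(by_authorship, key=by_authorship.get))
```

Changing the permitted labels changes which vertex ranks highest, which illustrates how the constraint determines the meaning of the final ranking.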