Goto

Collaborating Authors

 Search


Experience Reuse with Probabilistic Movement Primitives

arXiv.org Machine Learning

Acquiring new robot motor skills is cumbersome, as learning a skill from scratch and without prior knowledge requires the exploration of a large space of motor configurations. Accordingly, for learning a new task, time could be saved by restricting the parameter search space by initializing it with the solution of a similar task. We present a framework which is able of such knowledge transfer from already learned movement skills to a new learning task. The framework combines probabilistic movement primitives with descriptions of their effects for skill representation. New skills are first initialized with parameters inferred from related movement primitives and thereafter adapted to the new task through relative entropy policy search. We compare two different transfer approaches to initialize the search space distribution with data of known skills with a similar effect. We show the different benefits of the two knowledge transfer approaches on an object pushing task for a simulated 3-DOF robot. We can show that the quality of the learned skills improves and the required iterations to learn a new task can be reduced by more than 60% when past experiences are utilized.


Multi-owner Secure Encrypted Search Using Searching Adversarial Networks

arXiv.org Artificial Intelligence

Searchable symmetric encryption (SSE) for multi-owner model draws much attention as it enables data users to perform sear ches over encrypted cloud data outsourced by data owners. However, im plement-ing secure and precise query, efficient search and flexible dyn amic system maintenance at the same time in SSE remains a challenge. To ad dress this, this paper proposes secure and efficient multi-keyword ranked search over encrypted cloud data for multi-owner model based on sea rching adversarial networks. We exploit searching adversarial netw orks to achieve optimal pseudo-keyword padding, and obtain the optimal gam e equilibrium for query precision and privacy protection strength. M aximum likelihood search balanced tree is generated by probabilistic l earning, which achieves efficient search and brings the computational compl exity close to O (log N). In addition, we enable flexible dynamic system maintenanc e with balanced index forest that makes full use of distribute d computing. Compared with previous works, our solution maintains query precision above 95% while ensuring adequate privacy protection, and i ntroduces low overhead on computation, communication and storage.


Identification of relevant diffusion MRI metrics impacting cognitive functions using a novel feature selection method

arXiv.org Machine Learning

Mild Traumatic Brain Injury (mTBI) is a significant public health problem. The most troubling symptoms after mTBI are cognitive complaints. Studies show measurable differences between patients with mTBI and healthy controls with respect to tissue microstructure using diffusion MRI. However, it remains unclear which diffusion measures are the most informative with regard to cognitive functions in both the healthy state as well as after injury. In this study, we use diffusion MRI to formulate a predictive model for performance on working memory based on the most relevant MRI features. The key challenge is to identify relevant features over a large feature space with high accuracy in an efficient manner. To tackle this challenge, we propose a novel improvement of the best first search approach with crossover operators inspired by genetic algorithm. Compared against other heuristic feature selection algorithms, the proposed method achieves significantly more accurate predictions and yields clinically interpretable selected features.


Fully Convolutional Search Heuristic Learning for Rapid Path Planners

arXiv.org Artificial Intelligence

Path-planning algorithms are an important part of a wide variety of robotic applications, such as mobile robot navigation and robot arm manipulation. However, in large search spaces in which local traps may exist, it remains challenging to reliably find a path while satisfying real-time constraints. Efforts to speed up the path search have led to the development of many practical path-planning algorithms. These algorithms often define a search heuristic to guide the search towards the goal. The heuristics should be carefully designed for each specific problem to ensure reliability in the various situations encountered in the problem. However, it is often difficult for humans to craft such robust heuristics, and the search performance often degrades under conditions that violate the heuristic assumption. Rather than manually designing the heuristics, in this work, we propose a learning approach to acquire these search heuristics. Our method represents the environment containing the obstacles as an image, and this image is fed into fully convolutional neural networks to produce a search heuristic image where every pixel represents a heuristic value (cost-to-go value to a goal) in the form of a vertex of a search graph. Training the heuristic is performed using previously collected planning results. Our preliminary experiments (2D grid world navigation experiments) demonstrate significant reduction in the search costs relative to a hand-designed heuristic.


Autonomous Target Search with Multiple Coordinated UAVs

Journal of Artificial Intelligence Research

Search and tracking is the problem of locating a moving target and following it to its destination. In this work, we consider a scenario in which the target moves across a large geographical area by following a road network and the search is performed by a team of unmanned aerial vehicles (UAVs). We formulate search and tracking as a combinatorial optimization problem and prove that the objective function is submodular. We exploit this property to devise a greedy algorithm. Although this algorithm does not offer strong theoretical guarantees because of the presence of temporal constraints that limit the feasibility of the solutions, it presents remarkably good performance, especially when several UAVs are available for the mission. As the greedy algorithm suffers when resources are scarce, we investigate two alternative optimization techniques: Constraint Programming (CP) and AI planning. Both approaches struggle to cope with large problems, and so we strengthen them by leveraging the greedy algorithm. We use the greedy solution to warm start the CP model and to devise a domain-dependent heuristic for planning. Our extensive experimental evaluation studies the scalability of the different techniques and identifies the conditions under which one approach becomes preferable to the others.


Unifying System Health Management and Automated Decision Making

Journal of Artificial Intelligence Research

Health management of complex dynamic systems has evolved from simple automated alarms into a subfield of artificial intelligence with techniques for analyzing off-nominal conditions and generating responses. This evolution took place largely apart from the development of automated system control, planning, and scheduling (generally referred to in this work as decision making). While there have been efforts to establish an information exchange between system health management and decision making, successful practical implementations of integrated architectures remain limited. This article proposes that rather than being treated as connected yet distinct entities, system health management and decision making should be unified in their formulations. Enabled by advances in modeling and algorithms, we believe that a unified approach will increase systems' resilience to faults and improve their effectiveness. We overview the prevalent system health management methodology, illustrate its limitations through numerical examples, and describe a proposed unified approach. We then show how typical system health management concepts are accommodated in the proposed approach without loss of functionality or generality. A computational complexity analysis of the unified approach is also provided.


How much data is sufficient to learn high-performing algorithms?

arXiv.org Machine Learning

Algorithms for scientific analysis typically have tunable parameters that significantly influence computational efficiency and solution quality. If a parameter setting leads to strong algorithmic performance on average over a set of typical problem instances, that parameter setting---ideally---will perform well in the future. However, if the set of typical problem instances is small, average performance will not generalize to future performance. This raises the question: how large should this set be? We answer this question for any algorithm satisfying an easy-to-describe, ubiquitous property: its performance is a piecewise-structured function of its parameters. We are the first to provide a unified sample complexity framework for algorithm parameter configuration; prior research followed case-by-case analyses. We present applications from diverse domains, including biology, political science, and economics.


Large-scale traffic signal control using machine learning: some traffic flow considerations

arXiv.org Artificial Intelligence

This paper uses supervised learning, random search and deep reinforcement learning (DRL) methods to control large signalized intersection networks. The traffic model is Cellular Automaton rule 184, which has been shown to be a parameter-free representation of traffic flow, and is the most efficient implementation of the Kinematic Wave model with triangular fundamental diagram. We are interested in the steady-state performance of the system, both spatially and temporally: we consider a homogeneous grid network inscribed on a torus, which makes the network boundary-free, and drivers choose random routes. As a benchmark we use the longest-queue-first (LQF) greedy algorithm. We find that: (i) a policy trained with supervised learning with only two examples outperforms LQF, (ii) random search is able to generate near-optimal policies, (iii) the prevailing average network occupancy during training is the major determinant of the effectiveness of DRL policies. When trained under free-flow conditions one obtains DRL policies that are optimal for all traffic conditions, but this performance deteriorates as the occupancy during training increases. For occupancies > 75% during training, DRL policies perform very poorly for all traffic conditions, which means that DRL methods cannot learn under highly congested conditions. We conjecture that DRL's inability to learn under congestion might be explained by a property of urban networks found here, whereby even a very bad policy produces an intersection throughput higher than downstream capacity. This means that the actual throughput tends to be independent of the policy. Our findings imply that it is advisable for current DRL methods in the literature to discard any congested data when training, and that doing this will improve their performance under all traffic conditions.


Online Planning for Decentralized Stochastic Control with Partial History Sharing

arXiv.org Artificial Intelligence

Computational challenges are further compounded if agents do not possess complete model knowledge. In this paper, we take advantage of the fact that in many problems agents share some common information, or history, termed partial history sharing . Under this information structure the policy search space is greatly reduced. We propose a provably convergent, online tree-search based algorithm that does not require a closed-form model or explicit communication among agents. Interestingly, our algorithm can be viewed as a generalization of several existing heuristic solvers for decentralized partially observable Markov decision processes. T o demonstrate the applicability of the model, we propose a novel collaborative intrusion response model, where multiple agents (defenders) possessing asymmetric information aim to collaboratively defend a computer network. Numerical results demonstrate the performance of our algorithm.


ASNets: Deep Learning for Generalised Planning

arXiv.org Artificial Intelligence

In this paper, we discuss the learning of generalised policies for probabilistic and classical planning problems using Action Schema Networks (ASNets). The ASNet is a neural network architecture that exploits the relational structure of (P)PDDL planning problems to learn a common set of weights that can be applied to any problem in a domain. By mimicking the actions chosen by a traditional, non-learning planner on a handful of small problems in a domain, ASNets are able to learn a generalised reactive policy that can quickly solve much larger instances from the domain. This work extends the ASNet architecture to make it more expressive, while still remaining invariant to a range of symmetries that exist in PPDDL problems. We also present a thorough experimental evaluation of ASNets, including a comparison with heuristic search planners on seven probabilistic and deterministic domains, an extended evaluation on over 18,000 Blocksworld instances, and an ablation study. Finally, we show that sparsity-inducing regularisation can produce ASNets that are compact enough for humans to understand, yielding insights into how the structure of ASNets allows them to generalise across a domain.