Statistical Learning
A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function
Ortega, Pedro A., Grau-Moya, Jordi, Genewein, Tim, Balduzzi, David, Braun, Daniel A.
We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions. Previous work has focused on representing possible functions explicitly, which leads to a two-step procedure of first, doing inference over the function space and second, finding the extrema of these functions. Here we skip the representation step and directly model the distribution over extrema. To this end, we devise a non-parametric conjugate prior based on a kernel regressor. The resulting posterior distribution directly captures the uncertainty over the maximum of the unknown function. We illustrate the effectiveness of our model by optimizing a noisy, high-dimensional, non-convex objective function.
Block Modeling in Large Social Networks with Many Clusters
Biesan, Shawn (Baldwin Wallace University) | Anthony, Adam (Baldwin Wallace University) | desJardins, Marie (University of Maryland Baltimore County)
In this paper, we present an optimized version of the previously developed Block Modularity algorithm (Anthony, 2009). The original algorithm was a fast, greedy method that effectively discovered a structured clustering in linked data and scaled very well with the number of nodes and edges. The optimized version is scalable in terms of the model complexity; the technique can now be used effectively to discover thousands of clusters in data sets with hundreds of thousands (and possibly more) nodes and edges. The optimization leads to an improvement of the runtime per iteration from cubic to quadratic with a small increase in the constant factor. The algorithm compares favorably with Karrer and Newman's Degree-Corrected Block Model (DCBM) in both runtime and quality of results.
Learning to Select and Generalize Striking Movements in Robot Table Tennis
Muelling, Katharina (Max Planck Institute for Intelligent Systems) | Kober, Jens (Max Planck Institute for Intelligent Systems) | Kroemer, Oliver (Technische Universitaet Darmstadt) | Peters, Jan (Technische Universitaet Darmstadt)
Learning new motor tasks autonomously from interaction with a human being is an important goal for both robotics and machine learning. However, when moving beyond basic skills, most monolithic machine learning approaches fail to scale. In this paper, we take the task of learning table tennis as an example and present a new framework which allows a robot to learn cooperative table tennis from interaction with a human. To this end, the robot first learns a set of elementary table tennis hitting movements from a human teacher by kinesthetic teach-in, which is compiled into a set of dynamical system motor primitives (DMPs). Subsequently, the system generalizes these movements to a wider range of situations using our mixture of motor primitives (MoMP) approach. The resulting policy enables the robot to select appropriate motor primitives as well as to generalize between them. Finally, the robot plays with a human table tennis partner and learns online to improve its behavior.
An Intelligent Nutritional Assessment System
Eskin, Yulia (University of Toronto) | Mihailidis, Alex (University of Toronto)
Higher life expectancies lead to an increased prevalence of dementia in older adults, which is projected to rise dramatically in the future. The link between malnutrition and dementia highlights the need to closely monitor nutrition as early as possible. However, current self-report assessment methods are labor-intensive, time-consuming and inaccurate. Technology has the potential of assisting in nutritional analysis by alleviating the cognitive load of recording food intake and lessening the burden of care for the elderly. Therefore, we propose an intelligent nutritional assessment system that will monitor the dietary patterns of older adults with dementia at their homes. Our computer vision-based system consists of food recognition and portion estimation algorithms that, together, provide nutritional analysis of an image of a meal. We create a novel food image dataset on which we achieve an 87.2% recognition accuracy. We apply several well-known segmentation and recognition algorithms and analyze their suitability to the food recognition problem.
Learning to Avoid Collisions
Sklar, Elizabeth (Brooklyn College, City University of New York) | Parsons, Simon (Brooklyn College, City University of New York) | Epstein, Susan L. (Hunter College, City University of New York) | Ozgelen, Arif Tuna (The Graduate Center, City University of New York) | Munoz, Juan Pablo (The Graduate Center, City University of New York) | Abbasi, Farah (College of Staten Island, City University of New York) | Schneider, Eric (Hunter College, City University of New York) | Costantino, Michael (College of Staten Island, City University of New York)
Members of a multi-robot team, operating within close quarters, need to avoid crashing into each other. Simple collision avoidance methods can be used to prevent such collisions, typically by computing the distance to other robots and stopping, perhaps moving away, when this distance falls below a certain threshold. While this approach may avoid disaster, it may also reduce the team's efficiency if robots halt for a long time to let others pass by or if they travel further to move around one another. This paper reports on experiments where a human operator, through a graphical user interface, watches robots perform an exploration task. The operator can manually suspend robots' movements before they crash into each other, and then resume their movements when their paths are clear. Experiment logs record the robots' states when they are paused and resumed. A behavior pattern for collision avoidance is learned, by classifying the states of the robots' environment when the human operator issues "wait" and "resume" commands. Preliminary results indicate that it is possible to learn a classifier which models these behavior patterns, and that different human operators consider different factors when making decisions about stopping and starting robots.
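The simple threshold rule that this abstract describes as the baseline (compute the distance to other robots and stop when it falls below a threshold) could be sketched roughly as follows. This illustrates only that baseline, not the learned classifier the paper reports on; the function name `should_wait`, the 2-D position tuples, and the threshold values are all hypothetical choices made for the example.

```python
import math

def should_wait(own_xy, others_xy, threshold):
    """Hypothetical baseline rule: wait if any other robot is closer than threshold.

    own_xy    -- (x, y) position of this robot (assumed 2-D for illustration)
    others_xy -- iterable of (x, y) positions of the other robots
    threshold -- minimum allowed separation, in the same units as the positions
    """
    # math.dist computes the Euclidean distance between two points.
    return any(math.dist(own_xy, other) < threshold for other in others_xy)
```

A robot would call `should_wait` each control step and resume once it returns False. The abstract's point is that such a fixed rule may avoid collisions yet cost the team efficiency, which motivates learning the wait/resume decision from a human operator instead.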
Between Instruction and Reward: Human-Prompted Switching
Pilarski, Patrick M. (University of Alberta) | Sutton, Richard S. (University of Alberta)
Intelligent systems promise to amplify, augment, and extend innate human abilities. A principal example is that of assistive rehabilitation robots---artificial intelligence and machine learning enable new electromechanical systems that restore biological functions lost through injury or illness. In order for an intelligent machine to assist a human user, it must be possible for a human to communicate their intentions and preferences to their non-human counterpart. While there are a number of techniques that a human can use to direct a machine learning system, most research to date has focused on the contrasting strategies of instruction and reward. The primary contribution of our work is to demonstrate that the middle ground between instruction and reward is a fertile space for research and immediate technological progress. To support this idea, we introduce the setting of human-prompted switching, and illustrate the successful combination of switching with interactive learning using a concrete real-world example: human control of a multi-joint robot arm. We believe techniques that fall between the domains of instruction and reward are complementary to existing approaches, and will open up new lines of rapid progress for interactive human training of machine learning systems.
Active Imitation Learning via Reduction to I.I.D. Active Learning
Judah, Kshitij (Oregon State University) | Fern, Alan Paul (Oregon State University) | Dietterich, Thomas Glenn (Oregon State University)
In standard passive imitation learning, the goal is to learn an expert's policy by passively observing full execution trajectories of it. Unfortunately, generating such trajectories can require substantial expert effort and be impractical in some cases. In this paper, we consider Active Imitation Learning (AIL) with the goal of reducing this effort by querying the expert about the desired action at individual states, which are selected based on answers to past queries and the learner's interactions with an environment simulator. Our new approach is based on reducing AIL to i.i.d. active learning, which can leverage progress in the i.i.d. setting. We introduce and analyze reductions for both non-stationary and stationary policies, showing that the label complexity (number of queries) of AIL can be substantially less than passive learning. We also introduce a practical algorithm inspired by the reductions, which is shown to be highly effective in four test domains compared to a number of alternatives.
Novel Interaction Strategies for Learning from Teleoperation
Akgun, Baris (Georgia Institute of Technology) | Subramanian, Kaushik (Georgia Institute of Technology) | Thomaz, Andrea Lockerd (Georgia Institute of Technology)
The field of robot Learning from Demonstration (LfD) makes use of several input modalities for demonstrations (teleoperation, kinesthetic teaching, marker- and vision-based motion tracking). In this paper we present two experiments aimed at identifying and overcoming challenges associated with using teleoperation as an input modality for LfD. Our first experiment compares kinesthetic teaching and teleoperation and highlights some inherent problems associated with teleoperation; specifically uncomfortable user interactions and inaccurate robot demonstrations. Our second experiment is focused on overcoming these problems and designing the teleoperation interaction to be more suitable for LfD. In previous work we have proposed a novel demonstration strategy using the concept of keyframes, where demonstrations are in the form of a discrete set of robot configurations. Keyframes can be naturally combined with continuous trajectory demonstrations to generate a hybrid strategy. We perform user studies to evaluate each of these demonstration strategies individually and show that keyframes are intuitive to the users and are particularly useful in providing noise-free demonstrations. We find that users prefer the hybrid strategy best for demonstrating tasks to a robot by teleoperation.
Global and Local Approach of Part-of-Speech Tagging for Large Corpora
Yu, Shi (University of Chicago) | Grossman, Robert (University of Chicago) | Rzhetsky, Andrey (University of Chicago)
We present Global-Local POS tagging, a framework to train generative stochastic Part-of-Speech models on large corpora. Global Taggers offer several advantages over their counterparts trained on small, curated corpora, including the ability to automatically extend and update their models to new text. Global Taggers also avoid a fundamental limitation of current models, whose performance heavily relies on curated text with manually assigned labels. We illustrate our approach by training several Global Taggers, implemented with generative stochastic models, on two large corpora using high performance computing architecture. We further demonstrate that Global Taggers can be improved by incorporating models trained on curated text, called Local Taggers, for better tagging performance derived from specific topics.
PROBE: Periodic Random Orbiter Algorithm for Machine Learning
Smith, Larry (National Institutes of Health) | Kim, Won (National Institutes of Health) | Wilbur, W. John
We present a new algorithm, which we call PROBE, to find the minimum of a convex function. Such a minimization is important in many machine learning methods, including Support Vector Machines (SVM). We show that PROBE is a viable alternative to published algorithms for SVM learning with several important advantages. PROBE is a simple and easily programmed algorithm, with a well-defined, parametrized stopping criterion; it is not limited to SVM, but can be applied to other convex loss functions, such as the Huber and Maximum Entropy models; and its time and memory requirements are consistently modest in handling very large training sets.
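The abstract names the Huber loss as one of the convex loss functions the method can be applied to. For reference, a minimal sketch of the standard Huber loss on a residual follows. It shows only the loss, not the PROBE algorithm itself, and the function name `huber` and the default `delta=1.0` are assumptions made for this example.

```python
def huber(residual, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear in the tails.

    residual -- prediction error (a float)
    delta    -- assumed switch point between the quadratic and linear regimes
    """
    a = abs(residual)
    if a <= delta:
        # Quadratic region: behaves like squared error for small residuals.
        return 0.5 * residual * residual
    # Linear region: grows like absolute error, so outliers are penalized less.
    return delta * (a - 0.5 * delta)
```

The two branches agree in value and slope at `|residual| == delta`, which is what keeps the function convex and continuously differentiable.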