Education
Human Natural Instruction of a Simulated Electronic Student
Kaochar, Tasneem (University of Arizona) | Peralta, Raquel Torres (University of Arizona) | Morrison, Clayton T. (University of Arizona) | Walsh, Thomas J. (University of Arizona) | Fasel, Ian R. (University of Arizona) | Beyon, Sumin (University of Arizona) | Tran, Anh (University of Arizona) | Wright, Jeremy (University of Arizona) | Cohen, Paul R. (University of Arizona)
Humans naturally use multiple modes of instruction while teaching one another. We would like our robots and artificial agents to be instructed in the same way, rather than programmed. In this paper, we review prior work on human instruction of autonomous agents and present observations from two exploratory pilot studies and the results of a full study investigating how multiple instruction modes are used by humans. We describe our Bootstrapped Learning User Interface, a prototype multi-instruction interface informed by our human-user studies.
Added Value of Sociofact Analysis for Business Agility
Riss, Uwe V. (SAP Research) | Magenheim, Johannes (University of Paderborn) | Reinhardt, Wolfgang (University of Paderborn) | Nelkner, Tobias (University of Paderborn) | Hinkelmann, Knut (FHNW University of Applied Sciences Northwestern Switzerland)
The increasing agility of business requires organizations to adapt ever faster to continuously changing conditions. Individual and organizational learning are prominent means to achieve this. Learning is always accompanied by the development of knowledge artifacts. The term knowledge maturing has recently been introduced for the entire process of learning and artifact development; it focuses on three manifestations of knowledge: cognifacts, sociofacts, and artifacts. In this paper we focus on sociofacts as the subject-bound knowledge manifestation of social actions. Although sociofacts are rooted in their respective cognifacts, they play an independent role because they are bound to collective actions and subjects. They are particularly difficult to grasp but play a decisive role in the performance of organizations and the collaboration within them. This paper approaches the notion of sociofacts, discusses them on a theoretical level, and establishes a first formal notation for sociofacts. We use the case of a merger between two companies to describe the advantages of sociofact analysis for such a process. Some sociofact-related problems during a merger are described and possible solutions are presented. We identify technical approaches for capturing sociofacts from tool-mediated social interaction and discuss open questions for future research.
Artificial Intelligence and Risk Communication
Green, Nancy L. (University of North Carolina Greensboro)
The challenges of effective health risk communication are well known. This paper provides pointers to the health communication literature that discusses these problems. Tailoring printed information, visual displays, and interactive multimedia have been proposed in the health communication literature as promising approaches. Online risk communication applications are increasing on the internet. However, the potential effectiveness of applications using conventional computer technology is limited. We propose that the use of artificial intelligence, building upon research in Intelligent Tutoring Systems, might be able to overcome these limitations.
Dr. Vicky: A Virtual Coach for Learning Brief Negotiated Interview Techniques for Treating Emergency Room Patients
Magerko, Brian (Georgia Institute of Technology) | Deen, James (Georgia Institute of Technology) | Idnani, Avinash (Georgia Institute of Technology) | Pantalon, Michael (Yale University) | D'Onofrio, Gail (Yale University)
This article presents our work on building a virtual coach agent, called Dr. Vicky, and a training environment (called the Virtual BNI Trainer, or VBT) for learning how to talk correctly with medical patients who have substance abuse issues. This work focuses on how to effectively design menu-based dialogue interactions for conversing with a virtual patient, within the context of learning how to properly engage in such conversations according to the brief negotiated interview techniques we want to teach. Dr. Vicky also employs a model of student knowledge to influence the mediation strategies used in personalizing the training experience and the guidance offered. The VBT is a prototype training application that will be used by medical students and practitioners within the Yale medical community in the future.
Reinforcement Learning with Human Feedback in Mountain Car
Knox, W. Bradley (University of Texas at Austin) | Setapen, Adam Bradley (Massachusetts Institute of Technology) | Stone, Peter (University of Texas at Austin)
As computational agents are increasingly used beyond research labs, their success will depend on their ability to learn new skills and adapt to their dynamic, complex environments. If human users without programming skills can transfer their task knowledge to the agents, learning rates can increase dramatically, reducing costly trials. The TAMER framework guides the design of agents whose behavior can be shaped through signals of approval and disapproval, a natural form of human feedback. Whereas early work on TAMER assumed that the agent's only feedback was from the human teacher, this paper considers the scenario of an agent within a Markov decision process (MDP), receiving and simultaneously learning from both MDP reward and human reinforcement signals. Preserving MDP reward as the determinant of optimal behavior, we test two methods of combining human reinforcement and MDP reward and analyze their respective performances. Both methods create a predictive model, H-hat, of human reinforcement and use that model in different ways to augment a reinforcement learning (RL) algorithm. We additionally introduce a technique for appropriately determining the magnitude of the model's influence on the RL algorithm throughout time and the state space.
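The abstract does not say which two combination methods are tested, so the following is only a hedged sketch of two generic ways a learned model of human reinforcement could augment an RL agent: adding it to the MDP reward, or using it to bias greedy action selection. The function names, the weight `beta`, and the toy `q` and `h_hat` callables are all assumptions made for illustration, not the paper's definitions.

```python
def shaped_reward(mdp_reward, h_hat, state, action, beta):
    """Option A (assumed): add the weighted human-reinforcement prediction to the MDP reward."""
    return mdp_reward + beta * h_hat(state, action)


def biased_greedy_action(q, h_hat, state, actions, beta):
    """Option B (assumed): leave the reward alone and bias action choice by the prediction."""
    return max(actions, key=lambda a: q(state, a) + beta * h_hat(state, a))


# Toy stand-ins (hypothetical): the human model prefers "right",
# while the current value estimate slightly prefers "left".
def h_hat(state, action):
    return 1.0 if action == "right" else -1.0


def q(state, action):
    return {"left": 0.5, "right": 0.0}[action]


actions = ["left", "right"]
no_influence = biased_greedy_action(q, h_hat, 0, actions, beta=0.0)
full_influence = biased_greedy_action(q, h_hat, 0, actions, beta=1.0)
```

Setting `beta` to zero recovers the plain RL agent, so MDP reward remains the sole determinant of behavior once the human model's influence is annealed away; how that weight should vary over time and across the state space is exactly what the abstract leaves to the paper.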
PATSy and VL-PATSy: Online Case-Based Training for Healthcare Professionals
Cox, Richard J. (University of Edinburgh)
This paper describes PATSy, an online repository of virtual patient cases for training and research for students and practitioners in the clinical sciences. A typical student session with PATSy is illustrated. An extension to PATSy that adds vicarious learning resources (VL-PATSy) is also described. The concept of vicarious learning is outlined and results from a study of learning outcomes from VL-PATSy are presented. PATSy and VL-PATSy will be demonstrated at the symposium.
Refining Recency Search Results with User Click Feedback
Moon, Taesup, Chu, Wei, Li, Lihong, Zheng, Zhaohui, Chang, Yi
Traditional machine-learned ranking systems for web search are often trained to capture stationary relevance of documents to queries, which limits their ability to track non-stationary user intention in a timely manner. In recency search, for instance, the relevance of documents to a query on breaking news often changes significantly over time, requiring effective adaptation to user intention. In this paper, we focus on recency search and study a number of algorithms to improve ranking results by leveraging user click feedback. Our contributions are three-fold. First, we use real search sessions collected in a random exploration bucket for reliable offline evaluation of these algorithms, which provides an unbiased comparison across algorithms without online bucket tests. Second, we propose a re-ranking approach to improve search results for recency queries using user clicks. Third, our empirical comparison of a dozen algorithms on real-life search data suggests the importance of a few algorithmic choices in these applications, including generalization across different query-document pairs, specialization to popular queries, and real-time adaptation to user clicks.
A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Ross, Stephane, Gordon, Geoffrey J., Bagnell, J. Andrew
Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. This leads to poor performance in theory and often in practice. Some recent approaches provide stronger guarantees in this setting, but remain somewhat unsatisfactory as they train either non-stationary or stochastic policies and require a large number of iterations. In this paper, we propose a new iterative algorithm that trains a stationary deterministic policy and can be seen as a no-regret algorithm in an online learning setting. We show that any such no-regret algorithm, combined with additional reduction assumptions, must find a policy with good performance under the distribution of observations it induces in such sequential settings. We demonstrate that this new approach outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling problem.
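The abstract does not spell out the update rule, so the sketch below only illustrates the general idea it points at: roll out the learner's own policy, have an expert label the states the learner actually visits, aggregate those labels, and refit a single stationary deterministic policy. The chain environment, the lookup-table learner, the default action, and all names are assumptions chosen to keep the example tiny.

```python
from collections import Counter, defaultdict

N_STATES = 7   # toy chain of states 0..6 (assumed environment)
GOAL = 3       # the expert always walks toward this state


def expert(s):
    """Hypothetical expert: step toward GOAL, stay put once there."""
    return 1 if s < GOAL else (-1 if s > GOAL else 0)


def step(s, a):
    """Deterministic chain dynamics, clipped at both ends."""
    return max(0, min(N_STATES - 1, s + a))


def fit(dataset):
    """Majority-vote lookup table; unseen states default to action +1 (assumption)."""
    votes = defaultdict(Counter)
    for s, a in dataset:
        votes[s][a] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    return lambda s: table.get(s, 1)


def rollout(policy, start, horizon):
    """States visited when following `policy` for `horizon` steps."""
    s, visited = start, []
    for _ in range(horizon):
        visited.append(s)
        s = step(s, policy(s))
    return visited


def aggregate_and_train(iterations=3, horizon=10, start=0):
    dataset = []
    policy = fit(dataset)  # untrained learner: always takes the default action
    for _ in range(iterations):
        # Label the states the learner's own policy reaches, not the expert's.
        for s in rollout(policy, start, horizon):
            dataset.append((s, expert(s)))
        policy = fit(dataset)  # one stationary deterministic policy on all data so far
    return policy


learned = aggregate_and_train()
```

Because the untrained learner overshoots the goal in its first rollout, the expert gets to label states it would never have visited on its own, which is the situation the i.i.d. assumption fails to cover.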
Narrowing the Modeling Gap: A Cluster-Ranking Approach to Coreference Resolution
Traditional learning-based coreference resolvers operate by training the mention-pair model for determining whether two mentions are coreferent or not. Though conceptually simple and easy to understand, the mention-pair model is linguistically rather unappealing and lags far behind the heuristic-based coreference models proposed in the pre-statistical NLP era in terms of sophistication. Two independent lines of recent research have attempted to improve the mention-pair model, one by acquiring the mention-ranking model to rank preceding mentions for a given anaphor, and the other by training the entity-mention model to determine whether a preceding cluster is coreferent with a given mention. We propose a cluster-ranking approach to coreference resolution, which combines the strengths of the mention-ranking model and the entity-mention model, and is therefore theoretically more appealing than both of these models. In addition, we seek to improve cluster rankers via two extensions: (1) lexicalization and (2) incorporating knowledge of anaphoricity by jointly modeling anaphoricity determination and coreference resolution. Experimental results on the ACE data sets demonstrate the superior performance of cluster rankers to competing approaches as well as the effectiveness of our two extensions.
Distributed Autonomous Online Learning: Regrets and Intrinsic Privacy-Preserving Properties
Yan, Feng, Sundaram, Shreyas, Vishwanathan, S. V. N., Qi, Yuan
Online learning has become increasingly popular for handling massive data. The sequential nature of online learning, however, requires a centralized learner to store data and update parameters. In this paper, we consider online learning with distributed data sources. The autonomous learners update local parameters based on local data sources and periodically exchange information with a small subset of neighbors in a communication network. We derive the regret bound for strongly convex functions that generalizes the work by Ram et al. (2010) for convex functions. Most importantly, we show that our algorithm has intrinsic privacy-preserving properties, and we prove the sufficient and necessary conditions for privacy preservation in the network. These conditions imply that for networks with greater-than-one connectivity, a malicious learner cannot reconstruct the subgradients (and sensitive raw data) of other learners, which makes our algorithm appealing in privacy-sensitive applications.
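As a rough illustration of the setting described here, and not the authors' algorithm, the sketch below has each learner take a gradient step on its own local example and then average parameters with its neighbors. The ring topology, the equal mixing weights, the L2-regularized squared loss, and the step size 1/(lam * t) are all assumptions; only parameters cross the network, never raw examples.

```python
def ring_weights(n):
    """Mixing matrix for a ring: each learner averages itself and its two neighbors."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def distributed_online_gd(streams, dim, lam=1.0):
    """streams[i] is learner i's private list of (x, y) examples.

    Each round: local gradient step on the regularized squared loss
    0.5 * (w.x - y)^2 + 0.5 * lam * |w|^2, then neighbor averaging.
    """
    n = len(streams)
    W = ring_weights(n)
    w = [[0.0] * dim for _ in range(n)]
    rounds = min(len(s) for s in streams)
    for t in range(1, rounds + 1):
        eta = 1.0 / (lam * t)  # decaying step size, a common choice for strongly convex losses
        stepped = []
        for i in range(n):
            x, y = streams[i][t - 1]
            err = sum(wi * xi for wi, xi in zip(w[i], x)) - y
            grad = [err * xi + lam * wi for wi, xi in zip(w[i], x)]
            stepped.append([wi - eta * g for wi, g in zip(w[i], grad)])
        # Exchange parameters only: each learner mixes its neighbors' updated vectors.
        w = [[sum(W[i][j] * stepped[j][k] for j in range(n)) for k in range(dim)]
             for i in range(n)]
    return w


# Four learners, each repeatedly seeing the same scalar example (x=1, y=2).
streams = [[([1.0], 2.0)] * 5 for _ in range(4)]
final = distributed_online_gd(streams, dim=1)
```

In this toy run every learner minimizes 0.5 * (w - 2)^2 + 0.5 * w^2, whose minimizer is w = 1, so all four local parameters should settle there.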