Education
Human-Centered Cognitive Orthoses: Artificial Intelligence for, Rather than Instead of, the People
Neuhaus, Peter (Florida Institute for Human and Machine Cognition (IHMC)) | Raj, Anil (Florida Institute for Human and Machine Cognition (IHMC)) | Clancey, William J. (Florida Institute for Human and Machine Cognition (IHMC))
This issue of AI Magazine includes six articles on cognitive orthoses, which we broadly conceive as technological approaches that amplify or enhance individual or team cognition across a wide range of goals and activities. The articles are grouped by how they relate to orthosis-enhanced socio-technical team intelligence at three cognitive levels: sensorimotor physical, professional learning, and networked knowledge.
What Do You Need to Know to Use a Search Engine? Why We Still Need to Teach Research Skills
For the vast majority of queries (for example, navigation and simple fact lookup), search engines do extremely well. Their ability to quickly provide answers to queries is a remarkable testament to the power of many of the fundamental methods of AI. Search engines also highlight many of the issues that are common to sophisticated AI question-answering systems. It has become clear that people think of search programs in ways that are very different from traditional information sources. Rapid and ready-at-hand access, depth of processing, and the way they enable people to offload some ordinary memory tasks suggest that search engines have become more of a cognitive amplifier than a simple repository or front-end to the Internet. As with all sophisticated tools, people still need to learn how to use them. Although search engines are superb at finding and presenting information (up to and including extracting complex relations and making simple inferences), knowing how to frame questions and evaluate the results for accuracy and credibility remains an ongoing challenge. Some questions are still deep and complex, and still require knowledge on the part of the search user to work through to a successful answer. And because the underlying information content, user interfaces, and capabilities are all in a continual state of change, searchers need to continually update their knowledge of what these programs can (and cannot) do.
Efficient Learning by Directed Acyclic Graph For Resource Constrained Prediction
Wang, Joseph, Trapeznikov, Kirill, Saligrama, Venkatesh
We study the problem of reducing test-time acquisition costs in classification systems. Our goal is to learn decision rules that adaptively select sensors for each example as necessary to make a confident prediction. We model our system as a directed acyclic graph (DAG) where internal nodes correspond to sensor subsets and decision functions at each node choose whether to acquire a new sensor or classify using the available measurements. This problem can be naturally posed as an empirical risk minimization over training data. Rather than jointly optimizing such a highly coupled and non-convex problem over all decision nodes, we propose an efficient algorithm motivated by dynamic programming. We learn node policies in the DAG by reducing the global objective to a series of cost-sensitive learning problems. Our approach is computationally efficient and has proven guarantees of convergence to the optimal system for a fixed architecture. In addition, we present an extension to map other budgeted learning problems with a large number of sensors to our DAG architecture and demonstrate empirical performance exceeding state-of-the-art algorithms for data composed of both few and many sensors.
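The acquire-or-classify decision described above can be illustrated with a minimal sketch. It assumes the simplest possible DAG (a single chain with a fixed sensor order) and uses a hand-set confidence threshold in place of the learned node decision functions; `adaptive_predict`, `mean_vote`, and the toy readings are hypothetical names chosen for illustration, not the authors' method.

```python
from typing import Callable, Dict, Sequence, Tuple


def adaptive_predict(
    read_sensor: Callable[[int], float],
    costs: Sequence[float],
    classify: Callable[[Dict[int, float]], Tuple[int, float]],
    threshold: float,
) -> Tuple[int, float]:
    """Acquire sensors in a fixed order until the classifier is confident.

    Assumption: a fixed threshold replaces the learned per-node policy.
    Returns the predicted label and the total acquisition cost paid.
    """
    if not costs:
        raise ValueError("at least one sensor is required")
    acquired: Dict[int, float] = {}
    total_cost = 0.0
    label = -1
    for idx, cost in enumerate(costs):
        acquired[idx] = read_sensor(idx)   # pay for one more measurement
        total_cost += cost
        label, confidence = classify(acquired)
        if confidence >= threshold:        # confident enough: stop acquiring
            break
    return label, total_cost


def mean_vote(acquired: Dict[int, float]) -> Tuple[int, float]:
    """Toy classifier: label by the mean reading, confidence by its margin."""
    mean = sum(acquired.values()) / len(acquired)
    return (1 if mean >= 0.5 else 0), abs(mean - 0.5) * 2.0


if __name__ == "__main__":
    sensor_costs = [1.0, 2.0, 4.0]
    easy = [0.9, 0.2, 0.8]   # first reading is already decisive
    hard = [0.6, 0.9, 0.9]   # never reaches the threshold, so all sensors are used
    print(adaptive_predict(easy.__getitem__, sensor_costs, mean_vote, 0.7))
    print(adaptive_predict(hard.__getitem__, sensor_costs, mean_vote, 0.7))
```

The easy example stops after one cheap sensor, while the hard example pays for all three; the abstract's approach would instead learn when to stop, and which sensor subset to move to, from training data.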
Probabilistic Curve Learning: Coulomb Repulsion and the Electrostatic Gaussian Process
Learning of low-dimensional structure in multidimensional data is a canonical problem in machine learning. One common approach is to suppose that the observed data are close to a lower-dimensional smooth manifold. There is a rich variety of manifold learning methods available, which allow mapping of data points to the manifold. However, there is a clear lack of probabilistic methods that allow learning of the manifold along with the generative distribution of the observed data. The best attempt is the Gaussian process latent variable model (GP-LVM), but identifiability issues lead to poor performance. We solve these issues by proposing a novel Coulomb repulsive process (Corp) for locations of points on the manifold, inspired by physical models of electrostatic interactions among particles. Combining this process with a GP prior for the mapping function yields a novel electrostatic GP (electroGP) process. Focusing on the simple case of a one-dimensional manifold, we develop efficient inference algorithms, and illustrate substantially improved performance in a variety of experiments including filling in missing frames in video.
Cognition as a Service: An Industry Perspective
Spohrer, Jim (IBM Research, Almaden) | Banavar, Guruduth (IBM Research)
Recent advances in cognitive computing componentry combined with other factors are leading to commercially viable cognitive systems. From chips to smart phones to public and private clouds, industrial strength “cognition as a service” is beginning to appear at all scales in business and society. Furthermore, in the age of zettabytes on the way to yottabytes, the designers, engineers, and managers of future smart systems will depend on cognition as a service. Cognition as a service can help unlock the mysteries of big data and ultimately boost the creativity and productivity of professionals and their teams, the productive output of industries and organizations, as well as the GDP (gross domestic product) of regions and nations. In this and the next decade, cognition as a service will allow us to reimagine work practices, augmenting and scaling expertise to transform professions, industries, and regions.
Cognitive Prosthetics for Fostering Learning: A View from the Learning Sciences
Kolodner, Janet L. (The Concord Consortium)
This article is aimed at helping AI researchers and practitioners imagine roles intelligent technologies might play in the many different and varied ecosystems in which people learn. My observations are based on learning sciences research of the past several decades, the possibilities of new technologies of the past few years, and my experience as program officer for the National Science Foundation’s Cyberlearning and Future Learning Technologies program. My thesis is that new technologies have the potential to transform possibilities for fostering learning in both formal and informal learning environments by making it possible and manageable for learners to engage in the kinds of project work that professionals engage in and learn important content, skills, practices, habits, and dispositions from those experiences. The expertise of AI researchers and practitioners is critical to that vision, but it will require teaming up with others, for example, technology imagineers, educators, and learning scientists.
Extending the Diagnostic Capabilities of Artificial Intelligence-Based Instructional Systems
Mathan, Santosh (Honeywell Labs) | Yeung, Nick (University of Oxford)
Active problem solving has been shown to be one of the most effective ways to acquire complex skills. Whether one is learning a programming language by implementing a computer program, or learning calculus by solving problems, context-sensitive feedback and guidance are crucial to keeping problem solving efforts fruitful and efficient. This article reviews AI-based algorithms that can diagnose student difficulties during active problem solving and serve as the basis for providing context-sensitive and individualized guidance. The article also describes the crucial role sensor-based estimates of cognitive resources such as working memory capacity and attention can play in enhancing the diagnostic capabilities of intelligent instructional systems.
Bayesian Dark Knowledge
Balan, Anoop Korattikara, Rathod, Vivek, Murphy, Kevin P., Welling, Max
We consider the problem of Bayesian parameter estimation for deep neural networks, which is important in problem settings where we may have little data, and/or where we need accurate posterior predictive densities p(y|x, D), e.g., for applications involving bandits or active learning. One simple approach to this is to use online Monte Carlo methods, such as SGLD (stochastic gradient Langevin dynamics). Unfortunately, such a method needs to store many copies of the parameters (which wastes memory), and needs to make predictions using many versions of the model (which wastes time). We describe a method for “distilling” a Monte Carlo approximation to the posterior predictive density into a more compact form, namely a single deep neural network. We compare to two very recent approaches to Bayesian neural networks, namely an approach based on expectation propagation [HLA15] and an approach based on variational Bayes [BCKW15]. Our method performs better than both of these, is much simpler to implement, and uses less computation at test time.
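To make the cost of the SGLD baseline concrete, here is a minimal sketch of the standard SGLD update on a deliberately tiny problem: inferring the mean of a unit-variance Gaussian under a wide Gaussian prior, rather than the weights of a deep network. The function name `sgld_samples`, the prior variance, and the step size are assumptions chosen for illustration. The sketch only shows the sampler whose stored samples the abstract proposes to compress; the distillation step itself is not shown.

```python
import math
import random
from typing import List, Sequence


def sgld_samples(
    data: Sequence[float],
    n_steps: int,
    batch_size: int,
    step: float,
    prior_var: float = 100.0,
    burn_in: int = 200,
    seed: int = 0,
) -> List[float]:
    """Draw approximate posterior samples of a Gaussian mean with SGLD.

    Assumed model: y_i ~ N(theta, 1), prior theta ~ N(0, prior_var).
    Update: theta += (step / 2) * grad_estimate + N(0, step) noise.
    """
    rng = random.Random(seed)
    pool = list(data)
    n_total = len(pool)
    theta = 0.0
    samples: List[float] = []
    for t in range(n_steps):
        batch = rng.sample(pool, batch_size)
        grad_prior = -theta / prior_var
        # Minibatch gradient of the log-likelihood, rescaled to the full data set.
        grad_lik = (n_total / batch_size) * sum(y - theta for y in batch)
        noise = rng.gauss(0.0, math.sqrt(step))
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        if t >= burn_in:
            samples.append(theta)  # every retained sample must be stored
    return samples


if __name__ == "__main__":
    data_rng = random.Random(1)
    observations = [data_rng.gauss(2.0, 1.0) for _ in range(100)]
    draws = sgld_samples(observations, n_steps=2200, batch_size=10, step=0.005)
    # Monte Carlo estimate of the posterior predictive mean: average over all draws.
    print(len(draws), sum(draws) / len(draws))
```

Even in this one-parameter example the predictive estimate requires keeping and averaging over thousands of draws; for a deep network each draw is a full copy of the weights, which is the memory and time overhead the abstract describes.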
Adaptive Online Learning
Foster, Dylan J., Rakhlin, Alexander, Sridharan, Karthik
We propose a general framework for studying adaptive regret bounds in the online learning setting, subsuming model selection and data-dependent bounds. Given a data- or model-dependent bound we ask, “Does there exist some algorithm achieving this bound?” We show that modifications to recently introduced sequential complexity measures can be used to answer this question by providing sufficient conditions under which adaptive rates can be achieved. In particular, each adaptive rate induces a set of so-called offset complexity measures, and obtaining small upper bounds on these quantities is sufficient to demonstrate achievability. A cornerstone of our analysis technique is the use of one-sided tail inequalities to bound suprema of offset random processes. Our framework recovers and improves a wide variety of adaptive bounds including quantile bounds, second-order data-dependent bounds, and small-loss bounds. In addition, we derive a new type of adaptive bound for online linear optimization based on the spectral norm, as well as a new online PAC-Bayes theorem.
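For readers unfamiliar with regret, the quantity these bounds control, the following sketch computes it for the classical exponential weights (Hedge) algorithm with a fixed learning rate. This is a standard non-adaptive baseline swapped in for illustration, not the framework proposed in the abstract; `hedge_regret` and the toy loss sequence are hypothetical. With losses in [0, 1] and learning rate sqrt(8 ln N / T), the regret of this algorithm is known to be at most sqrt(T ln N / 2), which is the kind of worst-case rate that adaptive bounds aim to improve on favorable data.

```python
import math
from typing import Sequence


def hedge_regret(losses: Sequence[Sequence[float]]) -> float:
    """Run exponential weights over experts and return its regret.

    losses[t][i] is the loss in [0, 1] of expert i in round t.
    Regret = learner's cumulative expected loss - best expert's cumulative loss.
    """
    n_rounds = len(losses)
    n_experts = len(losses[0])
    eta = math.sqrt(8.0 * math.log(n_experts) / n_rounds)
    cumulative = [0.0] * n_experts
    learner_loss = 0.0
    for round_losses in losses:
        # Weights proportional to exp(-eta * cumulative loss);
        # subtracting the minimum keeps the exponentials numerically stable.
        base = min(cumulative)
        weights = [math.exp(-eta * (c - base)) for c in cumulative]
        total = sum(weights)
        learner_loss += sum(w / total * l for w, l in zip(weights, round_losses))
        cumulative = [c + l for c, l in zip(cumulative, round_losses)]
    return learner_loss - min(cumulative)


if __name__ == "__main__":
    # Toy sequence: expert 0 errs every third round, expert 1 errs otherwise.
    toy = [[1.0, 0.0] if t % 3 == 0 else [0.0, 1.0] for t in range(300)]
    bound = math.sqrt(len(toy) * math.log(2) / 2.0)
    print(hedge_regret(toy), "<=", bound)
```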