Scalable POMDPs for Diagnosis and Planning in Intelligent Tutoring Systems
Folsom-Kovarik, Jeremiah T. (University of Central Florida) | Sukthankar, Gita (University of Central Florida) | Schatz, Sae (University of Central Florida) | Nicholson, Denise (University of Central Florida)
A promising application area for proactive assistant agents is automated tutoring and training. Intelligent tutoring systems (ITSs) assist tutors and tutees by automating diagnosis and adaptive tutoring. These tasks are well modeled by a partially observable Markov decision process (POMDP) since it accounts for the uncertainty inherent in diagnosis. However, an important aspect of making POMDP solvers feasible for real-world problems is selecting appropriate representations for states, actions, and observations. This paper studies two scalable POMDP state and observation representations. State queues allow POMDPs to temporarily ignore less-relevant states. Observation chains represent information in independent dimensions using sequences of observations to reduce the size of the observation set. Preliminary experiments with simulated tutees suggest the experimental representations perform as well as lossless POMDPs, and can model much larger problems.
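The diagnosis step that makes a POMDP suitable for tutoring can be illustrated with the standard Bayesian belief update. The sketch below is not the paper's state-queue or observation-chain representation; the two-state tutee model, the matrices `T` and `O`, and the function name `belief_update` are all hypothetical and chosen only for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Return the posterior belief after taking `action` and seeing `observation`.

    T[a][s][s2] = P(s2 | s, a) and O[a][s2][o] = P(o | s2, a).
    """
    n = len(belief)
    # Predict: push the current belief through the transition model.
    predicted = [sum(belief[s] * T[action][s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correct: weight each predicted state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2]
                    for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Hypothetical model: state 0 = "has misconception", state 1 = "understands".
T = [[[0.7, 0.3], [0.0, 1.0]]]  # action 0 = "give hint"
O = [[[0.8, 0.2], [0.1, 0.9]]]  # observation 0 = wrong answer, 1 = right answer
posterior = belief_update([0.5, 0.5], action=0, observation=1, T=T, O=O)
```

The update touches every state and the observation table is indexed by every observation, so its cost grows with the size of both sets. That growth is the scaling pressure the abstract's compact representations are meant to relieve.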
Alignments of Manifold Sections of Different Dimensions in Manifold Learning
Ye, Qiang (University of Kentucky) | Zhi, Weifeng (University of Kentucky)
We consider an alignment algorithm for reconstructing global coordinates from local coordinates constructed for sections of manifolds. We show that, under certain conditions, the alignment algorithm can successfully recover global coordinates even when local neighborhoods have different dimensions. Our results generalize an earlier analysis to allow alignment of sections of different dimensions. We also apply our result to a semisupervised learning problem.
Applying Diffusion Distance for Multi-Scale Analysis of An Experience Space
Su, Meng (The Pennsylvania State University) | Fan, Xiaocong (The Pennsylvania State University) | Ge, WeiLi (Zhengzhou University)
Diffusion distance has been shown to be significantly more effective than Euclidean distance in multi-scale recognition of similar experiences in Recognition-Primed Decision making. In this paper, we first examine the experience data set used in the previous study. The visualization of the data set (using the first three dominant eigenvectors of the diffusion space) suggests the applicability of the diffusion approach. Second, we investigate two approaches to the computation of diffusion distance: Spectrum-based and Probability-Matching-based. Specifically, by the ‘Spectrum-based’ approach we refer to the one derived in terms of the eigenvalues/eigenvectors of the normalized diffusion matrix. We use the term ‘Probability-Matching’ to refer to the use of various probability distances, where the original L2 diffusion distance is treated as a special case. Our preliminary results indicate that the L2 diffusion distance performs at least as well as the Spectrum-based distance. Furthermore, when the Spectrum-based approach is applied, we have to use embedding and extension techniques to label new experience data, while such recomputation is not necessary when the L2 diffusion distance is used: we do not need to recompute the diffusion matrix, and hence the diffusion map, each time a new data point is added. The L2 approach is therefore more natural and robust, especially for labeling single new experiences. The numerical examples also show an improvement in performance. We are currently working on several other Probability-Matching approaches (e.g., the Earth Mover’s Distance).
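As a rough illustration of the probability-based view, the sketch below computes a diffusion distance directly from t-step transition probabilities, with no eigendecomposition. It assumes the usual diffusion-maps definition, an L2 distance between rows of the t-step Markov matrix weighted by the inverse stationary distribution, which is taken here to be what the abstract calls the original L2 diffusion distance. The affinity matrix `W` and all function names are hypothetical.

```python
import math


def transition_matrix(W):
    """Row-normalize a symmetric affinity matrix W into a Markov matrix."""
    return [[w / sum(row) for w in row] for row in W]


def mat_power(P, t):
    """Compute P^t by repeated multiplication (adequate for small examples)."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = [[sum(result[i][k] * P[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
    return result


def diffusion_distance(W, x, y, t=1):
    """Weighted L2 distance between the t-step transition rows of x and y."""
    degrees = [sum(row) for row in W]
    total = sum(degrees)
    stationary = [d / total for d in degrees]  # stationary distribution
    Pt = mat_power(transition_matrix(W), t)
    return math.sqrt(sum((Pt[x][z] - Pt[y][z]) ** 2 / stationary[z]
                         for z in range(len(W))))


# Hypothetical affinities: points 0,1 form one tight pair, points 2,3 another.
W = [[1.0, 0.9, 0.1, 0.0],
     [0.9, 1.0, 0.1, 0.1],
     [0.1, 0.1, 1.0, 0.9],
     [0.0, 0.1, 0.9, 1.0]]
within = diffusion_distance(W, 0, 1, t=2)
across = diffusion_distance(W, 0, 3, t=2)
```

With these affinities, `within` comes out smaller than `across`, since points 0 and 1 reach the rest of the graph with similar probabilities.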
Dictionary Optimization for Block-Sparse Representations
Rosenblum, Kevin (Technion - Israel Institute of Technology) | Zelnik-Manor, Lihi (Technion - Israel Institute of Technology) | Eldar, Yonina C. (Technion - Israel Institute of Technology)
Recent work has demonstrated that using a carefully designed dictionary instead of a predefined one can improve the sparsity in jointly representing a class of signals. This has motivated the derivation of learning methods for designing a dictionary which leads to the sparsest representation for a given set of signals. In some applications, the signals of interest can have further structure, so that they can be well approximated by a union of a small number of subspaces (e.g., face recognition and motion segmentation). This implies the existence of a dictionary which enables block-sparse representations of the input signals once its atoms are properly sorted into blocks. In this paper, we propose an algorithm for learning a block-sparsifying dictionary of a given set of signals. We do not require prior knowledge of the association of signals into groups (subspaces). Instead, we develop a method that automatically detects the underlying block structure. This is achieved by iteratively alternating between updating the block structure of the dictionary and updating the dictionary atoms to better fit the data. Our experiments show that for block-sparse data the proposed algorithm significantly improves the dictionary recovery ability and lowers the representation error compared to dictionary learning methods that do not employ block structure.
Building a Job Landscape from Directional Transition Data
Perrault-Joncas, Dominique (University of Washington) | Meila, Marina (University of Washington) | Scott, Marc (New York University)
The analysis of career paths suffers from a lack of exploratory tools and dynamic models, due in part to the inherent high dimensionality of the problem. Paths may be understood as directed traversals through a graph whose nodes consist of "job types," which we define as industry and occupation pairs. We want to develop tools to understand and detect high-level features of both the labor market and the workers moving through it — career dynamics. To do this, we map the discrete space of jobs into a d-dimensional continuous space; proximity between jobs will mean that they are "close" to each other in a non-negligible subset of career paths. This embedding allows one to visualize the job landscape. Moreover, we can map individual or groups of career paths to this space, extract features of their collective structure, and construct statistical tests comparing groups by means of this mapping.
On the Curvature of Pattern Transformation Manifolds: Numerical Estimation and Applications
Kokiopoulou, Effrosyni (Swiss Federal Institute of Technology (ETH), Zurich) | Kressner, Daniel (Swiss Federal Institute of Technology (ETH), Zurich) | Frossard, Pascal (Swiss Federal Institute of Technology (EPFL), Lausanne)
This paper addresses the numerical estimation of the principal curvature of pattern transformation manifolds. When a visual pattern undergoes a geometric transformation, it forms a (sub)manifold in the ambient space, which is usually called the transformation manifold. The manifold curvature is an important property characterizing the manifold geometry, with several applications in manifold learning. We propose an efficient numerical algorithm for estimating the principal curvature at a certain point on the transformation manifold.
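To make the notion of curvature concrete, consider the simplest case of a one-parameter transformation, where the manifold is a curve in the ambient space. The sketch below estimates that curve's curvature with central finite differences. This is a textbook estimate, not the algorithm proposed in the paper, and the function `curve_curvature` and the circle example are hypothetical.

```python
import math


def curve_curvature(gamma, t, h=1e-3):
    """Estimate the curvature of a parametrized curve gamma at parameter t.

    Uses kappa = sqrt(|g'|^2 |g''|^2 - (g' . g'')^2) / |g'|^3, with the
    derivatives approximated by central finite differences of step h.
    """
    p_minus, p, p_plus = gamma(t - h), gamma(t), gamma(t + h)
    d1 = [(a - b) / (2 * h) for a, b in zip(p_plus, p_minus)]
    d2 = [(a - 2 * c + b) / (h * h) for a, c, b in zip(p_plus, p, p_minus)]
    n1 = sum(v * v for v in d1)
    n2 = sum(v * v for v in d2)
    dot = sum(u * v for u, v in zip(d1, d2))
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(n1 * n2 - dot * dot, 0.0)) / n1 ** 1.5


def circle(theta, radius=2.0):
    """Hypothetical 'pattern': a point rotated about the origin."""
    return [radius * math.cos(theta), radius * math.sin(theta)]


kappa = curve_curvature(circle, 0.3)
```

For a circle of radius 2 the exact curvature is 1/2, which gives a quick sanity check on the estimate.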
Hierarchical Clustering Via Localized Diffusion Folders
David, Gil (Yale University) | Averbuch, Amir (Tel-Aviv University) | Coifman, Ronald R. (Yale University)
Data clustering is a common technique for statistical data analysis. It is used in many fields including machine learning, data mining, customer segmentation, trend analysis, pattern recognition and image analysis. The proposed Localized Diffusion Folders methodology performs hierarchical clustering of high-dimensional datasets. The diffusion folders are multi-level data partitioning into local neighborhoods that are generated by several random selections of data points and folders in a diffusion graph and by defining local diffusion distances between them. This multi-level partitioning defines an improved localized geometry of the data and a localized Markov transition matrix that is used for the next time step in the diffusion process. The result of this clustering method is a bottom-up hierarchical clustering of the data while each level in the hierarchy contains localized diffusion folders of folders from the lower levels. This methodology preserves the local neighborhood of each point while eliminating noisy connections between distinct points and areas in the graph. The performance of the algorithms is demonstrated on real data and it is compared to existing methods.
Treating Epilepsy by Reinforcement Learning Via Manifold-Based Simulation
Bush, Keith (University of Arkansas at Little Rock) | Pineau, Joelle (School of Computer Science, McGill University)
The ability to take intelligent actions in real-world domains is a goal of great interest in the machine learning community. Unfortunately, the real world is filled with systems that can be partially observed but cannot, as yet, be described by first-principles models. Moreover, the traditional paradigm of direct interaction with the environment used in reinforcement learning (RL) is often prohibitively expensive in practice. An alternative approach that simultaneously solves both of these problems is to gain experience in simulation; the simulation in this approach is a computational model derived from observations. Advances in sensory and information technology are simplifying the acquisition and distribution of real-world datasets to computational scientists; thus, the barrier to linking intelligent control with real-world domains is becoming one of identifying high-quality state-space and transition functions directly from observations. From a dynamical systems perspective, this barrier is analogous to the problem of finding high-quality manifold embeddings, and a rich literature of theory and practice exists to address it. The contribution of this work is two-fold. First, we describe an approach for learning optimal control strategies directly from observations using manifold embeddings as the intermediate state representation. Second, we demonstrate how control strategies constructed in this way can answer important scientific questions. As a concrete example, we use our approach to guide experimental decisions in neurostimulation treatments of epilepsy.
Stratification Learning through Homology Inference
Bendich, Paul (Institute of Science and Technology Austria) | Mukherjee, Sayan (Duke University) | Wang, Bei (Duke University)
We develop a topological approach to stratification learning. Given point cloud data drawn from a stratified space, our objective is to infer which points belong to the same strata. First we define a multi-scale notion of a stratified space, giving a stratification for each radius level. We then use methods derived from kernel and cokernel persistent homology to cluster the data points into different strata, and we prove a result which guarantees the correctness of our clustering, given certain topological conditions. We later give bounds on the minimum number of sample points required to infer, with probability, which points belong to the same strata. Finally, we give an explicit algorithm for the clustering and apply it to some simulated data.
Eye Spy: Improving Vision through Dialog
Vogel, Adam (Stanford University) | Raghunathan, Karthik (Stanford University) | Jurafsky, Dan (Stanford University)
Despite efforts to build robust vision systems, robots in new environments inevitably encounter new objects. Traditional supervised learning requires gathering and annotating sample images in the environment, usually in the form of bounding boxes or segmentations. This training interface takes some experience to use correctly and is quite tedious. We report work in progress on a robotic dialog system to learn names and attributes of objects through spoken interaction with a human teacher. The robot and human play a variant of the children’s games “I Spy” and “20 Questions”. In our game, the human places objects of interest in front of the robot, then picks an object in her head. The robot asks a series of natural language questions about the target object, with the goal of pointing at the correct object while asking a minimum number of questions. The questions range from attributes such as color (“Is it red?”) to category questions (“Is it a cup?”). The robot selects questions to ask based on an information gain criterion, seeking to minimize the entropy of the visual model given the answer to the question.
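The question-selection idea can be sketched as picking the yes/no question with the lowest expected posterior entropy. The example below is a simplification under stated assumptions: it tracks a belief over which object is the target rather than the paper's visual model, it assumes noiseless answers and known attributes, and every object, question, and function name is hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_entropy(belief, answers):
    """Expected entropy of the target belief after hearing the answer.

    belief maps object -> probability of being the target;
    answers maps object -> True/False, the answer if that object is the target.
    """
    total = 0.0
    for answer in (True, False):
        mass = sum(p for obj, p in belief.items() if answers[obj] == answer)
        if mass > 0:
            posterior = [p / mass for obj, p in belief.items()
                         if answers[obj] == answer]
            total += mass * entropy(posterior)
    return total


def best_question(belief, questions):
    """Pick the question that minimizes expected posterior entropy."""
    return min(questions, key=lambda q: expected_entropy(belief, questions[q]))


# Hypothetical scene with four equally likely target objects.
belief = {"red cup": 0.25, "red ball": 0.25, "blue book": 0.25, "green pen": 0.25}
questions = {
    "Is it red?": {"red cup": True, "red ball": True,
                   "blue book": False, "green pen": False},
    "Is it a cup?": {"red cup": True, "red ball": False,
                     "blue book": False, "green pen": False},
}
chosen = best_question(belief, questions)
```

Here "Is it red?" splits the candidates evenly and leaves one bit of expected uncertainty, while "Is it a cup?" usually leaves three candidates, so the even split is preferred.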