Undirected Networks
Promoting Motivation and Self-Regulated Learning Skills through Social Interactions in Agent-based Learning Environments
Biswas, Gautam (Vanderbilt University) | Jeong, Hogyeong (Vanderbilt University) | Roscoe, Rod (Vanderbilt University) | Sulcer, Brian (Vanderbilt University)
We have developed computer environments that support learning by teaching and the use of self-regulated learning (SRL) skills through interactions with virtual agents. More specifically, students teach a computer agent, Betty, and can monitor her progress by asking her questions and getting her to take quizzes. The system provides SRL support via dialog-embedded prompts by Betty, the teachable agent, and Mr. Davis, the mentor agent. Our primary goals have been to support learning in complex science domains and facilitate development of metacognitive skills. More recently, we have also employed sequence analysis schemes and hidden Markov model (HMM) methods for assigning context to and deriving aggregated student behavior sequences from activity data. These techniques allow us to go beyond analyses of individual behaviors, instead examining how these behaviors cohere in larger patterns. We discuss the information derived from these models, and draw inferences on students’ use of self-regulated learning strategies.
Efficient Bayesian analysis of multiple changepoint models with dependence across segments
We consider Bayesian analysis of a class of multiple changepoint models. While there are a variety of efficient ways to analyse these models if the parameters associated with each segment are independent, there are few general approaches for models where the parameters are dependent. Under the assumption that the dependence is Markov, we propose an efficient online algorithm for sampling from an approximation to the posterior distribution of the number and position of the changepoints. In a simulation study, we show that the error introduced by the approximation is negligible. We illustrate the power of our approach by fitting piecewise polynomial models to data, under a model which allows for either continuity or discontinuity of the underlying curve at each changepoint. This method is competitive with, or outperforms, other methods for inferring curves from noisy data; and uniquely it allows for inference of the locations of discontinuities in the underlying curve.
An Immune Inspired Approach to Anomaly Detection
Twycross, Jamie | Aickelin, Uwe
The immune system provides a rich metaphor for computer security: anomaly detection that works in nature should work for machines. However, early artificial immune system approaches for computer security had only limited success. Arguably, this was because these artificial systems were based on too simplistic a view of the immune system. We present here a second-generation artificial immune system for process anomaly detection. It improves on earlier systems by having different artificial cell types that process information. After describing in detail how to build such second-generation systems, we show that communication between cell types is key to performance. Through realistic testing and validation we show that second-generation artificial immune systems are capable of anomaly detection beyond generic system policies. The paper concludes with a discussion and outline of the next steps in this exciting area of computer security.
Multi-Agent Online Planning with Communication
Wu, Feng (University of Science and Technology of China) | Zilberstein, Shlomo (University of Massachusetts at Amherst) | Chen, Xiaoping (University of Science and Technology of China)
We propose an online algorithm for planning under uncertainty in multi-agent settings modeled as DEC-POMDPs. The algorithm helps overcome the high computational complexity of solving such problems off-line. The key challenge is to produce coordinated behavior using little or no communication. When communication is allowed but constrained, the challenge is to produce high value with minimal communication. The algorithm addresses these challenges by communicating only when history inconsistency is detected, allowing communication to be postponed if necessary. Moreover, it bounds the memory usage at each step and can be applied to problems with arbitrary horizons. The experimental results confirm that the algorithm can solve problems that are too large for the best existing off-line planning algorithms and it outperforms the best online method, producing higher value with much less communication in most cases.
Exploiting Coordination Locales in Distributed POMDPs via Social Model Shaping
Varakantham, Pradeep (Singapore Management University) | Kwak, Jun-young (University of Southern California) | Taylor, Matthew (University of Southern California) | Marecki, Janusz (IBM T. J. Watson Research Center) | Scerri, Paul (Carnegie Mellon University) | Tambe, Milind (University of Southern California)
Distributed POMDPs provide an expressive framework for modeling multiagent collaboration problems, but NEXP-Complete complexity hinders their scalability and application in real-world domains. This paper introduces a subclass of distributed POMDPs, and TREMOR, an algorithm to solve such distributed POMDPs. The primary novelty of TREMOR is that agents plan individually with a single agent POMDP solver and use social model shaping to implicitly coordinate with other agents. Experiments demonstrate that TREMOR can provide solutions orders of magnitude faster than existing algorithms while achieving comparable, or even superior, solution quality.
A Decision-Theoretic Approach to Dynamic Sensor Selection in Camera Networks
Spaan, Matthijs T. J. (Instituto Superior Técnico) | Lima, Pedro U. (Instituto Superior Técnico)
Many urban areas are now equipped with networks of surveillance cameras, which can be used for automatic localization and tracking of people. However, given the large resource demands of imaging sensors in terms of bandwidth and computing power, processing the image streams of all cameras simultaneously might not be feasible. In this paper, we consider the problem of dynamic sensor selection based on user-defined objectives, such as maximizing coverage or reducing localization uncertainty. We propose a decision-theoretic approach modeled as a POMDP, which selects k sensors to consider in the next time frame, incorporating all observations made in the past. We show how, by changing the POMDP's reward function, we can change the system's behavior in a straightforward manner, fulfilling the user's chosen objective. We successfully apply our techniques to a network of 10 cameras.
Navigation Planning in Probabilistic Roadmaps with Uncertainty
Kneebone, Michael (University of Birmingham) | Dearden, Richard (University of Birmingham)
Probabilistic Roadmaps (PRMs) are a commonly used class of algorithms for robot navigation tasks where obstacles are present in the environment. We examine the situation where the obstacle positions are not precisely known. A subset of the edges in the PRM graph may intersect the obstacles, and as the robot traverses the graph it can make noisy observations of these uncertain edges to determine whether it can traverse them. The problem is to traverse the graph from an initial vertex to a goal without taking a blocked edge, and to do this optimally the robot needs to consider the observations it can make as well as the structure of the graph. In this paper we show how this problem can be represented as a POMDP. We show that while the POMDP is too large to be solved with exact methods, approximate point-based methods can provide a good-quality solution. While feasible for smaller examples, this approach does not scale. By exploiting the structure in the belief space, we can construct an approximate belief-space MDP that can be solved efficiently using recent techniques in MDP planning. We then demonstrate that this gives near-optimal results in most cases while achieving an order of magnitude speed-up in policy generation time.
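As a rough illustration of how noisy observations of an uncertain edge could feed a belief state, the sketch below applies Bayes' rule to the probability that a single edge is blocked. It is not the authors' model: the function name `update_blocked_belief`, the sensor parameters `p_hit` and `p_false_alarm`, and the assumption that successive readings are conditionally independent are all hypothetical choices made for the example.

```python
def update_blocked_belief(prior, observed_blocked, p_hit=0.8, p_false_alarm=0.2):
    """Posterior probability that an edge is blocked after one noisy reading.

    prior            -- current belief that the edge is blocked
    observed_blocked -- True if the sensor reported "blocked"
    p_hit            -- assumed P(report blocked | edge is blocked)
    p_false_alarm    -- assumed P(report blocked | edge is free)
    """
    if observed_blocked:
        like_blocked, like_free = p_hit, p_false_alarm
    else:
        like_blocked, like_free = 1.0 - p_hit, 1.0 - p_false_alarm
    # Normalising constant: total probability of this reading.
    evidence = prior * like_blocked + (1.0 - prior) * like_free
    return prior * like_blocked / evidence


# Two consecutive "blocked" readings, starting from an uninformed prior.
belief = 0.5
for reading in (True, True):
    belief = update_blocked_belief(belief, reading)
```

In a full formulation the belief would cover every uncertain edge jointly with the robot's position; tracking one edge in isolation only shows the update step.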
Minimal Sufficient Explanations for Factored Markov Decision Processes
Khan, Omar Zia (University of Waterloo) | Poupart, Pascal (University of Waterloo) | Black, James P. (University of Waterloo)
Explaining policies of Markov Decision Processes (MDPs) is complicated due to their probabilistic and sequential nature. We present a technique to explain policies for factored MDPs by populating a set of domain-independent templates. We also present a mechanism to determine a minimal set of templates that, viewed together, completely justify the policy. Our explanations can be generated automatically at run-time with no additional effort required from the MDP designer. We demonstrate our technique using the problems of advising undergraduate students in their course selection and assisting people with dementia in completing the task of handwashing. We also evaluate our explanations for course-advising through a user study involving students.
Focused Topological Value Iteration
Dai, Peng (University of Washington) | Mausam (University of Washington) | Weld, Daniel S. (University of Washington)
Topological value iteration (TVI) is an effective algorithm for solving Markov decision processes (MDPs) optimally, which (1) divides an MDP into strongly-connected components, and (2) solves these components sequentially. Yet, TVI's usefulness tends to degrade if an MDP has large components, because the cost of the division process isn't offset by gains during solution. This paper presents a new algorithm to solve MDPs optimally, focused topological value iteration (FTVI). FTVI addresses TVI's limitations by restricting its attention to connected components that are relevant for solving the MDP. Specifically, FTVI uses a small amount of heuristic search to eliminate provably sub-optimal actions; this pruning allows FTVI to find smaller connected components, thus running faster. We demonstrate that our new algorithm outperforms TVI by an order of magnitude, averaged across several domains. Surprisingly, FTVI also significantly outperforms popular "heuristically-informed" MDP algorithms such as LAO*, LRTDP, and BRTDP in many domains, sometimes by as much as two orders of magnitude. Finally, we characterize the type of domains where FTVI excels, which suggests a way to make an informed choice of solver.
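The two TVI steps named in the abstract (split the MDP into strongly-connected components, then solve them in sequence) can be sketched as follows. This is a minimal illustration of plain TVI for a cost-minimising goal MDP, not FTVI and not the authors' implementation. The dictionary layout of `mdp`, the function names, and the tiny `example_mdp` are assumptions made for the example.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. Components are returned successors-first,
    i.e. in reverse topological order of the component graph."""
    index, low, on_stack, stack, result = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            result.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return result


def topological_value_iteration(mdp, goals, eps=1e-6):
    """mdp maps state -> {action: (cost, [(prob, next_state), ...])}.
    Goal states keep value 0; every next_state must be a key of mdp."""
    graph = {s: {t for _, outcomes in mdp[s].values() for _, t in outcomes}
             for s in mdp}
    values = {s: 0.0 for s in mdp}
    # Successor components are already solved when a component is reached.
    for comp in strongly_connected_components(graph):
        while True:
            residual = 0.0
            for s in comp:
                if s in goals or not mdp[s]:
                    continue
                best = min(cost + sum(p * values[t] for p, t in outcomes)
                           for cost, outcomes in mdp[s].values())
                residual = max(residual, abs(best - values[s]))
                values[s] = best
            if residual < eps:
                break
    return values


# Hypothetical MDP: s1 and s2 form a cycle, so they share one component.
example_mdp = {
    "s0": {"a": (1.0, [(1.0, "s1")])},
    "s1": {"go": (1.0, [(0.5, "g"), (0.5, "s1")]),
           "back": (1.0, [(1.0, "s2")])},
    "s2": {"x": (1.0, [(1.0, "s1")])},
    "g": {},
}
values = topological_value_iteration(example_mdp, goals={"g"})
```

Because each component is iterated only after everything reachable from it has converged, states outside the current component are never backed up again. The recursive component search is adequate for small examples; a large MDP would need an iterative version.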
A Human-Aware Robot Task Planner
Cirillo, Marcello (Örebro University) | Karlsson, Lars (Örebro University) | Saffiotti, Alessandro (Örebro University)
The growing presence of household robots in inhabited environments gives rise to the need for new robot task planning techniques. These techniques should take into consideration not only the actions that the robot can perform or unexpected external events, but also the actions performed by a human sharing the same environment, in order to improve the cohabitation of the two agents, e.g., by avoiding undesired situations for the human. In this paper, we present a human-aware planner able to address this problem. This planner supports alternative hypotheses of the human plan, temporal duration for the actions of both the robot and the human, constraints on the interaction between robot and human, partial goal achievement and, most importantly, the possibility to use observations of human actions in the policy generated for the robot. The planner has been tested as a standalone component and in conjunction with our framework for human-robot interaction in a real environment.