Goto

Collaborating Authors

 Education


Sample Efficient On-Line Learning of Optimal Dialogue Policies with Kalman Temporal Differences

AAAI Conferences

Designing dialog policies for voice-enabled interfaces is a tailoring job that is most often left to natural language processing experts. This job is generally redone for every new dialog task because cross-domain transfer is not possible. For this reason, machine learning methods for dialog policy optimization have been investigated during the last 15 years. Especially, reinforcement learning (RL) is now part of the state of the art in this domain. Standard RL methods require to test more or less random changes in the policy on users to assess them as improvements or degradations. This is called on policy learning. Nevertheless, it can result in system behaviors that are not acceptable by users. Learning algorithms should ideally infer an optimal strategy by observing interactions generated by a non-optimal but acceptable strategy, that is learning off-policy. In this contribution, a sample-efficient, online and off-policy reinforcement learning algorithm is proposed to learn an optimal policy from few hundreds of dialogues generated with a very simple handcrafted policy.


Learning from Natural Instructions

AAAI Conferences

Machine learning is traditionally formalized and researched as the study of learning concepts and decision functions from labeled examples, requiring a representation that encodes information about the domain of the decision function to be learned. We are interested in providing a way for a human teacher to interact with an automated learner using natural instructions, thus allowing the teacher to communicate the relevant domain expertise to the learner without necessarily knowing anything about the internal representations used in the learning process. In this paper we suggest to view the process of learning a decision function as a natural language lesson interpretation problem instead of learning from labeled examples. This interpretation of machine learning is motivated by human learning processes, in which the learner is given a lesson describing the target concept directly, and a few instances exemplifying it. We introduce a learning algorithm for the lesson interpretation problem that gets feedback from its performance on the final task, while learning jointly (1) how to interpret the lesson and (2) how to use this interpretation to do well on the final task. his approach alleviates the supervision burden of traditional machine learning by focusing on supplying the learner with only human-level task expertise for learning. We evaluate our approach by applying it to the rules of the Freecell solitaire card game. We show that our learning approach can eventually use natural language instructions to learn the target concept and play the game legally. Furthermore, we show that the learned semantic interpreter also generalizes to previously unseen instructions.


Feature Learning for Activity Recognition in Ubiquitous Computing

AAAI Conferences

Feature extraction for activity recognition in context-aware ubiquitous computing applications is usually a heuristic process, informed by underlying domain knowledge. Relying on such explicit knowledge is problematic when aiming to generalize across different application domains. We investigate the potential of recent machine learning methods for discovering universal features for context-aware applications of activity recognition. We also describe an alternative data representation based on the empirical cumulative distribution function of the raw data, which effectively abstracts from absolute values. Experiments on accelerometer data from four publicly available activity recognition datasets demonstrate the significant potential of our approach to address both contemporary activity recognition tasks and next generation problems such as skill assessment and the detection of novel activities.


A Neural-Symbolic Cognitive Agent for Online Learning and Reasoning

AAAI Conferences

In real-world applications, the effective integration of learning and reasoning in a cognitive agent model is a difficult task. However, such integration may lead to a better understanding, use and construction of more realistic models. Unfortunately, existing models are either oversimplified or require much processing time, which is unsuitable for online learning and reasoning. Currently, controlled environments like training simulators do not effectively integrate learning and reasoning. In particular, higher-order concepts and cognitive abilities have many unknown temporal relations with the data, making it impossible to represent such relationships by hand. We introduce a novel cognitive agent model and architecture for online learning and reasoning that seeks to effectively represent, learn and reason in complex training environments. The agent architecture of the model combines neural learning with symbolic knowledge representation. It is capable of learning new hypotheses from observed data, and infer new beliefs based on these hypotheses. Furthermore, it deals with uncertainty and errors in the data using a Bayesian inference model. The validation of the model on real-time simulations and the results presented here indicate the promise of the approach when performing online learning and reasoning in real-world scenarios, with possible applications in a range of areas.


Active Online Classification Via Information Maximization

AAAI Conferences

We propose an online classification approach for co-occurrence data which is based on a simple information theoretic principle. We further show how to properly estimate the uncertainty associated with each prediction of our scheme and demonstrate how to exploit these uncertainty estimates. First, in order to abstain highly uncertain predictions. And second, within an active learning framework, in order to preserve classification accuracy while substantially reducing training set size. Our method is highly efficient in terms of run-time and memory footprint requirements. Experimental results in the domain of text classification demonstrate that the classification accuracy of our method is superior or comparable to other state-of-the-art online classification algorithms.


Distribution-Aware Online Classifiers

AAAI Conferences

We propose a family of Passive-Aggressive Mahalanobis (PAM) algorithms, which are incremental (online) binary classifiers that consider the distribution of data. PAM is in fact a generalization of the Passive-Aggressive (PA) algorithms to handle data distributions that can be represented by a covariance matrix. The update equations for PAM are derived and theoretical error loss bounds computed. We benchmarked PAM against the original PA-I, PA-II, and Confidence Weighted (CW) learning. Although PAM somewhat resembles CW in its update equations, PA minimizes differences in the weights while CW minimizes differences in weight distributions. Results on 8 classification datasets, which include a real-life micro-blog sentiment classification task, show that PAM consistently outperformed its competitors, most notably CW. This shows that a simple approach like PAM is more practical in real-life classification tasks, compared to more elegant and sophisticated approaches like CW.


Repairing Incorrect Knowledge with Model Formulation and Metareasoning

AAAI Conferences

Learning concepts via instruction and expository texts is an important problem for modeling human learning and for making autonomous AI systems. This paper describes a computational model of the self-explanation effect, whereby conceptual knowledge is repaired by integrating and explaining new material. Our model represents conceptual knowledge with compositional model fragments, which are used to explain new material via model formulation. Preferences are computed over explanations and conceptual knowledge, along several dimensions. These preferences guide knowledge integration and question-answering. Our simulation learns about the human circulatory system, using facts from a circulatory system passage used in a previous cognitive psychology experiment. We analyze the simulationโ€™s performance, showing that individual differences in sequences of models learned by students can be explained by different parameter settings in our model.


Social Instruments for Robust Convention Emergence

AAAI Conferences

We present the notion of Social Instruments as mechanisms that facilitate the emergence of conventions from repeated interactions between members of a society. Specifically, we focus on two social instruments: rewiring and observation. Our main goal is to provide agents with tools that allow them to leverage their social network of interactions when effectively addressing coordination and learning problems, paying special attention to dissolving meta-stable subconventions. Initial experiments throw some light on how Self-Reinforcing Substructures (SRS) in the network prevent full convergence, resulting in reduced convergence rates. The use of an effective composed social instrument (observation + rewiring) allow agents to eliminate the subconventions that otherwise remained meta-stable.


From decision to action : intentionality, a guide for the specification of intelligent agents' behaviour

arXiv.org Artificial Intelligence

This article introduces a reflexion about behavioural specification for interactive and participative agent-based simulation in virtual reality. Within this context, it is neces sary to reach a high level of expressivness in order to enforce interactions between the designer and the behavioural model during the in-line prototyping. This requires to consider the need of semantic very early in the design process. The Intentional agent model is here exposed as a possible answer. It relies on a mixed imperative and declarative approach which focuses on the link between decision and action. The design of a tool able to simulate virtual environment implying agents based on this model is discuss


Modeling the Detection of Textual Cyberbullying

AAAI Conferences

The scourge of cyberbullying has assumed alarming proportions with an ever-increasing number of adolescents admitting to having dealt with it either as a victim or as a bystander. Anonymity and the lack of meaningful supervision in the electronic medium are two factors that have exacerbated this social menace. Comments or posts involving sensitive topics that are personal to an individual are more likely to be internalized by a victim, often resulting in tragic outcomes. We decompose the overall detection problem into detection of sensitive topics, lending itself into text classification sub-problems. We experiment with a corpus of 4500 YouTube comments, applying a range of binary and multiclass classifiers. We find that binary classifiers for individual labels outperform multiclass classifiers. Our findings show that the detection of textual cyberbullying can be tackled by building individual topic-sensitive classifiers.