Efficient Statistical Methods for Evaluating Trading Agent Performance

AAAI Conferences

Market simulations, like their real-world counterparts, are typically domains of high complexity, high variability, and incomplete information. The performance of autonomous agents in these markets depends both on the strategies of their opponents and on market conditions such as supply and demand. Because the space of possible strategies and market conditions is very large, empirical analysis in these domains is exceedingly difficult. Researchers who wish to evaluate their agents must run many test games across multiple opponent sets and market conditions to verify that agent performance has actually improved. Our approach is to improve the statistical power of market simulation experiments by controlling their complexity, thereby creating an environment more conducive to structured agent testing and analysis. We develop a tool that controls variability across games in one such market environment, the Trading Agent Competition for Supply Chain Management (TAC SCM), and demonstrate how it provides an efficient, systematic method for TAC SCM researchers to analyze agent performance.
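The idea of controlling variability across games can be illustrated with a paired design, in which both agent versions face identical market conditions. The sketch below is a toy model, not the TAC SCM tool itself: the profit model, the function names `simulate_profit` and `difference_samples`, and all numeric parameters are assumptions chosen only for illustration.

```python
import random
import statistics

def simulate_profit(agent_bonus, market_seed, noise_rng):
    """Toy game (hypothetical model): profit is a market-condition effect
    fixed by market_seed, plus the agent's own effect, plus residual noise."""
    market_effect = random.Random(market_seed).gauss(0.0, 10.0)
    return market_effect + agent_bonus + noise_rng.gauss(0.0, 1.0)

def difference_samples(n_games, paired, seed=0):
    """Profit differences (new agent minus old agent) over n_games.
    If paired, both agents play under the same market seed."""
    noise_rng = random.Random(seed)
    seed_rng = random.Random(seed + 1)
    diffs = []
    for _ in range(n_games):
        seed_old = seed_rng.randrange(10**9)
        seed_new = seed_old if paired else seed_rng.randrange(10**9)
        old = simulate_profit(0.0, seed_old, noise_rng)  # baseline agent
        new = simulate_profit(1.0, seed_new, noise_rng)  # assumed +1.0 true improvement
        diffs.append(new - old)
    return diffs

paired_diffs = difference_samples(200, paired=True)
unpaired_diffs = difference_samples(200, paired=False)
paired_sd = statistics.stdev(paired_diffs)
unpaired_sd = statistics.stdev(unpaired_diffs)
print(f"paired sd: {paired_sd:.2f}, unpaired sd: {unpaired_sd:.2f}")
```

In this toy setting the market effect cancels in the paired differences, so the same number of games yields a much tighter estimate of the improvement.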


The Role of Prompting and Feedback in Facilitating Students’ Learning about Science with MetaTutor

AAAI Conferences

An experiment was conducted to test the efficacy of a new intelligent hypermedia system, MetaTutor, which is intended to prompt and scaffold the use of self-regulated learning (SRL) processes during learning about a human body system. Sixty-eight (N=68) undergraduate students learned about the human circulatory system under one of three conditions: prompt and feedback (PF), prompt-only (PO), or control (C). The PF condition received timely prompts from animated pedagogical agents to engage in planning processes, monitoring processes, and learning strategies, and also received immediate directive feedback from the agents concerning the deployment of those processes. The PO condition received the same timely prompts but did not receive any feedback following the deployment of the processes. Finally, the control condition learned without any assistance from the agents during the learning session. All participants had two hours to learn using a 41-page hypermedia environment that included text describing, and static diagrams depicting, various topics concerning the human circulatory system. Results indicate that the PF condition had significantly higher learning efficiency scores than the control condition. There were no significant differences between the PF and PO conditions. These results are discussed in the context of the development of a fully adaptive hypermedia learning system intended to scaffold self-regulated learning.


Learning Probabilistic Models of Word Sense Disambiguation

arXiv.org Artificial Intelligence

This dissertation presents several new methods for supervised and unsupervised learning of word sense disambiguation models. The supervised methods focus on performing model searches through a space of probabilistic models, and the unsupervised methods rely on the use of Gibbs Sampling and the Expectation Maximization (EM) algorithm. In both the supervised and unsupervised cases, the Naive Bayesian model is found to perform well. An explanation for this success is presented in terms of learning rates and bias-variance decompositions.
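For readers unfamiliar with the Naive Bayesian model in this setting, the following is a minimal supervised sketch using add-one smoothing over bag-of-words contexts. It is not the dissertation's implementation; the training examples, sense labels, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (context_words, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for context_words, sense in examples:
        sense_counts[sense] += 1
        for word in context_words:
            word_counts[sense][word] += 1
            vocab.add(word)
    return sense_counts, word_counts, vocab

def predict_sense(model, context_words):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[sense].values()) + len(vocab)  # add-one smoothing
        for word in context_words:
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Hypothetical toy training data for the ambiguous word "bank".
training = [
    (["river", "water", "shore"], "bank/river"),
    (["money", "loan", "deposit"], "bank/finance"),
    (["water", "fishing"], "bank/river"),
    (["loan", "interest"], "bank/finance"),
]
model = train_naive_bayes(training)
river_guess = predict_sense(model, ["fishing", "shore"])
finance_guess = predict_sense(model, ["loan"])
```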


Evaluating the Stability of Non-Adaptive Trading in Continuous Double Auctions: A Reinforcement Learning Approach

AAAI Conferences

The continuous double auction (CDA) is the predominant mechanism in modern securities markets. Despite much prior study of CDA strategies, fundamental questions about the CDA remain open, such as: (1) to what extent can outcomes in a CDA be accurately modeled by optimizing agent actions over only a simple, non-adaptive policy class; and (2) when and how can a policy that conditions its actions on market state deviate beneficially from an optimally parameterized, but simpler, policy like Zero Intelligence (ZI). To investigate these questions, we present an experimental comparison of the strategic stability of policies found by reinforcement learning (RL) over a massive space, or through empirical Nash-equilibrium solving over a smaller space of non-adaptive, ZI policies. Our findings indicate that in a plausible market environment, an adaptive trading policy can deviate beneficially from an equilibrium of ZI traders, by conditioning on signals of the likelihood a trade will execute or the favorability of the current bid and ask. Nevertheless, the surplus earned by well-calibrated ZI policies is empirically observed to be nearly as great as what a deviating reinforcement learner could earn, using a much larger policy space. This finding supports the idea that it is reasonable to use equilibrated ZI traders in studies of CDA market outcomes.
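As a point of reference, a non-adaptive ZI trader can be sketched as an agent that ignores market state and shades its private valuation by a randomly drawn surplus. The parameterization below (a uniform draw between `r_min` and `r_max`) and the function name `zi_limit_price` are assumptions for illustration, not necessarily the exact policy class studied in the paper.

```python
import random

def zi_limit_price(valuation, is_buyer, r_min, r_max, rng):
    """Limit price for a ZI trader: demand a surplus drawn uniformly
    from [r_min, r_max], independent of the current market state."""
    surplus = rng.uniform(r_min, r_max)
    # Buyers bid below their valuation; sellers ask above it.
    return valuation - surplus if is_buyer else valuation + surplus

rng = random.Random(42)
bids = [zi_limit_price(100.0, True, 0.0, 5.0, rng) for _ in range(1000)]
asks = [zi_limit_price(100.0, False, 0.0, 5.0, rng) for _ in range(1000)]
```

Tuning `r_min` and `r_max` is what makes such a policy "optimally parameterized" while remaining non-adaptive: the trader never conditions on the current bid, ask, or execution likelihood.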


Movie Recommender System for Profit Maximization

AAAI Conferences

Traditional recommender systems try to provide users with recommendations that maximize the probability that the user will accept them. Recent studies have shown that recommender systems have a positive effect on the provider’s revenue. In this paper we show that by giving a different set of recommendations, the recommendation system can further increase the business’s utility (e.g. revenue) without any significant drop in user satisfaction. Indeed, the recommendation system designer should have in mind both the user, whose taste we need to reveal, and the business, which wants to promote specific content. To study these questions, we ran a large set of experiments on Amazon Mechanical Turk. In each experiment, we compare a commercial state-of-the-art recommendation engine with a modified recommendation list that takes into account the utility (or revenue) the business obtains from each suggestion accepted by the user. We show that the modified recommendation list is more desirable for the business, as the end result gives the business a higher utility (or revenue). To study possible long-term effects of giving the user worse suggestions, we asked the users how they perceived the list of recommendations they received. We find that any difference in user satisfaction between the lists is negligible and not statistically significant. We also uncover a phenomenon where movie consumers prefer watching, and even paying for, movies that they have already seen over movies that are new to them.
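One simple way to take business utility into account is to rank candidates by expected revenue, that is, the estimated acceptance probability multiplied by the revenue earned if the item is accepted. This scoring rule is an assumption for illustration and may differ from the paper's method; the field names and the movie data below are hypothetical.

```python
def rerank_by_expected_revenue(candidates, k):
    """Return the top-k candidates ordered by p_accept * revenue."""
    return sorted(
        candidates,
        key=lambda item: item["p_accept"] * item["revenue"],
        reverse=True,
    )[:k]

# Hypothetical candidates: acceptance probability from the base recommender,
# revenue earned by the business if the user accepts the suggestion.
candidates = [
    {"title": "Movie A", "p_accept": 0.90, "revenue": 1.0},  # expected 0.90
    {"title": "Movie B", "p_accept": 0.60, "revenue": 3.0},  # expected 1.80
    {"title": "Movie C", "p_accept": 0.30, "revenue": 4.0},  # expected 1.20
]

baseline = sorted(candidates, key=lambda item: item["p_accept"], reverse=True)[:2]
modified = rerank_by_expected_revenue(candidates, 2)
baseline_titles = [item["title"] for item in baseline]
modified_titles = [item["title"] for item in modified]
```

Here the baseline list favors the items most likely to be accepted (A, then B), while the modified list favors the items with the highest expected revenue (B, then C).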