Market simulations, like their real-world counterparts, are typically domains of high complexity, high variability, and incomplete information. The performance of autonomous agents in these markets depends both on the strategies of their opponents and on various market conditions, such as supply and demand. Because the space of possible strategies and market conditions is very large, empirical analysis in these domains becomes exceedingly difficult. Researchers who wish to evaluate their agents must run many test games across multiple opponent sets and market conditions to verify that agent performance has actually improved. Our approach is to improve the statistical power of market simulation experiments by controlling their complexity, thereby creating an environment more conducive to structured agent testing and analysis. We develop a tool that controls variability across games in one such market environment, the Trading Agent Competition for Supply Chain Management (TAC SCM), and demonstrate how it provides an efficient, systematic method for TAC SCM researchers to analyze agent performance.
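As a rough illustration of why controlling variability across games can raise statistical power, the toy sketch below compares two hypothetical agent versions either under shared market conditions (paired games) or under independently drawn ones. It is not the TAC SCM tool described above: the profit model, the function names, and all numeric values are assumptions made purely for illustration.

```python
import random
import statistics


def simulate_profit(agent_skill, market_seed, noise_rng):
    """Toy profit model (an assumption): skill + market effect + small noise."""
    # The market effect depends only on the market seed, so two games that
    # share a seed face identical market conditions.
    market_rng = random.Random(market_seed)
    market_effect = market_rng.gauss(0.0, 10.0)
    return agent_skill + market_effect + noise_rng.gauss(0.0, 1.0)


def difference_samples(n_games, paired, seed=0):
    """Profit differences (new agent minus old agent) over n_games comparisons."""
    noise_rng = random.Random(seed)
    seed_rng = random.Random(seed + 1)
    diffs = []
    for _ in range(n_games):
        seed_new = seed_rng.randrange(10**9)
        # Paired design: the old agent replays the same market conditions.
        seed_old = seed_new if paired else seed_rng.randrange(10**9)
        new_profit = simulate_profit(1.0, seed_new, noise_rng)
        old_profit = simulate_profit(0.0, seed_old, noise_rng)
        diffs.append(new_profit - old_profit)
    return diffs


paired_sd = statistics.stdev(difference_samples(200, paired=True))
unpaired_sd = statistics.stdev(difference_samples(200, paired=False))
print(f"paired sd: {paired_sd:.2f}, unpaired sd: {unpaired_sd:.2f}")
```

In this toy model the market effect cancels in the paired differences, so far fewer games are needed to detect the same improvement in agent skill.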
Azevedo, Roger (McGill University) | Johnson, Amy (University of Memphis) | Burkett, Candice (University of Memphis) | Chauncey, Amber (University of Memphis) | Lintean, Mihai (University of Memphis) | Cai, Zhiqiang (University of Memphis) | Rus, Vasile (University of Memphis)
An experiment was conducted to test the efficacy of a new intelligent hypermedia system, MetaTutor, which is intended to prompt and scaffold the use of self-regulated learning (SRL) processes during learning about a human body system. Sixty-eight (N=68) undergraduate students learned about the human circulatory system under one of three conditions: prompt and feedback (PF), prompt-only (PO), and control (C). The PF condition received timely prompts from animated pedagogical agents to engage in planning processes, monitoring processes, and learning strategies, and also received immediate directive feedback from the agents concerning the deployment of these processes. The PO condition received the same timely prompts but did not receive any feedback following the deployment of the processes. Finally, the control condition learned without any assistance from the agents during the learning session. All participants had two hours to learn using a 41-page hypermedia environment that included texts describing, and static diagrams depicting, various topics concerning the human circulatory system. Results indicate that the PF condition had significantly higher learning efficiency scores than the control condition. There were no significant differences between the PF and PO conditions. These results are discussed in the context of the development of a fully adaptive hypermedia learning system intended to scaffold self-regulated learning.
Recently, prediction markets have shown considerable promise for developing flexible mechanisms for machine learning. In this paper, agents with isoelastic utilities are considered. It is shown that the costs associated with homogeneous markets of agents with isoelastic utilities produce equilibrium prices corresponding to alpha-mixtures, with a particular form of mixing component relating to each agent's wealth. We also demonstrate that wealth accumulation for logarithmic and other isoelastic agents (through payoffs on prediction of training targets) can implement both Bayesian model updates and mixture weight updates by imposing different market payoff structures. An iterative algorithm is given for market equilibrium computation. We demonstrate that inhomogeneous markets of agents with isoelastic utilities outperform state-of-the-art aggregate classifiers such as random forests, as well as single classifiers (neural networks, decision trees), on a number of machine learning benchmarks, and show that isoelastic combination methods are generally better than their logarithmic counterparts.
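The logarithmic special case can be sketched in a few lines. The code below assumes a market of log-utility agents trading one security per outcome, each paying one unit if that outcome occurs; under that assumption the clearing price is the wealth-weighted average of the agents' beliefs, and settling the market rescales each agent's wealth exactly as a Bayesian update of mixture weights would. The function names are illustrative, and the general isoelastic case and the paper's iterative equilibrium algorithm are not reproduced here.

```python
def equilibrium_price(wealth, beliefs):
    """Wealth-weighted mixture of beliefs (log-utility agents, assumed setup).

    wealth:  list of positive agent wealths
    beliefs: list of per-agent probability vectors over the same outcomes
    """
    total = sum(wealth)
    n_outcomes = len(beliefs[0])
    return [
        sum(w * b[k] for w, b in zip(wealth, beliefs)) / total
        for k in range(n_outcomes)
    ]


def settle(wealth, beliefs, outcome):
    """Wealth after the outcome is revealed.

    A log-utility agent spends the fraction b[k] of its wealth on outcome k,
    buying w * b[k] / price[k] units, each paying 1 if k occurs. The new
    wealth is therefore proportional to prior weight times likelihood.
    """
    price = equilibrium_price(wealth, beliefs)
    return [w * b[outcome] / price[outcome] for w, b in zip(wealth, beliefs)]


wealth = [1.0, 1.0]
beliefs = [[0.8, 0.2], [0.4, 0.6]]
print(equilibrium_price(wealth, beliefs))
print(settle(wealth, beliefs, outcome=0))
```

Total wealth is conserved by settlement, so the normalized wealths can be read directly as posterior mixture weights over the agents.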
This dissertation presents several new methods of supervised and unsupervised learning of word sense disambiguation models. The supervised methods focus on performing model searches through a space of probabilistic models, and the unsupervised methods rely on the use of Gibbs Sampling and the Expectation Maximization (EM) algorithm. In both the supervised and unsupervised cases, the Naive Bayesian model is found to perform well. An explanation for this success is presented in terms of learning rates and bias-variance decompositions.
The continuous double auction (CDA) is the predominant mechanism in modern securities markets. Despite much prior study of CDA strategies, fundamental questions about the CDA remain open, such as: (1) to what extent can outcomes in a CDA be accurately modeled by optimizing agent actions over only a simple, non-adaptive policy class; and (2) when and how can a policy that conditions its actions on market state deviate beneficially from an optimally parameterized, but simpler, policy like Zero Intelligence (ZI). To investigate these questions, we present an experimental comparison of the strategic stability of policies found by reinforcement learning (RL) over a massive space, or through empirical Nash-equilibrium solving over a smaller space of non-adaptive, ZI policies. Our findings indicate that in a plausible market environment, an adaptive trading policy can deviate beneficially from an equilibrium of ZI traders, by conditioning on signals of the likelihood a trade will execute or the favorability of the current bid and ask. Nevertheless, the surplus earned by well-calibrated ZI policies is empirically observed to be nearly as great as what a deviating reinforcement learner could earn, using a much larger policy space. This finding supports the idea that it is reasonable to use equilibrated ZI traders in studies of CDA market outcomes.
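For readers unfamiliar with ZI traders, the sketch below shows one common parameterization of a non-adaptive ZI policy: the trader shades its private valuation by an amount drawn uniformly from a fixed range, ignoring market state entirely. The abstract does not specify the parameterization used in the study, so the shading range and the names `zi_limit_price`, `shade_min`, and `shade_max` are assumptions for illustration.

```python
import random


def zi_limit_price(valuation, side, shade_min, shade_max, rng):
    """Limit price for a non-adaptive Zero Intelligence trader (assumed form).

    The trader demands a surplus drawn uniformly from [shade_min, shade_max]:
    buyers bid below their valuation, sellers ask above it. Nothing about the
    order book or trade history is consulted.
    """
    shade = rng.uniform(shade_min, shade_max)
    if side == "buy":
        return valuation - shade
    if side == "sell":
        return valuation + shade
    raise ValueError("side must be 'buy' or 'sell'")


rng = random.Random(0)
bid = zi_limit_price(100.0, "buy", 1.0, 5.0, rng)
ask = zi_limit_price(100.0, "sell", 1.0, 5.0, rng)
print(f"bid: {bid:.2f}, ask: {ask:.2f}")
```

In this form the only strategic parameters are the two shading bounds, which is what makes a search over a space of such policies far smaller than the policy space available to a reinforcement learner that conditions on market state.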