In this report, we derive a series expansion with non-negative terms for the Jensen-Shannon divergence (JSD) between two probability distributions. We show that this expansion is useful for computing the JSD numerically when the two distributions are nearly equal, a regime in which small numerical errors would otherwise dominate the evaluation.
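The abstract does not reproduce the report's series, so the following is only a sketch of the general idea, using one standard expansion as an assumption. Writing m_i = (p_i + q_i)/2 and d_i = (p_i - q_i)/(p_i + q_i), the JSD in nats equals (1/2) * sum_i m_i * sum_{k>=1} d_i^(2k) / (k(2k-1)), and every term is non-negative. The function names `jsd_direct` and `jsd_series` and the `terms` parameter are hypothetical.

```python
import math


def jsd_direct(p, q):
    """Textbook JSD in nats: 0.5*KL(p||m) + 0.5*KL(q||m), with m = (p+q)/2."""
    total = 0.0
    for pi, qi in zip(p, q):
        mi = 0.5 * (pi + qi)
        if pi > 0.0:
            total += 0.5 * pi * math.log(pi / mi)
        if qi > 0.0:
            total += 0.5 * qi * math.log(qi / mi)
    return total


def jsd_series(p, q, terms=30):
    """JSD in nats via a series whose terms are all non-negative.

    Assumed expansion (not necessarily the one derived in the report):
    (1+d)ln(1+d) + (1-d)ln(1-d) = sum_{k>=1} d^(2k) / (k(2k-1)).
    Truncating after `terms` terms is accurate when |d| is small, i.e.
    when the two distributions are close.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        s = pi + qi
        if s == 0.0:
            continue  # outcome has zero mass under both distributions
        d2 = ((pi - qi) / s) ** 2
        power = 1.0
        inner = 0.0
        for k in range(1, terms + 1):
            power *= d2
            inner += power / (k * (2 * k - 1))
        total += 0.5 * (0.5 * s) * inner  # 0.5 * m_i * inner sum
    return total
```

Because no term is negative, the truncated sum cannot go below zero, whereas the direct formula adds positive and negative logarithmic terms that nearly cancel when p and q are close.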
Commentary on Baum's "How a Bayesian..?" Stuart Russell, Computer Science Division, University of California, Berkeley, CA 94720. Summary of the Paper: The paper divides the problem of game playing into two parts: growing a search tree and evaluating the possible moves on that basis. The evaluation process is based in part on the idea that leaf node evaluations should be probability distributions rather than point values, and should [...] I. J. Good, for example, suggested that a computation is [...] This rules out computations that might reveal one's plan to be a blunder--OK for politicians, but not for game-playing programs. Part of the difficulty lies in the formulation. P(A|B) should be independent of the form of B--i.e., any logically equivalent expression should be treated the same way-- [...] forms.
We describe a preliminary investigation into learning a Chess player's style from game records. The method is based on attempting to learn features of a player's individual evaluation function using the method of temporal differences, with the aid of a conventional Chess engine architecture. Some encouraging results were obtained in learning the styles of two recent Chess world champions, and we report on our attempt to use the learnt styles to discriminate between the players from game records by trying to detect who was playing white and who was playing black. We also discuss some limitations of our approach and propose possible directions for future research. The method we have presented may also be applicable to other strategic games, and may even be generalisable to other domains where sequences of agents' actions are recorded.
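The abstract does not state the update rule that was used, so the following is only a generic sketch of temporal-difference learning, TD(lambda), for a linear evaluation function. The feature vectors, the step size `alpha`, the trace decay `lam`, and the function names are all hypothetical, and a real system would extract features from positions reached in a player's recorded games.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def td_lambda_update(weights, feature_seq, final_outcome, alpha=0.01, lam=0.7):
    """One online TD(lambda) pass over a single game.

    weights       -- current weights of the linear evaluation w . phi
    feature_seq   -- feature vectors phi_t of the successive positions
    final_outcome -- game result used as the target for the last position
    Returns the updated weights as a new list.
    """
    w = list(weights)
    trace = [0.0] * len(w)  # eligibility trace, no discounting
    for t, phi in enumerate(feature_seq):
        value = dot(w, phi)
        if t + 1 < len(feature_seq):
            target = dot(w, feature_seq[t + 1])  # bootstrap from next position
        else:
            target = final_outcome
        delta = target - value  # temporal-difference error
        trace = [lam * e + x for e, x in zip(trace, phi)]
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w
```

Repeating the pass over many games moves each position's evaluation toward that of its successor, and the last position's evaluation toward the recorded result.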
Now the whole point of search (as opposed to just picking whichever child looks best to an evaluation function) is to insulate oneself from errors in the evaluation function. When one searches below a node, one gains more information and one's opinion of the value of that node may change. Such "opinion changes" are inherently probabilistic. They occur because one's information or computational abilities are unable to distinguish different states, e.g., a node with a given set of features might have different values. In this paper we adopt a probabilistic model of opinion changes, de- [...] (Footnote 1: This is a super-abbreviated discussion of [Baum and Smith, 1993] written by EBB for this conference.)
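To illustrate why a distribution over a node's value carries more information than a point value, here is a minimal sketch that combines the value distributions of a max node's children. It assumes the children are independent and discrete, which is an assumption of this example and not necessarily the model adopted in the paper. The names `max_distribution` and `expected_value` are hypothetical.

```python
def max_distribution(children):
    """Distribution of the maximum of independent discrete random variables.

    children -- list of dicts mapping value -> probability, one per child
    Uses P(max <= v) = product over children of P(child <= v).
    """
    values = sorted({v for dist in children for v in dist})
    result = {}
    prev_cdf = 0.0
    for v in values:
        cdf = 1.0
        for dist in children:
            cdf *= sum(p for x, p in dist.items() if x <= v)
        mass = cdf - prev_cdf
        if mass > 0.0:
            result[v] = mass
        prev_cdf = cdf
    return result


def expected_value(dist):
    """Mean of a discrete distribution given as value -> probability."""
    return sum(v * p for v, p in dist.items())
```

For two children that are each equally likely to be worth 0 or 1, both point estimates are 0.5, yet under the independence assumption the maximum is worth 1 with probability 0.75. A backup based on point values alone cannot represent that difference.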
We present a method for visualizing and adjusting evaluation functions in game programming. It is widely recognized that an evaluation function should assign a higher evaluation value to a position with a greater probability of a win. However, this relation has not been used directly to tune evaluation functions because of the difficulty of measuring the probability of a win in deterministic games. To exploit this relation, we use the win percentage among positions that share the same evaluation value as an estimate of win probability, drawing the positions from a large database of game records. We introduce an evaluation curve, formed by plotting win probabilities against evaluation values, that enables evaluation functions to be visualized. We observed that evaluation curves form a sigmoid in various kinds of games and that these curves may split depending on the properties of positions. Because such a split indicates that the visualized evaluation function misestimates positions with a lower probability of winning, the function can be improved by fitting the split curves to a single curve. Our experiments with Chess and Shogi revealed that deficiencies in evaluation functions could be successfully visualized, and that the improvements obtained by automatically adjusting their weights were confirmed in self-play.
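The evaluation curve described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the record format of (evaluation value, won) pairs, the `bin_width` parameter, and the function names are hypothetical, and nearby evaluation values are grouped into bins so that each bin contains enough positions.

```python
import math
from collections import defaultdict


def evaluation_curve(records, bin_width=100):
    """Win percentage as a function of evaluation value.

    records   -- iterable of (evaluation_value, won) pairs, one per position
    bin_width -- width of the evaluation-value bins (hypothetical parameter)
    Returns a dict mapping bin center -> fraction of positions that were won.
    """
    wins = defaultdict(int)
    counts = defaultdict(int)
    for value, won in records:
        b = math.floor(value / bin_width)
        counts[b] += 1
        if won:
            wins[b] += 1
    return {(b + 0.5) * bin_width: wins[b] / counts[b] for b in sorted(counts)}


def sigmoid(value, scale):
    """Logistic curve, the shape the abstract reports for evaluation curves."""
    return 1.0 / (1.0 + math.exp(-value / scale))
```

Computing one curve per group of positions, grouped by some property, and comparing the resulting curves is one way to look for the splits the abstract mentions.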