We describe a preliminary investigation into learning a Chess player's style from game records. The method attempts to learn features of a player's individual evaluation function using temporal-difference learning, with the aid of a conventional Chess engine architecture. Some encouraging results were obtained in learning the styles of two recent Chess world champions, and we report on our attempt to use the learnt styles to discriminate between the players from game records by trying to detect who was playing white and who was playing black. We also discuss some limitations of our approach and propose possible directions for future research. The method presented may also be applicable to other strategic games, and may even be generalisable to other domains where sequences of agents' actions are recorded.
Psychological evidence indicates that human chess players base their assessments of chess positions on structural/perceptual patterns learned through experience. The learning mechanism used by Morph combines weight-updating, genetic, explanation-based and temporal-difference learning to create, delete, generalize and evaluate chess positions. An associative pattern retrieval system organizes the database for efficient processing. The main objectives of the project are to demonstrate the capacity of the system to learn, to deepen our understanding of the interaction of knowledge and search, and to build bridges in this area between AI and cognitive science. To strengthen connections with the cognitive literature, limitations have been placed on the system, such as restrictions to 1-ply search, to little domain knowledge, and to no supervised training. Although search is apparently effective at discovering tactical issues, isn't it dull to "forget" them immediately after use instead of "learning" something for later reuse?
In this paper we present TDLeaf(lambda), a variation on the TD(lambda) algorithm that enables it to be used in conjunction with minimax search. We present some experiments in both chess and backgammon which demonstrate its utility and provide comparisons with TD(lambda) and another less radical variant, TD-directed(lambda). In particular, our chess program, "KnightCap", used TDLeaf(lambda) to learn its evaluation function while playing on the Free Internet Chess Server (FICS, fics.onenet.net). It improved from a 1650 rating to a 2100 rating in just 308 games. We discuss some of the reasons for this success and the relationship between our results and Tesauro's results in backgammon.
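The idea of combining TD(lambda) with minimax search can be illustrated with a minimal sketch. It is not KnightCap's implementation. It assumes a linear evaluation function squashed by tanh, assumes that the feature vectors of the principal-variation leaves have already been produced by a search that is not shown, and uses the game outcome as the target after the last leaf. The names `evaluate`, `tdleaf_update`, `alpha`, `lam` and `beta` are hypothetical and chosen only for this illustration.

```python
import math


def evaluate(weights, features, beta=1.0):
    """Squashed linear evaluation in (-1, 1). The linear form and beta are assumptions."""
    return math.tanh(beta * sum(w * f for w, f in zip(weights, features)))


def tdleaf_update(weights, leaf_features, outcome, alpha=0.1, lam=0.7, beta=1.0):
    """Return updated weights after one game (a sketch, not a reference implementation).

    leaf_features[t] is the feature vector of the principal-variation leaf
    reached by minimax search from the t-th position of the game.
    outcome is the final result in [-1, 1] from the learner's point of view.
    """
    values = [evaluate(weights, f, beta) for f in leaf_features]
    # The target for each leaf value is the next leaf value; the last target
    # is the game outcome (an assumption of this sketch).
    targets = values[1:] + [outcome]
    diffs = [nxt - cur for cur, nxt in zip(values, targets)]

    new_weights = list(weights)
    discounted = 0.0
    # Walk backwards so that discounted == sum over j >= t of lam**(j - t) * diffs[j].
    for t in range(len(values) - 1, -1, -1):
        discounted = diffs[t] + lam * discounted
        # Derivative of tanh(beta * s) with respect to s, evaluated at leaf t.
        grad_scale = beta * (1.0 - values[t] ** 2)
        for i, f in enumerate(leaf_features[t]):
            new_weights[i] += alpha * grad_scale * f * discounted
    return new_weights


if __name__ == "__main__":
    # Two positions, two features, learner wins (outcome = 1).
    print(tdleaf_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], outcome=1.0, lam=0.5))
```

The only difference from a plain TD(lambda) update in this sketch is where the features come from: they describe the leaf of the principal variation rather than the root position, so the weights are adjusted toward the evaluations that actually determined the search result.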