Michie, D.

BOXES: An experiment in adaptive control


In brief, we believe that programs for learning large games will need to have at their disposal good rules for learning small games. Each separate box functions as a separate learning machine: it is only brought into play when the corresponding board position arises, and its sole task is to arrive at a good choice of move for that specific position. The demon's task is to make his choices in successive plays in such a way as to maximise his expected number of wins over some specified period. By a development of Laplace's Law of Succession we can determine the probability [...]. This defines the score associated with the node N. To make a move the automaton examines all the legal alternatives and chooses the move leading to the position having the highest associated score, ties being decided by a random choice.
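The move rule just described can be sketched in a few lines of Python. This is an illustration only, not the authors' program: the excerpt does not give their development of Laplace's Law of Succession, so the sketch assumes the plain rule, (wins + 1) / (plays + 2), as the score. The function names and the (move, resulting position) pair shape are hypothetical.

```python
import random


def laplace_estimate(wins, plays):
    """Plain Laplace rule of succession (an assumption, not the paper's variant)."""
    return (wins + 1) / (plays + 2)


def choose_move(legal_moves, score, rng=random):
    """Return the move leading to the highest-scoring position.

    legal_moves: iterable of (move, resulting_position) pairs (hypothetical shape).
    score: function mapping a position to a number.
    Ties are decided by a random choice, as in the text.
    """
    scored = [(score(position), move) for move, position in legal_moves]
    if not scored:
        raise ValueError("no legal moves")
    best = max(value for value, _ in scored)
    candidates = [move for value, move in scored if value == best]
    return rng.choice(candidates)
```

With a table of (wins, plays) records kept per position, `score` could be `lambda pos: laplace_estimate(*records[pos])`. A position never yet visited then scores 0.5, so untried moves are neither favoured nor excluded at the outset.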

Strategy building with the graph traverser


I shall discuss automatic methods of search for solutions in problems susceptible of a particular formal representation, namely that on which the Graph Traverser program (Doran & Michie 1966, and see Doran p. 105) has been based. One approach, based on state-evaluation, generates all the states of the problem which can be reached in a small number of moves from the current state, and then seeks by some process of evaluation to decide which state shall form the next point of departure. In the classical studies of Newell, Shaw & Simon (1960) selection is applied by going down a priority sequence of operators, applying to each in turn a number of tests, first of applicability to the current state and then of whether the operator conduces towards one or another of various desirable intermediate states, or subgoals.
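The state-evaluation approach can be illustrated by a small best-first search. This is a sketch under stated assumptions, not the Graph Traverser program itself: it develops one state at a time, always taking the undeveloped state with the lowest evaluation as the next point of departure. The names `successors`, `evaluate` and `is_goal` are hypothetical callables supplied by the user, and states are assumed to be hashable.

```python
import heapq
from itertools import count


def best_first_search(start, successors, evaluate, is_goal, max_expansions=10_000):
    """Develop the lowest-valued undeveloped state until a goal is reached.

    Returns the list of states from start to goal, or None if none is found
    within max_expansions developments.
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    frontier = [(evaluate(start), next(tie), start)]
    parent = {start: None}  # also serves as the set of states already generated
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                heapq.heappush(frontier, (evaluate(nxt), next(tie), nxt))
    return None
```

The quality of the result rests entirely on `evaluate`: a poor evaluation function leads the search to develop many unhelpful states, and the path returned is not guaranteed to be the shortest.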