Plotting

 Rosen, Bruce E.


Adaptive Range Coding

Neural Information Processing Systems

Determination of nearly optimalt or at least adequatet regions is left as an additional task that would require that the system dynamics be analyzedt which is not always possible. To address this problemt we move region boundaries adaptively t progressively altering the initial partitioning to a more appropriate representation with no need for a priori knowledge. Unlike previous work (Michiet 1968)t (Bartot 1983)t (Andersont 1982) which used fixed coderSt this approach produces adaptive coders that contract and expand regions/ranges. During adaptationt frequently active regions/ranges contractt reducing the number of situations in which they will be activated, and increasing the chances that neighboring regions will receive input instead. This class of self-organization is discussed in Kohonen (Kohonent 1984)t (Rittert 1986t 1988).


Adaptive Range Coding

Neural Information Processing Systems

Determination of nearly optimalt or at least adequatet regions is left as an additional task that would require that the system dynamics be analyzedt which is not always possible. To address this problemt we move region boundaries adaptively t progressively altering the initial partitioning to a more appropriate representation with no need for a priori knowledge. Unlike previous work (Michiet 1968)t (Bartot 1983)t (Andersont 1982) which used fixed this approach produces adaptivecoderSt coders that contract and expand regions/ranges. During adaptationt frequently active regions/ranges contractt reducing the number of situations in which they will be activated, and increasing the chances that neighboring regions will receive input instead. This class of self-organization is discussed in Kohonen (Kohonent 1984)t (Rittert 1986t 1988).


LEARNING BY STATE RECURRENCE DETECTION

Neural Information Processing Systems

The approach is applied both to Michie and Chambers BOXES algorithm and to Barto, Sutton and Anderson's extension, the ASE/ACE system, and has significantly improved the convergence rate of stochastically based learning automata. Recurrencelearning is a new nonlinear reward-penalty algorithm. It exploits information found during learning trials to reinforce decisions resulting in the recurrence of nonfailing states. Recurrence learning applies positive reinforcement during the exploration of the search space, whereas in the BOXES or ASE algorithms, only negative weight reinforcement is applied, and then only on failure. Simulation results show that the added information from recurrence learning increases the learning rate.


LEARNING BY STATE RECURRENCE DETECTION

Neural Information Processing Systems

LEARNING BY ST ATE RECURRENCE DETECfION Bruce E. Rosen, James M. Goodwint, and Jacques J. Vidal University of California, Los Angeles, Ca. 90024 ABSTRACT This research investigates a new technique for unsupervised learning of nonlinear control problems. The approach is applied both to Michie and Chambers BOXES algorithm and to Barto, Sutton and Anderson's extension, the ASE/ACE system, and has significantly improved the convergence rate of stochastically based learning automata. Recurrence learning is a new nonlinear reward-penalty algorithm. It exploits information found during learning trials to reinforce decisions resulting in the recurrence of nonfailing states. Recurrence learning applies positive reinforcement during the exploration of the search space, whereas in the BOXES or ASE algorithms, only negative weight reinforcement is applied, and then only on failure. Simulation results show that the added information from recurrence learning increases the learning rate. Our empirical results show that recurrence learning is faster than both basic failure driven learning and failure prediction methods. Although recurrence learning has only been tested in failure driven experiments, there are goal directed learning applications where detection of recurring oscillations may provide useful information that reduces the learning time by applying negative, instead of positive reinforcement.