Abstract
The problem considered in this paper is how to mimic the adaptive learning control strategy called run-and-twiddle (RT) that can be observed in the behaviour of certain biological organisms. In 1984, Oliver Selfridge observed RT behaviour in E. coli bacteria, ants and male silk moths, where an organism continues its movement in a particular direction as long as the reward signal it receives from the environment has sufficient strength. The proposed approach to mimicking RT movement control behaviour takes two basic forms in this paper. First, we follow a suggestion made by Chris J.C.H. Watkins in 1989 that a stopping time (a strategy for determining when to twiddle) results from a comparison of the value of the current state s with the value of the state s' in the next time step. Second, we consider various forms of actor-critic learning within the context of approximation spaces introduced by Zdzisław Pawlak during the early 1980s. Results of experiments with a digital camera-based target tracking system using reinforcement comparison, introduced by Richard Sutton and Andrew Barto during the late 1990s, and various new forms of actor-critic learning are reported in this paper. The contribution of this paper is a presentation of a number of biologically-inspired adaptive learning control strategies that are useful in target tracking.
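The stopping rule attributed to Watkins above can be illustrated with a minimal sketch. This is not the paper's implementation; the function and variable names are hypothetical, and the value function is a toy stand-in for a learned state-value estimate:

```python
# Illustrative sketch of the run-and-twiddle (RT) stopping rule described
# above: keep "running" in the current direction while the estimated value
# of the next state does not fall below that of the current state;
# otherwise "twiddle" (change direction). All names here are hypothetical.

def rt_step(value, state, next_state):
    """Return 'run' to continue in the same direction, 'twiddle' otherwise."""
    return "run" if value(next_state) >= value(state) else "twiddle"

# Toy value function: value grows with proximity to a target at position 10.
value = lambda s: -abs(10 - s)

print(rt_step(value, 4, 5))  # moving toward the target -> run
print(rt_step(value, 5, 4))  # moving away from the target -> twiddle
```

In an actor-critic setting, `value` would be supplied by the critic's current state-value estimate rather than a fixed formula.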