
Abstract

The class of networks based on the Barto-Sutton architecture is known to be capable of solving complex, multi-dimensional control problems. In these problems, the objective is to localize a system within a contiguous region of its state space.

In this work, asymptotic stability criteria are derived for the Adaptive Critical Element (ACE) of the network. Here, the weights of the network are viewed as the state of a linear time-variant state-space learning machine. For system trajectories which can be represented by simple rational polynomials, discrete-time techniques are used to analyze the stability of the learning machine. The advantages of this approach are that it both provides bounds for the learning parameters and characterizes the resultant learning behavior.
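
As a rough illustration of the weights-as-state view, and not the derivation used in this work, consider a simplified critic update of the temporal-difference form given by Barto, Sutton and Anderson (1983): w <- w + beta * (r + gamma * p - p_prev) * xbar, with prediction p = w . x. If the system is assumed to sit at a single fixed state x (so that the trace xbar equals x and both predictions use the current weights), the update reduces to the linear recursion w[k+1] = A w[k] + b with A = I - beta * (1 - gamma) * x x^T. The only eigenvalue of A that differs from 1 is 1 - beta * (1 - gamma) * |x|^2, so the prediction converges when 0 < beta < 2 / ((1 - gamma) * |x|^2). The function names, the fixed-state assumption, and the numerical values below are all assumptions made for this sketch.

```python
def critic_transition_factor(beta, gamma, x):
    """Eigenvalue of A = I - beta*(1 - gamma)*x x^T along x (all others equal 1).

    Assumes a fixed state vector x, which is a simplification for illustration.
    """
    norm_sq = sum(xi * xi for xi in x)
    return 1.0 - beta * (1.0 - gamma) * norm_sq


def beta_upper_bound(gamma, x):
    """Largest learning rate for which |critic_transition_factor| < 1."""
    norm_sq = sum(xi * xi for xi in x)
    return 2.0 / ((1.0 - gamma) * norm_sq)


def simulate(beta, gamma, x, r, steps, w0):
    """Iterate the simplified critic update at a fixed state x with reward r."""
    w = list(w0)
    for _ in range(steps):
        p = sum(wi * xi for wi, xi in zip(w, x))   # current prediction
        delta = r + gamma * p - p                  # temporal-difference error
        w = [wi + beta * delta * xi for wi, xi in zip(w, x)]
    return w


if __name__ == "__main__":
    x = [0.0, 1.0, 0.0]          # hypothetical one-hot state encoding
    gamma, r = 0.5, 1.0          # hypothetical discount and reinforcement
    print("bound on beta:", beta_upper_bound(gamma, x))
    print("stable run:   ", simulate(0.5, gamma, x, r, 200, [0.0, 0.0, 0.0]))
    print("unstable run: ", simulate(5.0, gamma, x, r, 20, [0.0, 0.0, 0.0]))
```

Under these assumptions, a learning rate inside the bound drives the prediction toward r / (1 - gamma), while a rate outside it makes the weight along x oscillate with growing amplitude. A time-varying trajectory would replace the single matrix A with a sequence A_k, which is the setting the abstract describes.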


Stability of the Barto-Sutton Network under Rational Control Orbits
