Residual algorithms: Reinforcement learning with function approximation

L Baird - Machine learning proceedings 1995, 1995 - Elsevier
ABSTRACT A number of reinforcement learning algorithms have been developed that are
guaranteed to converge to the optimal solution when used with lookup tables. It is shown,
however, that these algorithms can easily become unstable when implemented directly with
a general function-approximation system, such as a sigmoidal multilayer perceptron, a
radial-basis-function system, a memory-based learning system, or even a linear
function-approximation system. A new class of algorithms, residual gradient algorithms,
is proposed …
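
The truncated abstract does not spell out the update rules, so the following is only a minimal sketch, assuming the usual textbook formulation with a linear value function V(s) = w · phi(s). The "direct" update moves the weights along phi(s) only, while the residual gradient update performs gradient descent on the squared Bellman residual and therefore also differentiates through the successor state's value. The toy problem (one weight, features 1 and 2, a single repeatedly sampled transition with zero reward) and all function names are hypothetical illustrations, not taken from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def td_error(w, phi_s, phi_next, reward, gamma):
    """Bellman residual: reward + gamma * V(s') - V(s) under a linear V."""
    return reward + gamma * dot(w, phi_next) - dot(w, phi_s)


def direct_update(w, phi_s, phi_next, reward, gamma, alpha):
    """Direct update: treats the target as a constant, moves along phi(s)."""
    delta = td_error(w, phi_s, phi_next, reward, gamma)
    return [wi + alpha * delta * fs for wi, fs in zip(w, phi_s)]


def residual_gradient_update(w, phi_s, phi_next, reward, gamma, alpha):
    """Gradient descent on 0.5 * delta**2, differentiating V(s') as well."""
    delta = td_error(w, phi_s, phi_next, reward, gamma)
    return [
        wi + alpha * delta * (fs - gamma * fn)
        for wi, fs, fn in zip(w, phi_s, phi_next)
    ]


def run(update, steps=100, alpha=0.1, gamma=0.9):
    """Apply `update` repeatedly to one transition of the toy problem.

    Assumed toy setup: phi(s) = [1], phi(s') = [2], reward 0, initial w = 1.
    The true value is 0 everywhere, so w should shrink toward 0.
    """
    w = [1.0]
    phi_s, phi_next = [1.0], [2.0]
    for _ in range(steps):
        w = update(w, phi_s, phi_next, 0.0, gamma, alpha)
    return w[0]


if __name__ == "__main__":
    print("direct:", run(direct_update))
    print("residual gradient:", run(residual_gradient_update))
```

In this toy setting the direct update multiplies the weight by 1 + alpha * (2 * gamma - 1) at every step, so it grows without bound, whereas the residual gradient update multiplies it by 1 - alpha * (2 * gamma - 1)^2 and shrinks it toward zero. This illustrates the instability the abstract describes for a linear function-approximation system; it is not a reproduction of the paper's own experiments.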