Paper
Adaptive learning rate for increasing learning speed in backpropagation networks
19 August 1993
Jayathi Janakiraman, Vasant Honavar
Abstract
Backpropagation is a supervised learning algorithm for training multi-layer neural networks for function approximation and pattern classification by minimizing a suitably defined error metric (e.g., the mean square error between the desired and actual outputs of the network for a training set) using gradient descent. It does this by calculating the partial derivative of the overall error and changing each weight by a small amount (determined by the learning rate) in a direction that is expected to reduce the error. Despite its success on a number of real-world problems, backpropagation can be very slow (it requires hundreds of passes (epochs) through the training set). Also, its performance is extremely sensitive to the choice of parameters such as the learning rate. The mathematical considerations that go into the derivation of backpropagation require that the learning rate be as small as possible. On the other hand, in order to reduce the number of training epochs required to learn the desired input-output mapping, it is desirable to make the weight changes as large as possible without causing the error to increase. It is also desirable to have an algorithm that can change the learning rate dynamically so that it is close to optimal (in terms of reducing the error as much as possible given the local gradient information) at each epoch, thereby eliminating the reliance on guesswork in the choice of the learning rate. Several authors have proposed methods for adaptively adjusting the learning rate based on certain assumptions about the shape of the error surface (e.g., quadratic) and/or estimation of higher order derivatives of the error with respect to the weights at a given point in the weight space. The primary disadvantage of such methods is that the estimation of second order derivatives is not only computationally expensive but also prone to inaccuracy due to the approximation of derivatives by discrete differences. In this paper we propose and evaluate a heuristically motivated method for adaptive modification of the learning rate in backpropagation that does not require the estimation of higher order derivatives. We present a modified version of the backpropagation learning algorithm which uses a simple heuristic to select a learning rate value at each epoch. We present numerous simulations on real-world data sets using our modified algorithm. We compare these results with results obtained using the standard backpropagation learning algorithm, as well as with various modifications of the standard backpropagation algorithm (e.g., flat-spot elimination methods) that have been discussed in the literature. Our simulation results suggest that the adaptive learning rate modification substantially speeds up the convergence of the backpropagation algorithm. Furthermore, it makes the initial choice of the learning rate fairly unimportant, as our method allows the learning rate to change and settle at a reasonable value for the specific problem. As a standard against which to compare our results, we computed the quasi-optimal value of the learning rate at each epoch. Simulation results indicate that our heuristic modification matches the performance of backpropagation with the quasi-optimal learning rate. The computational complexity of this algorithm is analysed and compared with that of standard backpropagation.
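The abstract describes the general scheme of adjusting the learning rate at each epoch but does not spell out the specific heuristic. The Python sketch below illustrates the overall idea on a toy problem, using a commonly used "bold driver" style rule (grow the rate while the error falls, shrink it sharply when the error rises) purely as an illustrative stand-in; it is an assumption for demonstration, not the method proposed in the paper.

# Minimal sketch of backpropagation with an adaptive learning rate.
# NOTE: the increase/decrease rule below is a generic bold-driver heuristic used
# only for illustration; it is NOT the specific heuristic proposed in the paper.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy data: XOR, a standard backpropagation benchmark.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 4 units.
W1 = rng.normal(scale=0.5, size=(2, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))

lr = 0.5            # initial learning rate; its exact value matters less once adapted
prev_error = np.inf

for epoch in range(5000):
    # Forward pass.
    H = sigmoid(X @ W1)
    Y = sigmoid(H @ W2)
    error = np.mean((T - Y) ** 2)

    # Backward pass: gradients of the mean-squared error w.r.t. the weights.
    dY = (Y - T) * Y * (1 - Y) / len(X)
    dW2 = H.T @ dY
    dH = (dY @ W2.T) * H * (1 - H)
    dW1 = X.T @ dH

    # Adaptive learning rate (illustrative bold-driver rule):
    # grow the step while the error keeps falling, shrink it sharply if it rises.
    if error < prev_error:
        lr *= 1.05
    else:
        lr *= 0.5
    prev_error = error

    # Gradient-descent weight update scaled by the current learning rate.
    W1 -= lr * dW1
    W2 -= lr * dW2

print(f"final MSE: {error:.4f}, final learning rate: {lr:.3f}")

In this sketch the learning rate drifts toward a value suited to the problem regardless of its initial setting, which is the qualitative behaviour the abstract attributes to the proposed modification.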
© (1993) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jayathi Janakiraman and Vasant Honavar "Adaptive learning rate for increasing learning speed in backpropagation networks", Proc. SPIE 1966, Science of Artificial Neural Networks II, (19 August 1993); https://doi.org/10.1117/12.152622
CITATIONS
Cited by 5 scholarly publications.
KEYWORDS
Evolutionary algorithms
Neural networks
Iris
Artificial neural networks
Computer programming
Detection and tracking algorithms
Network architectures
