Chapter 9: Implementation of a Deep Neural Network
One problem could be that we haven't normalized our training inputs. Another could be that the learning rate was too large.
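The first issue, unnormalized inputs, is commonly addressed by standardizing each feature to zero mean and unit variance. A minimal sketch using NumPy (the `normalize` helper and the sample data are illustrative, not from this chapter):

```python
import numpy as np

def normalize(X, eps=1e-8):
    """Standardize each column (feature) of X to zero mean and unit
    variance. eps guards against division by zero for constant features."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / (std + eps)

# Features on very different scales, e.g. age vs. income.
X = np.array([[25.0, 50_000.0],
              [40.0, 90_000.0],
              [33.0, 72_000.0]])
X_norm = normalize(X)
print(X_norm.mean(axis=0))  # each feature now has mean ~0
print(X_norm.std(axis=0))   # each feature now has std  ~1
```

Bringing all features onto a comparable scale keeps the early weighted sums in a numerically safe range, which is why normalization also helps with the overflow/underflow problem mentioned below.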
With a learning rate that is too small, the weights may converge very slowly, or not at all.
A learning rate that is too large can cause the weights to over-fit to particular batch values rather than to the training set as a whole. It can also produce numerical overflows or underflows, just as unnormalized inputs can.
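The effect of the learning rate can be seen even on a one-dimensional toy problem. The sketch below (a hypothetical example, not from this chapter) runs plain gradient descent on f(w) = w², whose gradient is 2w, with three different rates:

```python
def gradient_descent(lr, steps=50, w0=1.0):
    """Minimize f(w) = w**2 (gradient 2*w) with a fixed learning rate."""
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w  # standard gradient-descent update
    return w

# A moderate rate converges toward the minimum at w = 0.
print(gradient_descent(lr=0.1))
# A tiny rate barely moves in the same number of steps.
print(gradient_descent(lr=0.001))
# A rate that is too large makes each step overshoot, so |w| grows.
print(gradient_descent(lr=1.1))
```

For this function each update multiplies w by (1 - 2·lr), so any rate above 1.0 flips the sign and increases the magnitude every step; the same overshoot-and-diverge behavior in a deep network shows up as exploding activations and overflows.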