The capacity of a model describes the complexity of the input-output relationships it can represent; that is, how large a set of functions is allowed in the hypothesis space of the model. For example, a linear regression model can be generalized to fit polynomials rather than only linear functions. This is done by supplying the powers x^2, ..., x^n as additional inputs alongside x while building the model; the model remains linear in its parameters but can now represent polynomial curves. The capacity of a neural network can likewise be increased by adding more hidden nonlinear layers. So, we can make the neural network model either wider or deeper, or both, to increase its capacity.
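The idea of raising capacity by adding polynomial features can be sketched as follows; this is a minimal illustration using NumPy least squares, with the toy data and the helper `polynomial_design_matrix` being assumptions for the example:

```python
import numpy as np

# Toy data for illustration: a noisy sine curve (hypothetical example)
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(20)

def polynomial_design_matrix(x, degree):
    """Stack x, x^2, ..., x^degree as feature columns.

    Raising `degree` enlarges the hypothesis space (model capacity)
    while the model stays linear in its parameters.
    """
    return np.column_stack([x ** p for p in range(1, degree + 1)])

# Degree-3 polynomial regression is still an ordinary linear fit
# over the expanded feature set:
X = np.column_stack([np.ones_like(x), polynomial_design_matrix(x, degree=3)])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit
```

The same effect is often achieved with feature-expansion utilities in machine learning libraries; the point is that capacity grows with the number of polynomial terms, not with any change to the fitting procedure.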
However, there is a trade-off between model capacity and generalization error:
(Figure: right panel shows a polynomial of degree 9 fit to the data, which suffers from overfitting.)
Models with very high capacity can overfit the training set by learning patterns in the training data that do not generalize to unseen test sets; in particular, a high-capacity model can fit a small training set almost perfectly by memorizing it. On the other hand, models with low capacity may struggle to fit even the training set:
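The contrast above can be made concrete by comparing training errors of a low-degree and a high-degree polynomial on the same small dataset; this is a sketch using NumPy's `polyfit`, with the toy data being an assumption for the example:

```python
import numpy as np

# Toy data: 10 noisy samples of a sine curve (hypothetical example)
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(10)

def train_error(degree):
    """Mean squared error on the training set for a degree-`degree` polynomial fit."""
    coeffs = np.polyfit(x, y, degree)    # least-squares polynomial fit
    pred = np.polyval(coeffs, x)
    return np.mean((y - pred) ** 2)

# A degree-9 polynomial can pass through all 10 points, driving the
# training error toward zero (overfitting), while a degree-1 (linear)
# fit leaves a large training error (underfitting).
linear_err, high_capacity_err = train_error(1), train_error(9)
```

The near-zero training error of the degree-9 fit is exactly what the figure shows: the curve chases the noise in the training points, so it generalizes poorly even though it fits the training set almost perfectly.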