Transfer learning basics

To implement the car damage prediction system, we will build our own TensorFlow-based machine learning (ML) model for the vehicle datasets. Modern recognition models have millions of parameters, so training a new model from scratch requires a lot of data and time, as well as hundreds of graphics processing units (GPUs) or Tensor Processing Units (TPUs) running for hours.

Transfer learning makes this task easier: we take an existing model that has already been trained and reuse it as the starting point for a new one. In our example, we will use the feature extraction capabilities of the MobileNet model and train our own classifier on top of it. Even if we don't reach 100% accuracy, this approach works well in many cases, especially on a mobile phone, where heavy compute resources are not available. We can train this model on a typical laptop in a few hours, even without a GPU. The model in this chapter was built on a MacBook Pro with a 2.6 GHz Intel Core i5 processor and 8 GB of RAM.
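The idea of reusing MobileNet as a feature extractor can be sketched as follows with the Keras API. This is a minimal illustration, not the chapter's exact model: the input size, dropout rate, and number of classes are placeholders, and in practice you would pass `weights="imagenet"` to load the pretrained weights.

```python
import tensorflow as tf

def build_transfer_model(num_classes, weights="imagenet"):
    # Pretrained MobileNetV2 as a feature extractor, without its
    # original 1000-class ImageNet classification head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3),
        include_top=False,
        weights=weights,
        pooling="avg",
    )
    base.trainable = False  # freeze all pretrained layers

    # Our own classifier head, trained on top of the frozen features.
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# weights=None here only to avoid downloading the pretrained weights
# in this sketch; use weights="imagenet" for real transfer learning.
model = build_transfer_model(num_classes=5, weights=None)
```

Because only the small classifier head is trainable, fitting this model is cheap enough to run on a laptop CPU.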

Transfer learning is one of the most popular approaches in deep learning: a model developed for one task is reused as the starting point for a model on a different task. Pre-trained models are commonly used as the first step in computer vision and natural language processing (NLP) tasks, particularly when computational resources and time are limited.

In a typical computer vision problem, a neural network learns to detect edges in its initial layers, shapes in its middle layers, and task-specific features in its final layers. With transfer learning, we reuse the initial and middle layers and retrain only the final layers.
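Keeping the initial and middle layers and retraining only the final ones can be expressed in Keras by freezing all but the last few layers of the pretrained base. The cutoff of 20 layers below is an illustrative choice, not a prescription from the chapter:

```python
import tensorflow as tf

# Pretrained base; weights=None here only to skip the download in
# this sketch -- use weights="imagenet" for real transfer learning.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None)

base.trainable = True
cutoff = len(base.layers) - 20  # keep the initial/middle layers frozen
for layer in base.layers[:cutoff]:
    layer.trainable = False

# Only the final 20 layers (the task-specific features) remain trainable.
trainable_count = sum(layer.trainable for layer in base.layers)
```

Freezing the early layers preserves the generic edge and shape detectors while letting the final layers adapt to the new classes.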

For example, a model trained to recognize an apple in an input image can be reused to detect water bottles. Because the initial layers have already learned to recognize generic objects, we retrain only the final layers; that way, the model learns what differentiates water bottles from other objects. This process can be seen in the following diagram:

Typically, we need a lot of data to train a model, but in practice we rarely have enough relevant data. That is where transfer learning comes into the picture: it lets you train your model with very little data.

If your old classifier was developed and trained using TensorFlow, you can reuse the same model and retrain a few of its layers for your new classifier. This works well, but only if the features learned in the old task are generic in nature; for example, you can't reuse a model that was developed for text classification on an image classification task. The input data size must also match between the two models. If it doesn't, we need to add a preprocessing step that resizes the input data accordingly.
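That resizing step can be sketched as follows. The target size of 224x224 is MobileNet's standard input size, and `mobilenet_v2.preprocess_input` scales pixel values to the [-1, 1] range that the pretrained weights expect; both are assumptions tied to our choice of base model:

```python
import numpy as np
import tensorflow as tf

def preprocess(image, target_size=(224, 224)):
    # Resize the input to the pretrained model's expected size...
    image = tf.image.resize(image, target_size)
    # ...and rescale pixel values from [0, 255] to [-1, 1].
    return tf.keras.applications.mobilenet_v2.preprocess_input(image)

# Example: a camera frame of a different size is adapted to the model.
frame = np.random.randint(0, 256, size=(480, 640, 3)).astype("float32")
ready = preprocess(frame)
```

Applying the same preprocessing at training and inference time is important; mismatched scaling is a common source of silently degraded accuracy.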
