Disadvantages of sliding windows

Although the sliding window is a simple algorithm, it has several downsides. The biggest one is poor performance:

Up until a few years ago, it was OK to use the sliding window, since the features were mostly engineered by hand. This meant there were far fewer weights to evaluate at each window position.

Now, we have networks such as VGG-16, which contains roughly 138,000,000 weights. These millions of weights need to be evaluated each time the window moves a step, and the whole sweep has to be repeated for every window size.

It's clear that this method is almost unusable, especially for real-time object detection.
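To get a feel for the scale, we can count how many times the network would have to run on a single image. The following is only a rough sketch: the 640 x 480 image, the three window sizes, and the stride of 16 pixels are assumptions chosen for illustration, and "weight evaluations" is a crude proxy that counts each weight once per forward pass.

```python
VGG16_WEIGHTS = 138_000_000  # figure quoted in the text


def count_windows(image_h, image_w, window, stride):
    """Number of positions a square window of side `window` visits."""
    if window > image_h or window > image_w:
        return 0
    rows = (image_h - window) // stride + 1
    cols = (image_w - window) // stride + 1
    return rows * cols


def total_forward_passes(image_h, image_w, window_sizes, stride):
    """One forward pass of the classifier per window position and size."""
    return sum(count_windows(image_h, image_w, w, stride) for w in window_sizes)


# Assumed example values, not taken from the text.
passes = total_forward_passes(480, 640, window_sizes=(64, 128, 256), stride=16)
print(f"{passes:,} forward passes per image")
print(f"{passes * VGG16_WEIGHTS:,} weight evaluations (crude lower bound)")
```

Even with a fairly coarse stride and only three window sizes, a single image already needs thousands of forward passes.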

The second drawback is that the algorithm does a lot of redundant work. Look at the following screenshot:

The window is moved right and down, and we can observe that consecutive window positions share many pixels. Each time the sliding-window algorithm moves the window, it doesn't reuse any of the computation from the previous position. This means that pixels in the overlap, which were already processed, are processed again after the window moves. Imagine the number of pixels shared over the entire image, and the time and resources spent on recomputing them.

Instead of this, we could reuse the values obtained at the previous position.
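The amount of repeated work can be measured directly by counting how many windows cover each pixel. This is a minimal sketch on a tiny, assumed 8 x 8 image with a 4 x 4 window and a stride of 2; the function names are hypothetical and only serve the illustration.

```python
def shared_fraction(window, stride):
    """Fraction of a square window's pixels that the next window,
    one stride to the right, covers again."""
    return max(window - stride, 0) / window


def coverage_counts(image_h, image_w, window, stride):
    """For every pixel, count how many window positions include it."""
    counts = [[0] * image_w for _ in range(image_h)]
    for top in range(0, image_h - window + 1, stride):
        for left in range(0, image_w - window + 1, stride):
            for y in range(top, top + window):
                for x in range(left, left + window):
                    counts[y][x] += 1
    return counts


# Assumed toy values: 8 x 8 image, 4 x 4 window, stride 2.
counts = coverage_counts(8, 8, window=4, stride=2)
pixels_processed = sum(sum(row) for row in counts)
redundancy = pixels_processed / (8 * 8)

print("shared with next window:", shared_fraction(4, 2))
print("pixel reads:", pixels_processed, "for", 8 * 8, "pixels")
print("busiest pixel is processed", max(max(row) for row in counts), "times")
```

A redundancy above 1 means that, on average, every pixel is processed more than once; the smaller the stride relative to the window, the higher this number climbs.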

The third drawback is that the sliding window does not always produce accurate bounding boxes, as we can see in the following screenshot:

This can be mitigated by using a smaller stride or a different window size. Here again, we loop back to the first issue: smaller steps and additional window sizes mean more window positions to evaluate, which is inefficient and time-consuming.
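The trade-off between box accuracy and cost can be sketched with intersection over union (IoU), a common measure of how well a window matches the true box. Everything concrete here is an assumption for illustration: a 128 x 128 image, a 32 x 32 window, and an object located at (20, 20) to (52, 52).

```python
def iou(a, b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def best_window(image_h, image_w, window, stride, target):
    """Return (best IoU with `target`, number of window positions tried)."""
    best, tried = 0.0, 0
    for top in range(0, image_h - window + 1, stride):
        for left in range(0, image_w - window + 1, stride):
            tried += 1
            box = (left, top, left + window, top + window)
            best = max(best, iou(box, target))
    return best, tried


# Assumed ground-truth box; it does not sit on the coarse stride grids.
target = (20, 20, 52, 52)
for stride in (32, 16, 4):
    best, tried = best_window(128, 128, 32, stride, target)
    print(f"stride {stride:2d}: best IoU {best:.2f} using {tried} windows")
```

In this toy setup, the coarser strides never land exactly on the object, so the best box only partly overlaps it. The stride of 4 finds an exact match, but only by evaluating many times more windows.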
