
Convolutional Neural Networks II


This is my second post on CNNs, covering max pooling, strides and padding.

In the previous post we extracted the features from the image of '3'. The dimensions of that image were only 4 X 24, which is quite small. But what do we do when the image is big and has a very high resolution? The bigger the image, the more values the CNN has to process, and the longer it takes to identify the class of that image. So how do we reduce the size of an image while keeping its important details and their spatial arrangement? The usual answer is pooling, either max pooling or average pooling. Pooling is a technique that reduces the size of the image while keeping its main details and features, so that there are fewer parameters to deal with. With pooling the image might get a bit distorted, but its features still get picked up by the CNN, so that is good news.

Max Pooling

In max pooling we take the maximum value within each window of the image matrix. The window could be 2 X 2 or 3 X 3, for example. Here is an image showing max pooling of the reference image of '3'.


Fig - Max Pooling of reference image of '3' with stride 1

As you can see, with max pooling the size of the resultant image is reduced while the image information is retained.
Here is a GIF explaining the process of max pooling.
I have used a 2 X 2 pooling window that extracts the maximum value of the cells it covers. The size also gets reduced.


The initial size of the matrix is 4 X 6 and the pooled image is 3 X 5. Notice that the stride here is 1.
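
Here is a small sketch of the same process in plain Python. The matrix values are made up for illustration (they are not the pixels of the '3' image), and `max_pool` is just a hypothetical helper name, so treat it as a rough demo and not as how a real CNN library does it.

```python
def max_pool(image, size=2, stride=1):
    """Slide a size x size window over the image and keep the max of each window."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the cells covered by the window whose top-left corner is (r, c).
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4 X 6 matrix (values invented for this example).
image = [
    [0, 1, 2, 0, 0, 1],
    [3, 0, 1, 4, 0, 0],
    [0, 2, 0, 1, 5, 0],
    [1, 0, 0, 2, 0, 6],
]

pooled = max_pool(image, size=2, stride=1)
for row in pooled:
    print(row)
print(len(pooled), "X", len(pooled[0]))  # 3 X 5
```

With a 2 X 2 window and stride 1, the 4 X 6 input comes out as 3 X 5, the same sizes as in the example above.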

What is Stride?

Stride is the number of cells the pooling window moves at each step. Notice in the GIF image above, the window jumps only 1 step horizontally as well as vertically.
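
To see what the stride does to the output size, here is a rough sketch that lists where the window starts along one side of the image. The helper names `window_starts` and `pooled_size` are my own, and I am assuming no padding and that a window which does not fully fit is simply dropped.

```python
def window_starts(length, size, stride):
    """Positions where the pooling window starts along one dimension."""
    return list(range(0, length - size + 1, stride))


def pooled_size(length, size, stride):
    """Output length along one dimension: (length - size) // stride + 1."""
    return (length - size) // stride + 1


# A 4 X 6 matrix with a 2 X 2 window.
for stride in (1, 2):
    row_starts = window_starts(4, 2, stride)
    col_starts = window_starts(6, 2, stride)
    print("stride", stride, "rows", row_starts, "cols", col_starts)
    print("output size:", pooled_size(4, 2, stride), "X", pooled_size(6, 2, stride))
```

With stride 1 the 4 X 6 matrix gives a 3 X 5 output, and with stride 2 the window jumps two cells at a time, so the output shrinks to 2 X 3.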

Here is an example showing max pooling with stride 2

Fig - Max Pooling with stride 2
Another example with stride 2


Average Pooling

Average pooling does the same job as max pooling, but instead of taking the maximum value, it takes the average of the values within the window.

Fig - Average Pooling with stride 1

Average pooling retains a bit less of the strongest features than max pooling, because a strong value gets averaged with its weaker neighbours. For picking up sharp features it is often somewhat less accurate than max pooling.
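
A quick way to see the difference is to pool the same matrix both ways. This is only a sketch with made-up values. The `pool` and `mean` helpers are hypothetical names of mine, with a 2 X 2 window and stride 1 as in the figures.

```python
def pool(image, size, stride, reduce_fn):
    """Slide a size x size window over the image and reduce each window with reduce_fn."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    return sum(values) / len(values)


# Hypothetical 3 X 4 matrix with a single strong value.
image = [
    [0, 0, 0, 0],
    [0, 8, 0, 0],
    [0, 0, 0, 0],
]

max_pooled = pool(image, 2, 1, max)
avg_pooled = pool(image, 2, 1, mean)
print(max_pooled)  # [[8, 8, 0], [8, 8, 0]]
print(avg_pooled)  # [[2.0, 2.0, 0.0], [2.0, 2.0, 0.0]]
```

Max pooling keeps the strong value 8 as it is, while average pooling dilutes it to 2.0 because the three zeros in each window pull the average down.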


Padding

We noticed that with pooling the number of parameters is reduced, and so is the image size. It is important to maintain the spatial arrangement of the image. So to pool the image while also maintaining the image size, we use padding.
Padding is the technique in which we add a row and a column of zeros around the image matrix. When this padded image is pooled with stride 1, it retains the information and the output has the same size as the original image.
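
Here is a minimal sketch of padding followed by max pooling. I am assuming the extra row of zeros goes at the bottom and the extra column on the right (the side could just as well be the top or left), and the matrix values and the helper names `pad_zeros` and `max_pool` are made up for this example.

```python
def pad_zeros(image, bottom=1, right=1):
    """Add `right` columns and `bottom` rows of zeros (assumed placement: right and bottom)."""
    cols = len(image[0])
    padded = [row + [0] * right for row in image]
    padded += [[0] * (cols + right) for _ in range(bottom)]
    return padded


def max_pool(image, size=2, stride=1):
    """Slide a size x size window over the image and keep the max of each window."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, stride)
        ]
        for r in range(0, rows - size + 1, stride)
    ]


# Hypothetical 4 X 6 matrix (values invented for this example).
image = [
    [0, 1, 2, 0, 0, 1],
    [3, 0, 1, 4, 0, 0],
    [0, 2, 0, 1, 5, 0],
    [1, 0, 0, 2, 0, 6],
]

padded = pad_zeros(image)            # 5 X 7 after padding
pooled = max_pool(padded, 2, 1)      # back to 4 X 6 after pooling
print(len(padded), "X", len(padded[0]))
print(len(pooled), "X", len(pooled[0]))
```

The padded matrix is 5 X 7, and pooling it with a 2 X 2 window and stride 1 brings it back to 4 X 6, the size of the original image.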

Fig - Max Pooling with padding to retain the image size

Another example showing Average Pooling with padding and stride equal to 1
Fig - Padding with Average Pooling

✌ Image Size Retained!!! 😅

In the next post, I will deal with fully connected layer, dropout, and dense layer.