Build a Neural Network Classifier in 5 minutes


Hola Amigos,

I actually faced a lot of issues while building my own neural network classifier. Step-by-step explanations are hardly available because everyone here assumes that we already know about 30% of the subject, but what about a beginner? Let's face it, it's very difficult to understand the available examples, especially for fledglings. Therefore I decided to follow a different way of learning: reverse engineering. Take an example, crack it down, and then use that example as a reference to crack every other example. I did a similar thing a long time ago to learn programming, and did it again to make my own classifier that can tell a man from a woman.

To learn about CNN Click Here

The blueprint of a neural network classifier is as follows


  1. Specify a directory of your images for training
  2. Specify a directory of your images for validation
  3. Make a Convolutional Neural Network with input dimensions according to image dimensions.
  4. Add two hidden layers. Actually even 1 will work to some extent.
  5. Convert the image input to a format readable by the neural network
  6. Convert the validation input to a format readable by the neural network
  7. Set a learning rate, epochs, steps per epoch
  8. Save your model and retrain on different sets of examples and datasets 
Ok enough said. 

I am using here Keras to build this classifier. Keras is damn easy, believe it. I am pretty sure you have heard about the above steps but when the coding part comes, our mind starts cracking like, what is this "Google It", What is that "Google It".

I am using here Spyder 3.x Anaconda as my coding platform. Feel free to choose yours

Do open Anaconda Prompt first. Then type

pip install keras



The first step is to import the libraries which shouldn't be hard.

from keras.models import Sequential

We need Sequential to build the neural network,

from keras.preprocessing.image import ImageDataGenerator

ImageDataGenerator to convert our directory data into Keras neural network readable format. It is like when you eat Doritos, your stomach breaks it down with Hydrochloric Acid in order to process it. The intestines can only extract energy from that broken food and not directly from Doritos. I hope it is clear now :)

from keras.layers import *

Next, we need Conv2D to slide filters over the image and extract arrays of feature data, and MaxPooling2D to downsample that data. So why do we need max pooling? How fast can you solve an equation with 10 variables? It will certainly take a long time. Similarly, the greater the number of parameters, the longer it takes to process them. I hope this makes it clear that the processing cost grows with the number of parameters.

Then we need a dense layer to connect the layers. Dense and fully connected are different names for the same thing. This is what I realized. Dense layers take care to connect every input with every output adjoining them with a set of weights.

Click here to know about dense & fully connected layers.


Now specify the directory where you have stored the images for training and also the location for validation. Remember you just have to specify the directory containing the folders of your classes.

I had two folders namely men and women. They were located in train folder and the train folder was located in gathered_data folder. So just provide the location of the parent directory holding your classes.

Similarly, specify your validation data directory. The images for validation are kept separate from the training data in order to check the accuracy of our NN.

Now specify your epochs and batch size. One epoch is one forward and one backward pass over every training example. Batch size is the number of samples that will be sent to the network at a time.

Now specify the height and width of the image. One can relax as Keras helps us to resize every image into the required size. Basically, small size images have fewer parameters thereby reducing the complexity of the neural network.

Now we will build our neural network. The syntax model = Sequential() calls the Sequential API which has the methods of convolution and pooling.

model.add(Conv2D(32, (3, 3), input_shape = (width, height, 3)))

model.add adds a layer to the network. Conv2D is the convolution layer, so model.add(Conv2D(...)) adds a convolution layer. The arguments 32, (3, 3) give it 32 output filters, each with a kernel size of 3 X 3.

Click here to know about filters.

input_shape is the shape of the image that we are providing as input. Do remember that only this first convolution layer takes the image as input. Every other layer takes the output of the previous layer as its input. In (width, height, 3) we know about width and height, but what about the '3'? It is the number of channels. Our images are in the RGB model, so we have three colours and hence the third dimension is 3.

Keras has two data formats. The first is channels_first, which means the input_shape will be (3, width, height).
The second is channels_last, which means the input shape will be (width, height, 3). I guess English is enough to understand this. So we are following the channels_last format. (Strictly, Keras orders the two spatial dimensions as rows then columns, i.e. height before width, which only matters for non-square images.)
Next, we will choose our activation function. We all started with sigmoid, but Keras provides other choices too. I am choosing relu as my activation function. So after the inputs are multiplied by the weights and summed, the result is passed through the activation function. After the activation function we reduce the number of values by pooling, so I have used MaxPooling here. Average pooling is also available, but accuracy is usually better with max pooling than with average pooling.


The size of pooling layer is 2 X 2.

Click here to know about Pooling

Similarly, I have added a third layer too. So a quick way through will be like this


Image Credits - Mathworks

So it goes like this


The last line compiles the neural network model into a single package with extra parameters like the loss, optimizer, and metrics.
Summary of what we did till now -

  • We imported the libraries of Keras for convolution, sequential and image generator
  • We specified the whereabouts i.e directory of our images for training as well as validation
  • We made our neural network design.
Coming to the term loss = 'binary_crossentropy'. Binary cross entropy is the loss function. Let us consider x as the answer our network gives and y as the ideal answer. As a rough intuition, the loss measures the gap between the two
Loss ≈ |x - y|
Clearly, the lower the loss, the better our network gets, so training adjusts the weights to make the loss as small as possible. Smaller values of the cost function point to a better network fit and vice versa. Binary cross entropy is modeled on a variable which can have only two values, 0 and 1. So if the probability of 1 is 0.4 then the probability of 0 will certainly be 0.6. The binary entropy function follows a graph like below

Fig - Binary Entropy Function

We are using binary cross entropy here because we have two classes. The result has to be either a man or a woman, so the probability will flow towards one or the other. Remember that binary cross entropy is the special case of categorical cross entropy for exactly two classes.
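To make the loss less abstract, here is a minimal sketch of binary cross entropy in plain Python. It is not the Keras implementation, just the textbook formula; the function name, the eps clipping value and the sample predictions are my own assumptions for illustration.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy over a list of 0/1 labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        # Clip the prediction so log(0) can never occur (eps is an assumed value).
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Labels: 1 = one class, 0 = the other (hypothetical encoding).
confident = binary_cross_entropy([1, 0], [0.9, 0.1])
unsure = binary_cross_entropy([1, 0], [0.6, 0.4])
print(confident, unsure)
```

The confident predictions give the smaller loss, which is exactly what "lower loss means a better network" is about.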

The choice of optimizer depends on the neural network. For deep networks, Adam or RMSprop (root mean square propagation) is commonly used. Since we have 2 hidden layers with a good number of neurons, I will stay with Adam.

Coming to the final part of the code.


inp_data = ImageDataGenerator(rescale = 1./255, shear_range = 0.2, zoom_range = 0.2)

ImageDataGenerator will rescale the pixel values into the 0 to 1 range. shear_range applies a random shear with an intensity of up to 0.2 and zoom_range randomly zooms the image by up to 20%, in order to provide more varied data and prevent overfitting.
Similarly, the validation data is also rescaled and randomly flipped horizontally for better fitting.
You can use various other features for this too.

Now, to read the images from the directory and pass them through the data generator, we use flow_from_directory. It will grab the images from the directory and apply the inp_data processing to them.
Look carefully at line 4 of the above image. The function flow_from_directory requires the input image directory, the target size to resize the images to, the batch size, and the class mode. The converted input images are stored in input_data. Similarly, valid_data will store the validation data.

Finally, the CNN model is trained by passing the input data, number of epochs, valid_data and steps per epoch.
Steps per epoch are equal to the number of samples divided by the batch size. Similarly, validation_steps is equal to the number of validation samples divided by the batch size.
Batch size is the number of samples taken at once inside the NN.
One forward propagation and one backward propagation of every example is equal to one epoch.
Iterations are the number of batches required to complete 1 epoch.

Example: take 100 images with a batch size of 50. It will then take 2 iterations to go through all 100 images, which is equal to 1 epoch.
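The same arithmetic as a tiny sketch. The helper name is hypothetical, and rounding up for a leftover partial batch is my own assumption (the post only covers the case where the batch size divides the sample count evenly).

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches (iterations) needed to see every sample once."""
    # Round up so a final, smaller batch still counts as one step (assumption).
    return math.ceil(num_samples / batch_size)

iterations = steps_per_epoch(100, 50)
print(iterations)
```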

Convolutional Neural Networks IV



In this post, I will deal with back propagation, gradient descent etc. Do check out my previous posts regarding Max Pooling, filters, dropout layers, fully connected layers and CNN. To begin click here.
So

What is Back Propagation?

Backpropagation is done whenever we find some error. After obtaining the probabilities for an image, the CNN subtracts the obtained answer from the actual answer. Clearly, this is an error. So our first objective is to reduce this error. While I was working with the probabilities for the image of the letter 'R' (click here to know about the image 'R'), I got the probability of R as 0.911 and the probability of some other letter as 0.42.

The actual probability of getting an 'R' is 1, thus the error is |1 - 0.911| = 0.089
The actual probability of not getting an 'R' is 0, thus the error is |0 - 0.42| = 0.42.
Thus the total error is 0.509. Now comes the power of weights. The weights are adjusted by the gradient descent optimization algorithm, which is used to find the minimum of the error function. The weights keep getting adjusted along the gradient, each time following the slope that reduces the error (here we have 0.509). The updated weight is multiplied with the input coming into the neuron from every feature and then added to the bias to get the final value. This whole operation is performed several times, and each time the weights in the network are updated with better values.

Consider the following image below
Fig 1- Gradient Descent
The red dot moves across the graph and tries to find the minimum error with the weight axis. This is performed for every neuron pair and their weights are then noted down. After finding the appropriate weights for every neuron pair, these weights are updated with their corresponding neuron pair and the process of NN is carried out again and again. 

The function we descend along is also known as the cost function, if you are aware of that. Our main objective is to minimize that cost function. In gradient descent, we take steps towards the negative side of the gradient, i.e. downwards. The opposite is gradient ascent, where we step upwards to find the maximum. Well, the good thing is that the NN does this automatically to find the best weights, but there are also various parameters which cannot be set automatically. These include the number of hidden layers, the activation function, the right amount of stride, the matrix size of the pooling layer, the number of epochs, the batch size and various other parameters.
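Here is a minimal sketch of the "step downwards" idea on a single weight. The cost function (w - 3)^2, the learning rate and the step count are all made-up assumptions chosen so the minimum is easy to see; a real network does this for every weight at once.

```python
def gradient_descent(grad, w, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to move towards a minimum."""
    for _ in range(steps):
        w = w - learning_rate * grad(w)  # negative direction = downhill
    return w

# Hypothetical cost(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
# Its minimum sits at w = 3.
best_w = gradient_descent(lambda w: 2 * (w - 3), w=0.0)
print(best_w)
```

Flipping the minus sign to a plus would turn this into gradient ascent.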

All these can be found by trial and error, which is how I find my best NN. This hurts, though, because on my Core i3 CPU (3rd Gen 😑) it takes around 3 hours to find the appropriate parameters. Well, hard work does pay off, so keep on trying and explore the world of NN.

Cheers 😃

Convolutional Neural Networks III


In this post, I'll deal with dense layer, fully connected layer and backpropagation. If you have missed my previous post click here.
Before moving further, let us have a look at how a filter works on an image. I made a pixelated image of the letter 'R' and applied a 3 X 3 filter once and then 3 times. One must remember that the more times a filter is applied to the same image, the fewer features remain. The pixelated image of the letter 'R' and the filter are below

         
Fig 1- Filter (Left) and the image (Right)



So when this filter is moved across the image we get feature extracted according to the filter like this below
Fig 2- Filtering applied on image(once and twice)

One can clearly see how the filter has faded the pixels that aren't in phase with the filter. The term 'phase' seems to fit here 😊. The darkened cells are ones in phase with the filter. Even if the reference image is a bit distorted or rotated or flipped or sheared the feature will still get picked up as it will be in phase with the filter.

The filtered value is calculated the same way every time. It is the sum of the products of the filter's 1s and 0s with the corresponding cell values. The sum is then divided by the total number of cells in the filter matrix (here 9). The value gets faded if it's closer to 0 and it gets darkened if it's closer to 1.
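A minimal sketch of that calculation in plain Python. The 4 X 4 image and the diagonal filter below are made-up values, not the 'R' image from the figures, and the function name is hypothetical.

```python
def apply_filter(image, kernel):
    """Slide a square kernel over the image; each output cell is the
    sum of products divided by the number of cells in the kernel."""
    k = len(kernel)
    cells = k * k
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = sum(kernel[i][j] * image[r + i][c + j]
                        for i in range(k) for j in range(k))
            row.append(total / cells)
        output.append(row)
    return output

# Hypothetical 4 X 4 image with a diagonal stroke, and a diagonal 3 X 3 filter.
image = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
]
kernel = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
filtered = apply_filter(image, kernel)
print(filtered)
```

Cells where the image lines up with the filter's diagonal come out at 3/9, and the rest fade to 0.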

Rectified Linear Units ( RELU)


Now this filtered image is passed through the rectification layer. The maximum cell value in the below image is 0.33 and the minimum cell value is 0.11, so the middle value is 0.22. The rule I used is: change to 0 every cell whose value is less than 0.22 (you can choose a different value; a value closer to the maximum will be a harsher cut for the image, and a value closer to the minimum will let in some unnecessary features). Strictly speaking, the standard ReLU function only changes negative values to 0; the cutoff used here is a simplified illustration. So after passing the second image from Fig 2 above through the RELU layer we get an output like this

Fig 3- RELU layer applied to the filtered layer
After the application of RELU layer the image is passed through the pooling which I have already discussed in this post. So an overall process would be like this
  1. Convolution (Filtered)
  2. Relu activation
  3. Pooling
The above process is done once or twice or can be even greater. 
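As a small sketch of the activation step, here are two versions side by side: the standard ReLU (negatives become 0) and the midpoint cutoff described in this post. The sample values and the function names are assumptions for illustration only.

```python
def relu(values):
    """Standard ReLU: negative values become 0, everything else is kept."""
    return [[max(0.0, v) for v in row] for row in values]

def threshold(values, cutoff):
    """Cutoff rule from this post: values below the cutoff become 0."""
    return [[v if v >= cutoff else 0.0 for v in row] for row in values]

# Hypothetical filtered values, including one negative to show the difference.
filtered = [[0.33, 0.11], [-0.2, 0.22]]
print(relu(filtered))
print(threshold(filtered, 0.22))
```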

What is a fully connected layer?

A fully connected layer is a single row of all the neurons connected together, where every cell indicates the probability towards the actual answer. In the rectified image of 'R', darker cells have a greater value while the faded cells have a low value. Thus one can conclude that the darkest cells have a greater probability of being in phase with that filter, and cells with a faded or lower value have much less probability of being in phase with the filter. The above rectified image is the filtered output of a single filter, and we have various filters that each extract their own features.

Now every cell value is laid down in the form of an array. This process is carried out for every filter. Then an average is taken over all the dark cells. Similarly, an average is also taken over all the faded cells. These averages indicate the probability.
The average of the darkest cells from each filter shows the probability of how close our image is (to 'R' here). On the contrary, the average of all the lighter cells from each filter shows the probability of how far our image is (from 'R' here).
  
Fig 4- Fully Connected Array of all neurons (cells)
As in the above image, when all the arrays obtained from each filter are connected, we get our fully connected layer.

What is a dense layer?

A dense layer is just another name for a fully connected layer. Similar operations take place in a dense layer, where every neuron is connected to every neuron of the adjoining layer. It is called dense because it represents a dense web of connections between neurons. A dense layer has a weight associated with every neuron pair, each with its own value. Generally in Keras you will notice Dense when working with CNNs, while in Tensorflow you may find fully_connected, so do not get confused by this. In Keras we often use it as Dense(10), Dense(1). Here every one of the 10 neurons is connected to the last neuron, each with its own weight. Since this is so dense, I don't think it's harmful to call it a dense layer. 😉

What is a dropout layer?

As the name suggests, it drops out, or better said, eliminates some of the activated cells (cells passed through the activation layer). This has to be done in order to prevent over-fitting. An over-fit network will not be able to recognize the same object across different images of it. The CNN has to work in a robust way, hence dropout becomes necessary. The dropout rate is basically chosen between 0.2 and 0.8. Dropout removes neurons randomly based on the rate provided by the user, like 0.4.

Consider an example where you have 20 cookies and 8 of them are half-circles. An overfit network will only recognize the fully circular cookies. A fledgling CNN will pick up almost every cookie among the 8. With dropout, some cookies, whether circular or half-circles, will be dropped (removed) at random and the network retrained. This increases the quality of the network.
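Here is a rough sketch of what dropping neurons at random looks like on a plain list of activations. It uses the common "inverted dropout" trick of scaling the survivors by 1 / (1 - rate), which is an assumption on my part about the variant; the function name, seed and sample activations are made up.

```python
import random

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale the survivors
    by 1 / (1 - rate) so the expected total stays the same."""
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]

rng = random.Random(0)        # fixed seed so the run is repeatable
activations = [1.0] * 10      # hypothetical activated cells
dropped = dropout(activations, rate=0.4, rng=rng)
print(dropped)
```

Dropout is only applied while training; at prediction time every neuron is kept.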


Cheers, See ya all soon. 

Convolutional Neural Networks II


This is my second post in CNN regarding max pooling, strides and padding.

In the previous post we extracted the features from the image of '3'. The dimensions of that image were 4 X 24, which is quite small. But what do we do when the image is big, with a very high resolution? The greater the size of the image, the more parameters the CNN has to extract from it, and hence the longer it takes to identify the class of that image. So how do we reduce the size of an image without losing the important details in it, while also maintaining the spatial arrangement? What solves this problem is max pooling or average pooling. Pooling is basically a technique by which you reduce the size of the image but maintain the details and features within it, in order to lessen the parameters. Even if pooling leaves the image a little distorted, its features still get picked up by the CNN, so that is good news.

Max Pooling

In Max pooling we choose the maximum value within a matrix. The size of the matrix could be 2 X 2 or 3 X 3 also. Here is an image showing the max pooling of the reference image of 3


Fig - Max Pooling of reference image of '3' with stride 1

As you can see, with max pooling the size of the resultant image gets reduced while the image information is retained.
 Here is an image explaining the process of max pooling.
I have used 2 X 2 pooling, which extracts the maximum value of each group of cells. The size also gets reduced.


Initial Size of matrix 4 X 6. Pooled image is 3 X 5. Notice that the stride here is 1.

 What is Stride?

Stride is the number of steps that the pooling matrix will jump. Notice in the GIF image above, the matrix jumps only 1 step horizontally as well as vertically. 
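A small sketch of max pooling with a configurable stride, in plain Python. The 4 X 6 matrix below is filled with made-up values (it is not the reference image of '3'), and the function name is hypothetical.

```python
def max_pool(matrix, size=2, stride=1):
    """Take the maximum of every size X size window, jumping `stride` cells each time."""
    rows = (len(matrix) - size) // stride + 1
    cols = (len(matrix[0]) - size) // stride + 1
    return [
        [max(matrix[r * stride + i][c * stride + j]
             for i in range(size) for j in range(size))
         for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical 4 X 6 matrix (4 rows, 6 columns).
image = [
    [1, 3, 2, 0, 4, 1],
    [5, 2, 0, 1, 3, 2],
    [0, 1, 4, 2, 1, 0],
    [2, 0, 1, 3, 0, 5],
]
pooled_stride_1 = max_pool(image, size=2, stride=1)  # 3 X 5 result
pooled_stride_2 = max_pool(image, size=2, stride=2)  # 2 X 3 result
print(pooled_stride_1)
print(pooled_stride_2)
```

Swapping max for an average inside the window would give average pooling with the same sizes.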

Here is an example showing max pooling with stride 2

Fig - Max Pooling with stride 2
Another example with stride 2


Average Pooling

Average pooling does the same job like max pooling but instead of calculating the maximum value, it takes the average value within the matrix. 

Fig - Average Pooling with stride 1

Average Pooling retains a bit less information when compared with the max pooling. Average Pooling is somewhat less accurate than max pooling.


Padding

We noticed that with pooling our parameters have been reduced, and so has the image size. It is important to maintain the spatial arrangement of the image. So to lessen the parameters while also maintaining the image size, we use padding.
Padding is the technique in which we add a row and a column of zeros around the image matrix. When this image is pooled it retains the information with the same size and fewer parameters.
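A minimal sketch of padding followed by 2 X 2 max pooling with stride 1. I am assuming the extra row of zeros goes at the bottom and the extra column on the right; the figures may place them differently. The matrix values and function names are made up.

```python
def pad_with_zeros(matrix):
    """Add one column of zeros on the right and one row of zeros at the bottom
    (placement is an assumption for this sketch)."""
    width = len(matrix[0])
    padded = [list(row) + [0] for row in matrix]
    padded.append([0] * (width + 1))
    return padded

def max_pool(matrix, size=2, stride=1):
    """Take the maximum of every size X size window."""
    rows = (len(matrix) - size) // stride + 1
    cols = (len(matrix[0]) - size) // stride + 1
    return [
        [max(matrix[r * stride + i][c * stride + j]
             for i in range(size) for j in range(size))
         for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical 4 X 6 matrix.
image = [
    [1, 3, 2, 0, 4, 1],
    [5, 2, 0, 1, 3, 2],
    [0, 1, 4, 2, 1, 0],
    [2, 0, 1, 3, 0, 5],
]
padded = pad_with_zeros(image)   # 5 X 7
pooled = max_pool(padded)        # back to 4 X 6
print(len(pooled), len(pooled[0]))
```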

Fig- Max Pooling with padding to retain the image size

Another example showing Average Pooling with padding and stride equal to 1
Fig - Padding with Average Pooling

✌ Image Size Retained!!! 😅

In the next post, I will deal with fully connected layer, dropout, and dense layer.



Convolutional Neural Networks I


Every time I imagine CNN something spills out from my brain and forces me to restart my learning. I guess it was because I wasn't doing practicals on CNN. Many guys basically look on CNN as a theory and that is where even I lost my way of learning. However, Coursera, Edx, and Udacity helped me to amplify my knowledge about CNN and those big words like pooling, strides etc. I am not a genius in CNN but yes I know something about CNN.


So What is CNN?

Convolution Neural Network is a branch of AI where features from images are gathered up and compared with the input data. It is basically a voting system where every pixel votes for the outcome and as usual the one with maximum votes win in this game and we get a result like this


So how does this happen is what comes in my mind first.

A CNN takes an image as input and converts it into arrays. Yes, this is what those numpy arrays are for! Do remember that a CNN never matches the whole image. Instead it matches small features against parts of the input image.
So let us pass an image of a number '3' to our CNN. The CNN will look at the image like this


The human eye can clearly see the digit 3 being displayed. However, it's not so easy for the machine to see the image. Now, to extract the details, the CNN will multiply a weight matrix with the above matrix that represents the number 3. After multiplying by the weight matrix the result will be like this

Fig - Reference Image of 3 to classify


Comparing both the images side by side you will recognize that the following image represents more features and detail than the previous image

 Every cell is multiplied with a weight which enhances the features of the image for better recognition. Now in my post on Image Classification, I used the syntax
 model.add(Conv2D(32, (3, 3), input_shape = (width, height, 3)))

Here we have 32 filters of size 3 X 3. 

 *** To clear out the confusion I will say it again. The CNN has been trained on a variety of images of 3. It now knows the features so what CNN does is that it will grab a filter with "trained" details and will match "that with the validation data" i.e the above image of '3' is an image that we want to check. The network hasn't seen this image before. The filter contains the cell data from the trained image. ***

So here I have taken a 3 X 3 filter that knows the feature of '3'. It re-marks the cells by a rule: mark as 1 every cell whose value is greater than 128, and as 0 every cell whose value is 128 or less. Thus our filter for the top left corner will look like this


Now, this filter will move across every part of the reference image ('3') matrix to match and find the feature pixel by pixel.
So how it's done?
Each cell content from the filter will be multiplied with the similar cell in the image that we want to classify. 

After multiplication and addition, we get something like this


 Moving this filter at every step across the reference image we get the following information 




Voila !. The filter matches itself with every part of the reference image and outputs the probability of value in each cell. Do notice that the fourth column of reference 3 image has been neglected by the filter in the fourth column of the filtered image. It is because our filter is 3 X 3 thus square in shape and every 3 X 3 matrix in the reference image has to match this filter whereas the fourth column doesn't. Similarly, we place 32 filters across the reference image and extract out the features. Every filter will be from a trained image and it will move across the reference image and compare each pixel. The pixel which passes through the filter gets a higher probability thus, it gets darker. On the contrary, the pixel which cannot pass the filter is lightened. 
We can clearly see how the red circled pixels match the reference image and also the filter. 


After applying 3 more filters you will get this as the output
This methodology of stacking various filters containing a bunch of features in them, over an image is called a Convolution Layer. Thus each image is a stack of various filtered images. Moving the filter across the whole image, we get the information about the location of the pixels. 

Coming to the input shape of the image, the input shape is represented in Keras as (3, width, height) or by (width, height, 3). So to find the width just calculate the no. of pixels horizontally of the reference 3 image. You will get 4. Similarly calculate the no. of pixels vertically. You will get 24. Now each pixel is represented in form of RGB. Thus, it has 3 channels. So our input shape for the CNN will be (3, 4, 24) or (4, 24, 3). 

I will deal with Max Pooling, Padding and Strides in the next post.

Cheers. 🙂



Deep Reinforcement Learning -Write an AI to play Pong with Q learning



In this post, we will implement Q learning to play Pong.
By the end of this post, you will be able to


  1. Design your own game in Python Pygame library. 
  2. Learn the basics of Q learning
  3. Implement an efficient Policy for the agent



Important
To follow this tutorial it is highly recommended to have even a little bit of experience in

  1. Python
  2. Backpropagation 
  3. Linear algebra 
  4. Matrices. 

If you know the basics of these then we can move on.

I am using Python 3.5 and the software I am using for the coding part is Sublime Text 3 but you can even use the default Python IDLE editor

Before starting we need to install the pygame library. To do that just open the Python folder where it is installed then go to the scripts folder, and open command prompt from that location.

Now type this below

  pip install pygame 

Let it download first then type

  pip install numpy 

Let's go to the problem solving

The pong game basically has a rectangular bar with which we will have to bounce the ball everytime it tries to hit. If it misses then the reward will be -1 else + 1


from pygame.locals import *
This imports all the constants (key codes, event types and so on) from pygame.locals

import numpy as np
This imports the numpy library and renames it to 'np' for easy coding.

import pygame as pg
This imports the pygame library and renames it to 'pg' for easy coding.

import random
This imports the random library in order to generate some random numbers.

import time
This imports the time library which I will use here to calculate the time taken to learn from experience.



start = time.time() 
The variable 'start' is storing the initial time at which the script was loaded.

FPS = xxx
A high value of FPS will make the game faster and a low value will make the game slower in terms of frames. Having a high FPS will make your agent learn in less time in case you lack patience ;)

fpsClock = pg.time.Clock()
It creates an object which keeps an eye on the time of the system.

pg.init() 
This initializes the pygame module

window = pg.display.set_mode((800,600))
It will create a window container with width 800 pixels and height 600 pixels. Change according to your desire.



pg.display.set_caption('Q learning Example')
It will display 'Q learning Example' on the title bar

Left = 400
The x co-ordinate of the left edge of the rectangular bar

Top = 570
The y co-ordinate of the top edge of the rectangular bar

Width = 100 
Width of the rectangular bar

Height = 20
Height of the rectangular bar

LR = 0.01
Y = 0.99
Learning Rate and Gamma

Black, White, Green
RGB values of black white and green colour

rct = pg.Rect(Left, Top, Width, Height)
It creates a rectangle object from the pygame library and stores the coordinates as specified by the left, top, width and height.


storage = {}
It will store the value of each state.

action = 2
It defines the action of the agent. 2 stands for right 1 stand for left and 0 stands for rest

jumpY = 6
jumpX = 8
Number of pixels the agent will jump to the horizontal x-axis and according to the vertical y-axis

Q = np.zeros([25000, 3])
This creates a numpy array with 25000 rows and 3 columns. Each of the three columns define the action and each of the row defines the state. Each column stores the maximum Q value respective to the action according to the state

cenX = 10
cenY = 50
radius = 10
score = 0
missed = 0
reward = 0
CenX and CenY will store the coordinates of the centre of the circle. Radius for radius of the circle and rest is for the score, reward and the number of times the rectangular bar has missed the ball as 'missed'.


The calculate_store function will calculate the reward. It returns 1 if the ball lands on the rectangular bar, or -1 if the rectangular bar fails to deflect it. Whenever the rectangular bar misses the ball, the game will regenerate the ball at a random location, and the x-axis value of that random location is determined by the newXforCircle function.



The class state stores the location of the rectangular bar. It consists of the general information about the bar's coordinates and also the coordinates of the circle. The class Circle stores the coordinates of the centre of the circle.


The convert function will convert the state into a number and this number will be stored as the index in the numpy array Q among the 25000 rows. The max function returns the index of the maximum value present in that storage.
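One way such a state-to-number mapping could work is sketched below. This is not the original convert function (that code is not shown here); it is a hypothetical version that hands out the next free row number the first time a state is seen. The tuple layout (bar left, circle x, circle y) is my own assumption.

```python
storage = {}  # maps a state description to its row index in the Q table

def convert(state_key):
    """Return the Q table row for this state, assigning a new row on first sight."""
    if state_key not in storage:
        storage[state_key] = len(storage)
    return storage[state_key]

# Hypothetical states described as (bar left, circle x, circle y).
first = convert((400, 10, 50))
second = convert((500, 18, 56))
again = convert((400, 10, 50))   # same state, so same row as `first`
print(first, second, again)
```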


The action function returns the index that contains the maximum value of a particular action (0, 1, 2) for the agent. The argmax function will return the indices of the maximum values along a certain axis. The afteraction function takes in the current state and the action that has been taken on that state and returns the next state. For example, if the rectangle's coordinate is 200 on the x-axis and the action is 2 to move right, then in the next state it will be 200 + 100 which is 300.
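A tiny sketch of picking the best action from one row of the Q table, written without numpy so it runs anywhere. The function name and the sample Q values are hypothetical; like argmax, it returns the first index when several actions tie.

```python
def best_action(q_row):
    """Index of the largest Q value in the row (0 = rest, 1 = left, 2 = right)."""
    return max(range(len(q_row)), key=lambda a: q_row[a])

# Hypothetical Q values for one state.
chosen = best_action([0.1, 0.5, 0.2])
untrained = best_action([0.0, 0.0, 0.0])  # all equal, so the first index wins
print(chosen, untrained)
```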



The newRect function will return a new rectangle with updated coordinates based on the current action taken. If the rectangle is at the edge of the right border of the window (800) then it will return the original rectangle else it will return an updated rectangle that has moved 100 pixels to the right. Similarly, if the rectangle is at the edge of the left border of the window (0) then it will return the original rectangle or else it will return an updated rectangle
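The edge checks can be sketched without pygame by tracking only the left coordinate of the bar. The constant names and the helper below are my own assumptions, with the window width, bar width and 100 pixel step taken from the values in this post.

```python
WINDOW_WIDTH = 800  # right border of the window
BAR_WIDTH = 100     # width of the rectangular bar
STEP = 100          # pixels moved per action

def new_left(left, action):
    """Return the bar's new left coordinate; stay put if the move would leave the window."""
    if action == 2 and left + BAR_WIDTH + STEP <= WINDOW_WIDTH:
        return left + STEP   # move right
    if action == 1 and left - STEP >= 0:
        return left - STEP   # move left
    return left              # rest, or blocked by a border

print(new_left(200, 2))
print(new_left(700, 2))
```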

Quite Simple isn't it? :)

Now coming to the training and the infinite loop part. Hold your horses for it's a bit long.



#The for loop at line 2 must be present, whenever you are making a game using Python
#library np.savetxt(), which saves the Q values matrix. COLL stores the random
#RGB values of the ball which will change whenever the ball will strike the
#rectangular bar.
# window.fill() fills the entire window with a certain RGB colour value.
# The if-else block describes the action that will be taken whenever the ball hits
# any of the edges. It includes the top, bottom, left side (0 pixels) and right
# side (800 pixels). It basically defines the behaviour of the ball, i.e. how it
# should jump and in which direction, by updating the values of the
# rectangle and the circle through the respective functions.
# The Q function is the engine that is working here. It is the most important
# part that one must cover during Q learning. The Q learning update
# follows the Bellman equation.

#It States


Q(s, a) = Q(s, a) + lr*[R + y*max(Q(s', a')) - Q(s, a)]


# where Q(s, a) is the Q value of the current state s and action a
# lr is the learning rate
# y is the gamma (discount factor)
# R is the immediate reward of that action
# s' and a' represent the next state and its action
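The update above can be sketched on a tiny Q table in plain Python. The 4 row table and the sample transition are made up for illustration; LR and GAMMA use the 0.01 and 0.99 values from earlier in this post, and the function name is hypothetical.

```python
LR = 0.01     # learning rate
GAMMA = 0.99  # discount factor (the post calls it Y)

def q_update(q_table, state, action, reward, next_state):
    """Q(s, a) = Q(s, a) + lr * [R + y * max(Q(s', a')) - Q(s, a)]"""
    best_next = max(q_table[next_state])
    q_table[state][action] += LR * (reward + GAMMA * best_next - q_table[state][action])

# Hypothetical table: 4 states, 3 actions each, all starting at zero.
q_table = [[0.0, 0.0, 0.0] for _ in range(4)]

# The agent was in state 0, moved right (action 2), got reward +1, landed in state 1.
q_update(q_table, state=0, action=2, reward=1, next_state=1)
print(q_table[0])
```

After this single update only Q(0, 2) has moved, by lr times the reward, since everything else in the table was still zero.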

   
Take an example where the rectangle coordinates are


Left = 400 Top = 400 Height = 30 Width = 100

This will be stored in the state class in the self.rect variable. Similarly, the centre coordinates of the circle will be stored in the self.circle variable in the class state. Then this state is converted into a number, i.e. each state is assigned a number. This number is the index in the Q table. Hence whenever the agent faces a state which is already in the Q table, it will calculate the argmax of that row and return the index with the maximum Q value. The action (Q table column) having the maximum value tells the agent about the reward it has received so far in that state by taking that action. So it is pretty easy to understand that the maximum value reflects the maximum reward with that action.

For the full code click here

Cheers,
Eva :)