Multiclass Perceptron Implementation

Implement a Multiclass Perceptron

In this post, I will explain how a multiclass perceptron works. A perceptron uses a unit step function as its activation function, so its output is always either 0 or 1: if the computed value is greater than or equal to 0 we set the outcome to 1, else 0. This is enough when we have to classify between two labels. But how do we classify more than 2 labels using a perceptron?
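To make the activation concrete, here is a minimal sketch of a single perceptron output. It uses the >= 0 threshold from the pseudo-code later in this post. The names unit_step and perceptron_output, and the weights, bias, and sample values, are made up for illustration and are not taken from my actual code.

```python
def unit_step(net_input):
    """Unit step activation: 1 when the net input is >= 0, otherwise 0."""
    return 1 if net_input >= 0 else 0


def perceptron_output(weights, bias, features):
    """Weighted sum of the features plus the bias, passed through the step."""
    net_input = sum(w * x for w, x in zip(weights, features)) + bias
    return unit_step(net_input)


# Hypothetical weights, bias, and sample, chosen only for illustration.
weights = [0.5, -0.25, 0.25, 0.0]
bias = -0.5
sample = [0.1, 0.2, 0.3, 0.4]

print(perceptron_output(weights, bias, sample))  # prints 0 (net input is negative)
```

Whatever the inputs are, the result can only be 0 or 1, which is why a single perceptron can separate only two labels.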

So in this post, I'll deal with the Iris flower data set. It has 3 flowers, namely Iris-Setosa, Iris-Versicolor, and Iris-Virginica. The tricky part starts now. We have 3 classes and we have to train a "Perceptron" to classify among the three flowers. So how do we go ahead?

In order to classify between multiple classes, we train a perceptron on two classes at a time and then repeat the same procedure for the other pairs of classes. With the three Iris flowers, this gives us three pairs to train on. Let's begin.

The pseudo-code will be:

1. Train a Perceptron to classify Iris-Setosa and Iris-Versicolor
2. Train a Perceptron to classify Iris-Versicolor and Iris-Virginica
3. Train a Perceptron to classify Iris-Virginica and Iris-Setosa
4. Choose the maximum among the three outputs and boom, that should be your prediction among the 3 classes.
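The pairing in steps 1 to 3 can be sketched as below. This is only an illustration: class_pairs is a hypothetical helper, and the convention that the first flower of a pair gets label 0 and the second gets label 1 follows the encoding described later in this post.

```python
FLOWERS = ["Iris-Setosa", "Iris-Versicolor", "Iris-Virginica"]


def class_pairs(classes):
    """Pair each class with the next one, wrapping around at the end."""
    return [(classes[i], classes[(i + 1) % len(classes)])
            for i in range(len(classes))]


for number, (class_zero, class_one) in enumerate(class_pairs(FLOWERS), start=1):
    print(f"Perceptron {number}: 0 = {class_zero}, 1 = {class_one}")
```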

We will need two layers for a simple classification: an input layer and an output layer. The input layer will have 4 nodes, one each for petal length, petal width, sepal length, and sepal width. The output will be one flower.

The input that we have here is in the form stated below:
0.1, 0.2, 0.3, 0.4, Iris-Setosa
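A row in this form can be split into its four features and its label with a few lines of plain Python. This is a rough sketch; parse_row is a hypothetical helper name and not part of my original code.

```python
def parse_row(line):
    """Split one comma-separated row into four float features and a label."""
    parts = [part.strip() for part in line.split(",")]
    features = [float(value) for value in parts[:4]]
    label = parts[4]
    return features, label


features, label = parse_row("0.1, 0.2, 0.3, 0.4, Iris-Setosa")
print(features, label)
```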

Let us begin with the code part:

I am using NumPy on Python 3.7. You can use Pandas too. No restrictions here. Initially, we import the numpy module.

If you have it installed, you are good to go. Otherwise, launch Command Prompt and type:

cd "Your Path where Python is installed without quotes" 

Then type
pip install numpy

I hope everything goes successfully. Once the import works, we will create a Perceptron class with methods like train, which trains the network, and test, which measures the accuracy of the current weights on a data set. The constructor will declare the variables required for the code execution. I am coding on the basis of the Pocket Algorithm with the ratchet. I am assuming that you all know about the algorithm; if not, I will create a new post regarding the Perceptron.

Part A: The Perceptron Class

What is __init__?
__init__ is not a reserved keyword but a special method name. It acts as the constructor: Python calls it whenever an object of the class is instantiated.

What is self in Python?
self is not a reserved keyword either; it is the conventional name for the first parameter of a method. It represents the instance of the class and is used to access the attributes and variables within the various methods inside the class.
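Here is a tiny sketch of how __init__ and self behave in practice. The Flower class is a made-up example and has nothing to do with the perceptron code itself.

```python
class Flower:
    def __init__(self, name):
        # Runs automatically when Flower(...) is called.
        self.name = name  # stored on this particular instance

    def describe(self):
        # self gives the method access to the instance's attributes.
        return "This flower is " + self.name


setosa = Flower("Iris-Setosa")
print(setosa.describe())  # prints: This flower is Iris-Setosa
```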

We have 6 self variables here:

1. self.vector will store the input features as a numpy array.
2. self.weights will store the weights for our input features.
3. self.label will store the labels for our flower data set.
4. self.pocket_weight will store the best weight found so far for our network.
5. self.learning will store the learning rate.
6. self.bias will store the initial bias value and will also store the updated value of the bias.
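A constructor holding these six variables could look roughly like the sketch below. The attribute names come from the list above, but the parameter order and the default values for the learning rate and the bias are my assumptions for this illustration, so treat it as a sketch rather than the exact class from the download.

```python
import numpy as np


class Perceptron:
    def __init__(self, vector, weights, label, learning=0.01, bias=0.0):
        # Input features, one row per flower sample.
        self.vector = np.asarray(vector, dtype=np.float32)
        # Current weights, one per input feature.
        self.weights = np.asarray(weights, dtype=np.float32)
        # Target labels (0 or 1) for the two flowers being separated.
        self.label = np.asarray(label)
        # Best weights seen so far; starts as a copy of the initial weights.
        self.pocket_weight = self.weights.copy()
        # Learning rate and bias (default values here are assumptions).
        self.learning = learning
        self.bias = bias


percep = Perceptron(vector=[[0.1, 0.2, 0.3, 0.4]], weights=np.zeros(4), label=[0])
print(percep.vector.shape, percep.pocket_weight)
```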

Now we come to the train method, in which we train our network. Initially, I went with 250 epochs, which gave me 93.33 % accuracy on a small dataset. We will use the Pocket Algorithm with a ratchet for this classification. In every epoch, we test the current weight on the training data and note the accuracy. If the accuracy returned by the test method is greater than the "pocket" accuracy, we store the current weight vector in our "pocket". If it is not greater, we do not store that weight in the pocket and simply move on with training and the weight update.

The pseudo-code for our Pocket Algorithm Training will be like:

Initialize Weight Vector as 0
Initialize CurrentAccuracy as 0
Loop in Epochs:
     Test the Training data with the current weight.
     If accuracy with current weight vector is greater than CurrentAccuracy:
           Store the weight vector in the pocket
           Set CurrentAccuracy to this accuracy
     Update the Weight Vector, for each training sample x:
           Output = w.x + bias
           If Output >= 0:
                 Output = 1
           Else:
                 Output = 0
           Weight = Weight + Learning_rate*(target-output)*x
           Bias = Bias + Learning_rate*(target-output)

The bias is trained in the same way as the weights, except that its update is not multiplied by the input vector.
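The training loop above can be sketched in runnable form as follows. This is a simplified version under a few assumptions: the function names predict, accuracy, and train_pocket are hypothetical, I keep the bias in the pocket together with the weights (a choice made for this sketch), and the toy data is a made-up, linearly separable set rather than the Iris data.

```python
import numpy as np


def predict(weights, bias, x):
    """Unit step on w.x + bias."""
    return 1 if np.dot(weights, x) + bias >= 0 else 0


def accuracy(weights, bias, vectors, labels):
    """Fraction of samples classified correctly."""
    hits = sum(predict(weights, bias, x) == t for x, t in zip(vectors, labels))
    return hits / len(labels)


def train_pocket(vectors, labels, learning=0.5, epochs=250):
    weights = np.zeros(vectors.shape[1])
    bias = 0.0
    pocket_weight, pocket_bias, pocket_accuracy = weights.copy(), bias, 0.0
    for _ in range(epochs):
        # Ratchet: replace the pocket only when accuracy strictly improves.
        current = accuracy(weights, bias, vectors, labels)
        if current > pocket_accuracy:
            pocket_weight, pocket_bias, pocket_accuracy = weights.copy(), bias, current
        # Standard perceptron update for every training sample.
        for x, target in zip(vectors, labels):
            error = target - predict(weights, bias, x)
            weights = weights + learning * error * x
            bias = bias + learning * error
    return pocket_weight, pocket_bias, pocket_accuracy


# Made-up, linearly separable toy data (not the Iris data).
toy_vectors = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
toy_labels = np.array([0, 0, 0, 1])

best_weight, best_bias, best_accuracy = train_pocket(toy_vectors, toy_labels, epochs=20)
print(best_weight, best_bias, best_accuracy)
```

Because the pocket only changes when the accuracy improves, a bad update late in training can never make the returned weights worse than the best ones already seen.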

The pocket_weight method will return the best weight, the one which gave us the highest accuracy. The bias method will return the final trained bias. Since the input file stores the input data of all flowers, we will have to separate the flowers as explained at the start of the post. We will create 3 numpy arrays. Array x1 will store the input data for Iris-Setosa and Iris-Versicolor. Array x2 will store the input data for Iris-Versicolor and Iris-Virginica. Similarly, array x3 will store the input data for Iris-Virginica and Iris-Setosa. The code in the snapshot below performs this separation of our data set for individual training.

Fig. Separation on Input Data Set for individual training

We will also separate the labels likewise. label_1 will be for training set x1, label_2 for training set x2, and so on. I have initialized the weight vector for each set as 0. You can initialize the weight vector with random numbers as well. It's entirely up to you, but always keep the values low, else oscillations will take place. The datatype for the numpy arrays is float32.
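One way to do this separation with NumPy boolean masks is sketched below. The helper select_pair and the six feature rows are made up for illustration, and the code in my snapshot may be organized differently. The names x1 to x3 and label_1 to label_3 follow the text above.

```python
import numpy as np

# Made-up feature rows (two per flower), used only to show the mechanics.
features = np.array([
    [0.1, 0.2, 0.3, 0.4],
    [0.2, 0.1, 0.4, 0.3],
    [0.5, 0.6, 0.7, 0.8],
    [0.6, 0.5, 0.8, 0.7],
    [0.9, 1.0, 1.1, 1.2],
    [1.0, 0.9, 1.2, 1.1],
], dtype=np.float32)
names = np.array([
    "Iris-Setosa", "Iris-Setosa",
    "Iris-Versicolor", "Iris-Versicolor",
    "Iris-Virginica", "Iris-Virginica",
])


def select_pair(features, names, class_zero, class_one):
    """Keep only the rows of two flowers; class_zero -> 0, class_one -> 1."""
    mask = (names == class_zero) | (names == class_one)
    labels = (names[mask] == class_one).astype(np.float32)
    return features[mask], labels


x1, label_1 = select_pair(features, names, "Iris-Setosa", "Iris-Versicolor")
x2, label_2 = select_pair(features, names, "Iris-Versicolor", "Iris-Virginica")
x3, label_3 = select_pair(features, names, "Iris-Virginica", "Iris-Setosa")

# One zero-initialized float32 weight vector per training set.
weights_1 = np.zeros(x1.shape[1], dtype=np.float32)
weights_2 = np.zeros(x2.shape[1], dtype=np.float32)
weights_3 = np.zeros(x3.shape[1], dtype=np.float32)

print(x3.shape, label_3)
```

Note that the rows keep their file order, so in x3 the Setosa rows (label 1) come before the Virginica rows (label 0).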

I have also written code to build a confusion matrix for the test data as well as the training data. The variables are as follows:

TP: True Positive
TN: True Negative
FP: False Positive
FN: False Negative

Here we have 3 Perceptron objects for the 3 classifications, namely percep_1, percep_2, and percep_3. percep_1.pocket_weight returns the stored pocket weight which showed the best accuracy on the training data of set 1. Similarly, we do this for the other sets as well. As per the above screenshot, we evaluate each test sample with all 3 sets at once, each with its own pocket weight, so the dot product of the test data and the pocket weight gives us one value per set: f1, f2, and f3. We then select the maximum among the three. Remember, each test sample is validated within every set. We won't categorize the test data like we did for the training data.

The argmax function will return the index of the maximum value in the numpy array. Now the question is how to tell which index stands for which flower.

F1 is the network trained on Iris-Setosa and Iris-Versicolor, with 0 for Setosa and 1 for Versicolor.
F2 is the network trained on Iris-Versicolor and Iris-Virginica, with 0 for Versicolor and 1 for Virginica.
F3 is the network trained on Iris-Virginica and Iris-Setosa, with 0 for Virginica and 1 for Setosa.

Now, F1 reaches its maximum, 1, for Versicolor.
Similarly, F2 reaches its maximum, 1, for Virginica.
Similarly, F3 reaches its maximum, 1, for Setosa.

Now, the numpy array is arranged in the order F1, F2, F3. argmax will choose the maximum among these, so index 0 means Versicolor, index 1 means Virginica, and index 2 means Setosa. Similarly, argmin will return 0 for Setosa, 1 for Versicolor, and 2 for Virginica.
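The index-to-flower mapping can be sketched as below. The pocket weights, biases, and the test sample are made-up numbers, predict_flower is a hypothetical helper, and adding the bias to the dot product is my assumption for this sketch.

```python
import numpy as np

# Index i of [f1, f2, f3] maps to the flower labeled 1 in set i + 1.
ARGMAX_TO_FLOWER = ["Iris-Versicolor", "Iris-Virginica", "Iris-Setosa"]

# Hypothetical pocket weights and biases for the three sets.
pocket_weights = [
    np.array([1.0, 0.0, 0.0, 0.0]),
    np.array([0.0, 0.0, -1.0, 0.0]),
    np.array([0.0, 0.0, 0.0, 2.0]),
]
biases = [0.0, 0.0, -0.5]


def predict_flower(sample):
    """Score the sample with every set and pick the flower with the top score."""
    scores = np.array([np.dot(w, sample) + b
                       for w, b in zip(pocket_weights, biases)])
    return ARGMAX_TO_FLOWER[int(np.argmax(scores))]


print(predict_flower(np.array([0.1, 0.2, 0.3, 0.4])))  # prints: Iris-Setosa
```

Here f3 has the largest value, so argmax returns index 2, which stands for Setosa.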

Coming to the confusion matrix: if the returned value and the labeled value are the same, we increment TP and TN by 1; otherwise we increment FP and FN by 1. False Positives lie on the vertical axis of the confusion matrix and False Negatives lie on the horizontal axis.
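For readers who prefer code to pictures, here is a small sketch of a 3-class confusion matrix with per-class TP, TN, FP, and FN counts, plus precision and recall. In this sketch the rows are the actual flowers and the columns are the predicted flowers, which is an assumption and may be laid out differently from my image. The function names and the sample predictions are made up.

```python
CLASSES = ["Iris-Setosa", "Iris-Versicolor", "Iris-Virginica"]


def confusion_matrix(actual, predicted, classes=CLASSES):
    """matrix[i][j] counts samples of actual class i predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(actual, predicted):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def per_class_counts(matrix, k):
    """Return (TP, TN, FP, FN) for class k, treating it as the positive class."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # class k predicted as another class
    fp = sum(row[k] for row in matrix) - tp  # another class predicted as class k
    tn = total - tp - fn - fp
    return tp, tn, fp, fn


def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up predictions, only to exercise the functions.
actual = ["Iris-Setosa", "Iris-Setosa", "Iris-Versicolor",
          "Iris-Versicolor", "Iris-Virginica", "Iris-Virginica"]
predicted = ["Iris-Setosa", "Iris-Setosa", "Iris-Versicolor",
             "Iris-Virginica", "Iris-Virginica", "Iris-Virginica"]

matrix = confusion_matrix(actual, predicted)
tp, tn, fp, fn = per_class_counts(matrix, CLASSES.index("Iris-Virginica"))
print(matrix, (tp, tn, fp, fn), precision_recall(tp, fp, fn))
```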

Confused? Have a look at the image below for better understanding.

To get the code for the above implementation, click the Download Code link below.

Download Code
Here is the confusion matrix that I got for my test data.
Precision and Recall values are also present.

The output from the above code is:

You can get the training and test data from the gist below.
Feel free to ask if anyone wants some help with the code.

