Detecting Objects using Machine Learning I
Detecting Objects using YOLO
This post deals with my small project on YOLO. It is a great project which, if linked with an Arduino, will certainly make you win Google Science Fair. Pardon 😁
YOLO does not just tell you what an object is, it also localizes it, i.e. tells you where it is in the image. If you have lost your specs, maybe this will help you find them.
So YOLO stands for "You Only Look Once". Yes, YOLO looks at the image only once. It works by dividing the image into a grid of K x K cells,
a bit like this:
Fig 1 - Image divided into cells
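To make the grid idea concrete, here is a minimal sketch of how a pixel maps to a cell. The function names, the 448 x 448 image size and K = 7 are my own assumptions for illustration; they are not taken from darkflow.

```python
def cell_for_point(x, y, img_w, img_h, k):
    """Return (row, col) of the grid cell that contains pixel (x, y)."""
    # Clamp to k - 1 so that points on the right/bottom edge stay inside the grid.
    col = min(int(x * k / img_w), k - 1)
    row = min(int(y * k / img_h), k - 1)
    return row, col


def cell_bounds(row, col, img_w, img_h, k):
    """Return (left, top, right, bottom) pixel bounds of one cell."""
    cell_w = img_w / k
    cell_h = img_h / k
    return (col * cell_w, row * cell_h, (col + 1) * cell_w, (row + 1) * cell_h)


# Hypothetical 448 x 448 image split into a 7 x 7 grid.
print(cell_for_point(100, 300, 448, 448, 7))   # (4, 1)
print(cell_bounds(4, 1, 448, 448, 7))          # (64.0, 256.0, 128.0, 320.0)
```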
Before working on YOLO, have a look at its output. This is what I got when I ran an edited version of YOLO over the above image.
Fig 2 - YOLO Output using classification.
Each of these yellow boxes is called a bounding box in YOLO language. Each cell in Fig 1 generates bounding boxes. A CNN is run over the image to extract features from it, and each cell is treated individually. If the features in a cell are significant, a bounding box is drawn over that portion of the cell along with its bounding information, or confidence score.
Bounding information is drawn as the thickness of the bounding box. More significant items get a thicker box; less significant items get a thinner one. When the cells are merged, all bounding boxes with approximately the same bounding information are combined into a bigger box called a bounding box group. However, this box does not classify any object yet. It just provides the significance score. This process continues and the result is Fig 2, or it can even be Fig 3 below.
Back to work now.
If you are trying to picture the boundary score, refer to the image below.
Fig 4 - Confidence score, a.k.a. boundary information
The bounding boxes with a higher score are used for classification. So first YOLO finds out whether a bounding box contains an object at all, and second, it predicts the class of whatever is inside the box. The VOC-trained model used in this post can detect 20 different classes of objects. Some of them are dogs, people, cars, bicycles etc.
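The two steps above (keep the confident boxes, then pick a class for each) can be sketched roughly like this. The dictionary layout, the 0.5 threshold and the three class names are assumptions I made up for the example; darkflow stores its boxes differently.

```python
def classify_boxes(boxes, class_names, threshold=0.5):
    """Keep boxes whose confidence passes the threshold and label each one."""
    results = []
    for box in boxes:
        if box["confidence"] < threshold:
            continue  # step 1: not confident that an object is present
        probs = box["class_probs"]
        # step 2: pick the most likely class for this box
        best = max(range(len(probs)), key=lambda i: probs[i])
        results.append({
            "label": class_names[best],
            # class-specific score = box confidence * class probability
            "score": box["confidence"] * probs[best],
        })
    return results


# Hypothetical predictions for two boxes and three classes.
boxes = [
    {"confidence": 0.9, "class_probs": [0.1, 0.8, 0.1]},
    {"confidence": 0.2, "class_probs": [0.9, 0.05, 0.05]},
]
print(classify_boxes(boxes, ["dog", "person", "car"]))
```

Only the first box survives here, labelled "person", because the second one falls below the threshold.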
Now YOLO combines the classification results and marks the bounding box group which contains the complete object. After this, only the boxes with the highest score are kept, i.e. those that contain the full object or at least more than 80% of it. The remaining insignificant boxes are removed.
The result is in Fig 2.
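Removing the overlapping, lower-scoring boxes is usually done with non-maximum suppression, and I am assuming that is what happens in this step. Below is a plain-Python sketch of that standard procedure, not darkflow's own code. The (x1, y1, x2, y2) box format and the 0.5 overlap threshold are illustrative choices.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """detections: list of (box, score). Keep the best box of each overlapping group."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # highest-scoring box still in play
        kept.append(best)
        # Drop every box that overlaps the kept one too much.
        remaining = [d for d in remaining if iou(best[0], d[0]) < iou_threshold]
    return kept


# Two boxes on the same object and one box somewhere else.
detections = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
print(non_max_suppression(detections))
```

The 0.8 box overlaps the 0.9 box heavily, so it is dropped and the other two are kept.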
Every bounding box has 5 parameters, namely x, y, w, h and its score. x and y are the center of the bounding box within the cell. w and h are the width and height of the bounding box respectively. So if we feed an image into TensorFlow, we get an output tensor of shape
K x K x (B * 5 + C), i.e. K * K * (B * 5 + C) values. K is the number of cells along each side of the grid, B is the number of bounding boxes predicted per cell and C is the total number of classes.
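As a quick sanity check of that formula, here is the arithmetic in code. K = 7, B = 2 and C = 20 are the values from the original YOLO paper and are only an assumption here; the model I actually ran may use different ones.

```python
def output_shape(k, b, c):
    """Shape of the prediction tensor: one (B * 5 + C) vector per grid cell."""
    return (k, k, b * 5 + c)


def output_size(k, b, c):
    """Total number of predicted values: K * K * (B * 5 + C)."""
    return k * k * (b * 5 + c)


print(output_shape(7, 2, 20))  # (7, 7, 30)
print(output_size(7, 2, 20))   # 1470
```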
To get started on our own we need
- A long nap
- Microsoft Visual Studio 2015 Click here to download or here.
- OpenCV 3.0
- CUDA 8.0 Click here to download
If you already have the 2nd item, I guess you can skip the 1st item. Otherwise, put the 2nd item on download and follow the 1st item.
Install Visual Studio in "custom" mode. Then select Visual C++ under Programming Languages, and also Common Tools.
Then clone the following repository.
Extract the folder into the default Python folder of your OS. My extracted folder was named "Darkflow-masters". Do check yours. It may differ.
Now open Command Prompt as admin and cd to your Scripts folder, which lies inside the Python folder. My Python folder's location is E:\PyPy, so that is where I went. The screenshot below shows how to reach your Scripts folder. Once there, install the dependencies:
pip install cython
pip install tensorflow
After the above process, cd to the darkflow folder you extracted a few moments ago and build it:
python setup.py build_ext --inplace
Since my python.exe was not in the same location as setup.py (which lies inside the darkflow folder), I typed the following instead. (Yours will differ)
E:\PyPy\python setup.py build_ext --inplace
Somewhat like this
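Before running anything, it may be worth checking that Python can actually see the packages installed so far. This is a small helper of my own, not part of darkflow. Cython and cv2 are the import names of the cython and opencv-python packages.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means everything needed so far is installed.
print(missing_modules(["Cython", "tensorflow", "cv2"]))
```

Run it with the same python.exe you use for setup.py, otherwise you may be checking a different installation.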
Now to run YOLO we have the syntax
A. To run on CPU (Yeah it sucks!)
python flow --imgdir (sample image directory) --model (path to cfg file) --load (path to weights file)
B. To run on GPU (Life is sweet. More sweet if GPU is Nvidia)
python flow --imgdir (sample image directory) --model (path to cfg file) --load (path to weights file) --gpu 1.0
This is the command I used on my computer. Of course, yours will differ.
E:\PyPy\python flow --imgdir sample_img/ --model xx/tiny-yolo-voc.cfg --load xx/tiny-yolo-voc.weights --gpu 1.0
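If you would rather launch this from a script than retype it, the command can be assembled like so. build_flow_command is a hypothetical helper of mine; it only uses the flags shown above and adds --gpu when you ask for it.

```python
def build_flow_command(imgdir, model, load, gpu=None, python_exe="python"):
    """Assemble the darkflow command line as a list of arguments."""
    cmd = [python_exe, "flow", "--imgdir", imgdir, "--model", model, "--load", load]
    if gpu is not None:
        # Leave gpu as None to run on the CPU.
        cmd += ["--gpu", str(gpu)]
    return cmd


cmd = build_flow_command(
    "sample_img/",
    "xx/tiny-yolo-voc.cfg",
    "xx/tiny-yolo-voc.weights",
    gpu=1.0,
    python_exe=r"E:\PyPy\python",
)
print(" ".join(cmd))
# To actually run it from inside the darkflow folder:
#   import subprocess; subprocess.run(cmd, check=True)
```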
Here are my outputs
Error - cl.exe not found
Solution - Re-install Visual Studio in "custom" mode with Visual C++ selected.
Error - AssertionError: Over-read bin/yolo.weights
Solution - You have used the wrong cfg with the wrong weights. Make sure the cfg file and the weights file belong to the same model.
Error - ImportError: No module named 'darkflow.cython_utils.cy_yolo_findboxes'
Solution - Run python setup.py build_ext --inplace inside the darkflow folder, as I have instructed above.
Error - No cv2 module found
Solution - pip install opencv-python
Error - AssertionError: expect xyz bytes, found abc bytes
Solution - You used the wrong cfg with the wrong weights again!
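Since the cfg and weights mismatch bit me more than once, here is a rough pre-flight check. It only compares file names, which is my own heuristic and not something darkflow guarantees: matching names do not prove the files belong together, but different names are a good hint that something is off.

```python
import os


def same_model_name(cfg_path, weights_path):
    """True if the cfg and weights files share a base name (a rough sanity check)."""
    cfg_name = os.path.splitext(os.path.basename(cfg_path))[0]
    weights_name = os.path.splitext(os.path.basename(weights_path))[0]
    return cfg_name == weights_name


print(same_model_name("xx/tiny-yolo-voc.cfg", "xx/tiny-yolo-voc.weights"))  # True
print(same_model_name("xx/tiny-yolo-voc.cfg", "bin/yolo.weights"))          # False
```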