How to Create Object Recognition in Python


Python is one of the most promising languages to bring artificial intelligence to life. In this article, I will try to explain how to create object recognition using Python and ImageAI.

Computer vision is one of the most promising fields of computer science. It is concerned with a computer's ability to recognize and interpret the content of a picture.

This is a crucial area of artificial intelligence that covers several tasks at once: recognizing the content of a picture, identifying an object, and classifying or generating it. Searching for objects in a picture is probably the most important area of computer vision.

Identification of things or living beings in a photo is actively used in the following areas:

Car search;
People recognition system;
Pedestrian search and counting;
Security enhancement;
Creation of unmanned cars, etc.

Today there are many methods for object search, and the choice depends on the target area. Here, as in other areas of IT, a lot depends directly on the programmer: object detection is a great tool for creativity, one that gives the "creation" intelligence of its own. How that intelligence is used depends on the creative thinking of the developer.

Deep learning turned the idea of artificial intelligence upside down. It became the basis for methods such as R-CNN, Fast R-CNN, Faster R-CNN, and RetinaNet, as well as the fast, high-precision methods SSD and YOLO. Applying these deep-learning algorithms directly requires deep knowledge of mathematics and a thorough understanding of the frameworks.

Let's Begin 

We will start with ImageAI, a library written in Python. It makes it easy to integrate modern computer vision capabilities into new or existing programs.

Installing Python

First of all, you must install Python 3. All you need to do is download the installer from the official site and run it.

Installing Dependencies

Now it is time to install the dependencies using pip. The command is simple: pip install followed by the name of the library (the main libraries are listed below). For example:

pip install tensorflow # installs the TensorFlow library

Libraries to add:

NumPy;
SciPy;
OpenCV (opencv-python);
Pillow;
Matplotlib;
h5py;
ImageAI.

You can see all the frameworks and the commands to install them on the official ImageAI documentation site.
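
Before moving on, it can help to confirm that the libraries above are actually importable. The following is a minimal sketch, not part of ImageAI: the find_missing helper and the REQUIRED_MODULES table are hypothetical names introduced here, and the table assumes the usual import names (cv2 for opencv-python, PIL for Pillow).

```python
import importlib.util

# pip package name -> import name. The two can differ
# (opencv-python is imported as cv2, Pillow as PIL).
REQUIRED_MODULES = {
    "tensorflow": "tensorflow",
    "numpy": "numpy",
    "scipy": "scipy",
    "opencv-python": "cv2",
    "pillow": "PIL",
    "matplotlib": "matplotlib",
    "h5py": "h5py",
    "imageai": "imageai",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the pip package names whose import module cannot be found."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All dependencies are importable.")
```

The check only looks the modules up and does not import them, so it stays fast even with TensorFlow installed.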

RetinaNet

Next, download the RetinaNet model file (resnet50_coco_best_v2.0.1.h5, the file that the code below loads). The model does the actual work of identifying objects in images.

Once the dependencies are installed, you can write the first lines of code to detect objects in images. Create a Python file named FirstDetection and insert the code from the next section into it. Then copy the RetinaNet model file and the picture to be processed into the same folder as the Python file.
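
A missing model file or picture is the most common reason the first run fails, so a quick check of the folder can save time. This is only a sketch under the assumption that the file names match the ones used in the code in this article; the missing_files helper is a hypothetical name, not an ImageAI function.

```python
from pathlib import Path

MODEL_FILE = "resnet50_coco_best_v2.0.1.h5"  # RetinaNet model file used in this article
IMAGE_FILE = "objects.jpg"                   # picture to be processed


def missing_files(folder, names=(MODEL_FILE, IMAGE_FILE)):
    """Return the names from `names` that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in names if not (folder / name).is_file()]


if __name__ == "__main__":
    absent = missing_files(Path.cwd())
    if absent:
        print("Copy these files next to the script first:", ", ".join(absent))
    else:
        print("Model and image found, ready to run the detection.")
```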

Testing

Place the following code in the file:

from imageai.Detection import ObjectDetection
import os

# Current working directory; the script, model file, and image are expected here
exec_path = os.getcwd()

detector = ObjectDetection()
detector.setModelTypeAsRetinaNet()
detector.setModelPath(os.path.join(
    exec_path, "resnet50_coco_best_v2.0.1.h5")
)
detector.loadModel()

# Named "detections" rather than "list" to avoid shadowing the Python built-in
detections = detector.detectObjectsFromImage(
    input_image=os.path.join(exec_path, "objects.jpg"),
    output_image_path=os.path.join(exec_path, "new_objects.jpg"),
    minimum_percentage_probability=90,
    display_percentage_probability=True,
    display_object_name=False
)

Now run the code and wait for it to finish. Next, go to the directory where the FirstDetection file is saved. A new picture, new_objects.jpg, should appear there. To better understand what happened, open the original and the new picture side by side.
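
The value returned by detectObjectsFromImage can also be printed to the console. The sketch below assumes that each entry is a dictionary with name, percentage_probability, and box_points keys, as the ImageAI documentation describes. The describe helper is a hypothetical name, and the sample data is written by hand for illustration, not real detector output.

```python
def describe(detections):
    """Turn a list of detection dictionaries into printable lines."""
    lines = []
    for item in detections:
        lines.append(
            f"{item['name']}: {item['percentage_probability']:.1f}% "
            f"at {item['box_points']}"
        )
    return lines


# Hand-written sample in the assumed format (not real detector output)
sample = [
    {"name": "person", "percentage_probability": 97.3, "box_points": [10, 20, 110, 220]},
    {"name": "dog", "percentage_probability": 91.0, "box_points": [150, 80, 260, 200]},
]

for line in describe(sample):
    print(line)
```

In the real script, the list returned by the detector would be passed to describe in place of sample.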

How the Code Works

from imageai.Detection import ObjectDetection
import os

exec_path = os.getcwd()

Line description:

Line 1: import the ObjectDetection class from ImageAI;
Line 2: import the Python os module;
Line 4: create a variable that holds the path to the current working directory, which should contain the Python file, the RetinaNet model, and the image.

detector = ObjectDetection()
detector.setModelTypeAsRetinaNet()
detector.setModelPath(os.path.join(
    exec_path, "resnet50_coco_best_v2.0.1.h5")
)
detector.loadModel()

detections = detector.detectObjectsFromImage(
    input_image=os.path.join(exec_path, "objects.jpg"),
    output_image_path=os.path.join(exec_path, "new_objects.jpg"),
    minimum_percentage_probability=90,
    display_percentage_probability=True,
    display_object_name=False
)

Line description:

Line 1: create an instance of the object detection class;
Line 2: set the model type to RetinaNet;
Line 3: specify the path to the RetinaNet model file;
Line 6: load the model into the detector;
Line 8: call the detection function, passing the paths of the input and output images.

ImageAI supports many different settings for finding objects. For example, you can extract every detected object as a separate image during processing. The detector can then create a separate folder, save each detected object there as its own image, and return an additional array with the paths to those images.

detections, extracted_images = detector.detectObjectsFromImage(
    input_image=os.path.join(exec_path, "objects.jpg"),
    output_image_path=os.path.join(exec_path, "new_objects.jpg"),
    extract_detected_objects=True
)
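
The two returned values can then be walked through together. The sketch below rests on an assumption that should be checked against the ImageAI documentation: that the array of image paths holds one entry per detection, in the same order. The pair_with_crops helper and the sample paths are hypothetical.

```python
def pair_with_crops(detections, extracted_images):
    """Pair each detection's name with the path of its extracted image."""
    if len(detections) != len(extracted_images):
        raise ValueError("expected one extracted image per detection")
    return [(item["name"], path) for item, path in zip(detections, extracted_images)]


# Hand-written sample values (hypothetical names and paths)
sample_detections = [{"name": "car"}, {"name": "person"}]
sample_paths = ["crops/car-1.jpg", "crops/person-2.jpg"]

for name, path in pair_with_crops(sample_detections, sample_paths):
    print(name, "->", path)
```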

Conclusion

To close these deep-learning tips, here is a small sample of the most useful features of ImageAI, because its capabilities go far beyond basic object detection:

Setting the minimum probability threshold: by default, all objects with a probability below 50% are excluded from the results and are not even recorded in the log. You can raise or lower this threshold for particular cases;
Custom detection settings: using the CustomObjects function, you can ask the detector to report only the object types you select;
Detection speed: you can manually reduce the time it takes to scan a photo. There are three modes of operation: fast, faster, and fastest;
Input types: instead of a file path, you can pass a NumPy array or a file stream as the input picture;
Output types: you can set the detectObjectsFromImage function to return pictures as a file or as a NumPy array.
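
The effect of the probability threshold and of restricting detection to selected object types can be pictured as a simple filter over the results. This is an illustration in plain Python, not how ImageAI implements it internally: the filter_detections function is a hypothetical name, and its default of 50 only mirrors the default threshold mentioned above.

```python
def filter_detections(detections, minimum_percentage_probability=50, allowed_names=None):
    """Keep detections at or above the threshold and, optionally, of given types."""
    kept = []
    for item in detections:
        if item["percentage_probability"] < minimum_percentage_probability:
            continue  # below the threshold: dropped from the results
        if allowed_names is not None and item["name"] not in allowed_names:
            continue  # not one of the requested object types
        kept.append(item)
    return kept


# Hand-written sample values (not real detector output)
sample = [
    {"name": "person", "percentage_probability": 97.3},
    {"name": "car", "percentage_probability": 64.0},
    {"name": "dog", "percentage_probability": 42.0},
]

print([d["name"] for d in filter_detections(sample)])
print([d["name"] for d in filter_detections(sample, 90)])
print([d["name"] for d in filter_detections(sample, allowed_names={"car"})])
```

Raising the threshold trades missed objects for fewer false positives, which is why the detection example in this article uses 90 rather than the default.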

Of course, it is unrealistic to cover all of computer vision even in a whole book, but I hope I was able to convey the basic concepts.

Bio: Hannah Butler works as a content writer at a company that provides professional essay writing help for students. She likes sharing her experience in articles on topics such as deep machine learning and coding. In her free time, Hannah enjoys rock climbing and bike riding.