Improved Faster Region-based Convolutional Neural Network in graphic recognition and retrieval

Abstract

This study aims to extract deeper features of target images in complex environments and to improve the robustness and recognition accuracy of the algorithm. First, taking images in the Corel5K database as the research object and building on the Faster Region-based Convolutional Neural Network (Faster R-CNN) structure, the ReLU and Softplus activation functions are used to improve the activation function in the model. The model extracts features from the images, the region proposal network generates region proposals, and a proposal filtering algorithm based on hierarchical clustering selects the targets in the region proposals. Then, the spatial pyramid pooling in the original R-CNN structure is replaced with a region of interest (ROI) pooling layer, and the targets in the region proposal images are classified and identified. The weights in the model are optimized with the gradient descent algorithm to improve recognition accuracy. Finally, the objects in the image are detected: a Softmax classifier classifies the proposal feature images to obtain the object category, and after optimization by non-maximum suppression, the regressor is used to update the parameters in the model, and the graphic recognition and retrieval system is constructed. The recognition accuracy with the improved activation function is up to 99.88%, and when the size of the pooling layer is 12 × 12, the recognition accuracy is up to 98.11%. The improved Faster R-CNN model runs in 0.88 hours, 0.37 hours less than the model before improvement. In addition, the image retrieval module enables semantic-based image retrieval, allowing more precise and efficient retrieval of images. These results indicate that the model is effective for target recognition and supports graphic recognition and retrieval, with potential applications in computer vision and image processing tasks.
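
The building blocks named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say how ReLU and Softplus are combined, so the two functions are only defined separately, and the helper names (`relu`, `softplus`, `softmax`, `iou`, `nms`), the box format `(x1, y1, x2, y2)`, and the IoU threshold of 0.5 are assumptions chosen for illustration.

```python
import math


def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def softplus(x):
    """Softplus: log(1 + exp(x)), a smooth approximation of ReLU."""
    # Numerically stable form: max(x, 0) + log1p(exp(-|x|)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softmax(scores):
    """Convert a list of class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes.

    The threshold value is an assumption, not a setting from the paper.
    """
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, `nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7])` keeps the first and third boxes, because the second overlaps the first with an IoU of 81/119, which is above the assumed threshold.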

Keywords

Faster R-CNN; region of interest pooling layer; Softmax classifier; gradient descent algorithm
