Secondary segmentation extracted algorithm based on image enhancement for intelligent identification systems

Abstract

The position of the characters on an invoice is not fixed and the color shades vary, which greatly increases the difficulty of intelligent identification and makes existing methods hard to apply in practice. To solve this problem, this article proposes a secondary segmentation algorithm based on image enhancement. Specifically, we first enhance the color of the image with a gamma transformation and then separate the machine-printed characters from the blank invoice based on a color analysis of those characters. Next, using the opening operation from image processing and the bounding rectangle algorithm, we obtain the pixel information of the machine-printed characters, from which the character information can be extracted conveniently. The algorithm achieves effective extraction of machine-printed characters, reduces the difficulty of invoice identification, and improves its accuracy. Simulation results are given to confirm the proposed algorithm. Over many experiments, the extraction accuracy of the algorithm is as high as 95%.

Keywords

Quadratic segmentation algorithm, image processing, invoice identification, color determination

Introduction

With the rapid development of the social economy, invoices are required in financial management in China. At present, hundreds of millions of invoices are used for reimbursement in China every year, and the number is rising. However, most invoice reimbursement is still done manually. Manual reimbursement has many disadvantages, such as complex procedures, long processing time, and a high processing error rate. In other words, it not only adds to the workload of financial staff but also costs the people who submit invoices a great deal of extra effort. From a financial market perspective, manual reimbursement incurs extra labor cost and thus increases the cost of the product or of management.¹

In recent years, with the rapid development of image processing and computer vision technology, high-precision, high-efficiency, and low-cost text recognition has been realized. Many researchers have introduced emerging computer vision technologies into related fields such as invoice identification and have analyzed their feasibility in depth. Finding an effective and practical invoice processing method is therefore becoming more and more urgent. Value-added tax (VAT) invoices are printed by dot matrix printers; the position of the content printed by different printers is not fixed and the shades of color differ, which are the main reasons for a serious decline in the quality of invoice information extraction. Therefore, studying secondary segmentation based on image enhancement is significant for practical applications.²

At present, the machine-printed characters on invoices are either blue or black. The depth of the character color directly affects the result of image segmentation and recognition, and extracting too much information at one time also degrades recognition. Existing methods for extracting machine-printed characters still have deficiencies: when extraction follows the invoice frame, each extracted region contains too much content to be recognized, which greatly reduces the recognition accuracy.

This article proposes a secondary segmentation extraction algorithm that can be applied in an actual invoice identification system. The proposed algorithm first performs color enhancement³ on the image. Then, based on a color analysis of the machine-printed characters, the first segmentation separates the machine-printed characters from the blank invoice. The pixel information of the machine-printed characters is then obtained, and the secondary segmentation is performed to extract the machine-printed characters. Experimental results are given to validate the proposed algorithm.

Extraction of the machine character

The character extraction method of this article consists mainly of image enhancement and secondary segmentation. Figure 1 shows a scanned invoice image, and Figure 2 shows the system flow.

Figure 1.

An example of an original invoice image.

Figure 2.

Flow chart of image segmentation.

First, image enhancement is performed on the acquired color invoice image to obtain a clearer image I. A color judgment is then made on the password area of the invoice to detect the color of the machine-printed characters, and the first segmentation separates the blank invoice from the machine-printed characters to obtain the image I₁, which contains only the machine-printed characters. Next, I₁ is inverted between black and white and eroded, and its bounding rectangles are computed. This gives the pixel information of the machine-printed characters, and after the secondary segmentation the machine-printed character blocks are extracted.

Image enhancement

The color of invoices issued by different merchants varies because of differences between printers. Hence, image enhancement is required to make the invoices clear. We use an image enhancement method based on the gamma transformation.⁴ The enhancement effect is shown in Figure 3.

Figure 3.

An example of the enhanced image.

The gamma transformation is mainly used for image correction: it corrects images whose gray levels are too high or too low in order to enhance contrast. The transformation applies a power law to each pixel value of the original image

$$S = c\,r^{\gamma}, \quad r \in [0, 1] \tag{1}$$

where $\gamma$ is the gamma coefficient of the transformation, $r$ is the input gray level normalized to $[0, 1]$, $S$ is the output gray level, and $c$ is a constant scale factor. If the gamma curve is steep, the output contrast is relatively high. If the gamma curve is flat, the output contrast is relatively low.⁵ The correction effect of the gamma transformation is achieved by enhancing the details at low or high gray levels. This can be understood intuitively from the gamma curves in Figure 4.

Figure 4.

The image of gamma curve when c = 1.

The behavior of the transformation changes at γ = 1. When γ is less than 1, the gray levels of the darker region are stretched, those of the brighter region are compressed, and the overall image becomes brighter. When γ is greater than 1, the gray levels of the brighter region are stretched, those of the darker region are compressed, and the image as a whole becomes darker. The smaller the value of γ, the stronger the expansion of the low-gray-level part of the image; the larger the value, the stronger the expansion of the high-gray-level part. So by changing the gamma value, the details of low gray levels or high gray levels can be enhanced.
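The effect of γ can be checked numerically with a minimal sketch of equation (1). The function name `gamma_transform`, the 8-bit list-of-rows image layout, and the sample values are assumptions made for illustration; for a color image the same mapping would presumably be applied to each channel.

```python
def gamma_transform(gray, gamma, c=1.0):
    """Apply S = c * r**gamma to an 8-bit image given as a list of rows."""
    out = []
    for row in gray:
        new_row = []
        for value in row:
            r = value / 255.0            # normalize the gray level to [0, 1]
            s = c * (r ** gamma)         # power-law mapping of equation (1)
            # Scale back to 8 bits and clip to the valid range.
            new_row.append(max(0, min(255, round(s * 255))))
        out.append(new_row)
    return out


# Hypothetical one-row test image with dark, medium, and bright pixels.
image = [[0, 64, 128, 255]]
brighter = gamma_transform(image, 0.5)   # gamma < 1 lifts the dark and middle levels
darker = gamma_transform(image, 2.0)     # gamma > 1 pushes them down
```

With c = 1 the end points 0 and 255 stay fixed, so only the distribution of the intermediate gray levels changes.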

Secondary segmentation

First split

The machine-printed characters on general invoices are either blue or black, but the color of the machine-printed characters on a VAT invoice is not fixed. Therefore, it is necessary to determine the color of the characters printed on the invoice. The character color, that is, the color other than white, can be judged from the RGB values in the password area. The characters of that color are then segmented according to the detected color.

The VAT invoice has a fixed frame. We can therefore first use the mouse-input function ginput(·)⁶ to manually mark the parts to be extracted, such as the buyer, the seller, and the password area, according to the fixed frame of the blank invoice, and save the location information in an Excel file. Importing this Excel file in the code then allows an initial split of the invoice, which reduces the follow-up workload and improves the extraction rate.

The function ginput provides a crosshair cursor so that the required positions can be selected more accurately, and it returns their coordinate values. The call form is $[x, y] = ginput(n)$, which reads $n$ points from the current axes and returns their $x$ and $y$ coordinates, each as an $n \times 1$ vector.
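Once the corner coordinates of each region have been recorded, the initial split reduces to cropping fixed rectangles. The following is a minimal sketch under stated assumptions: the region names, the coordinates, and the `(x0, y0, x1, y1)` layout are hypothetical stand-ins for the table that the article stores in an Excel file, and the image is a plain list of rows.

```python
def crop(image, box):
    """Return the sub-image inside box = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


# Hypothetical region table; in the article these coordinates are picked
# with ginput on the blank invoice and saved for later import.
regions = {
    "buyer": (0, 0, 3, 2),
    "password_area": (3, 0, 6, 2),
    "seller": (0, 2, 3, 4),
}

# Hypothetical 6 x 4 test image whose pixel value encodes its position.
image = [[10 * y + x for x in range(6)] for y in range(4)]
parts = {name: crop(image, box) for name, box in regions.items()}
```

Because the frame of the blank invoice is fixed, the same table can be reused for every scanned invoice of that type, provided the scans are aligned.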

Here, the imported Excel file is used to crop the password area for color determination. In the following, we assume that the color of the machine-printed characters is blue, as shown in Figure 3.
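The article does not give the exact rule used for the color judgment, so the following is only one plausible sketch. It assumes that near-white pixels are background and that blue ink shows a blue channel clearly above red and green; the function name and both thresholds (`white_level`, `blue_margin`) are hypothetical.

```python
def detect_print_color(pixels, white_level=200, blue_margin=40):
    """Classify the ink color as 'blue' or 'black' from a list of (R, G, B) pixels."""
    # Keep only pixels that are not near-white background.
    ink = [p for p in pixels if min(p) < white_level]
    if not ink:
        return None                      # no printed characters found
    mean_r = sum(p[0] for p in ink) / len(ink)
    mean_g = sum(p[1] for p in ink) / len(ink)
    mean_b = sum(p[2] for p in ink) / len(ink)
    # Blue ink: blue channel well above the other two. Otherwise treat as black.
    if mean_b - max(mean_r, mean_g) > blue_margin:
        return "blue"
    return "black"


# Hypothetical pixels sampled from the password area.
blue_sample = [(255, 255, 255), (30, 40, 200), (25, 45, 190)]
black_sample = [(255, 255, 255), (20, 20, 25), (35, 30, 30)]
```

The thresholds would need tuning on real scans, since paper tint and scanner settings shift the background level.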

Currently, color digital images can be expressed in a variety of color space models, of which the RGB model and the HSV model are the most often used in computer image processing. The RGB model is based on the three primary colors of human vision, where R stands for red, G for green, and B for blue. Mixing red, green, and blue in appropriate proportions can produce the perception of any color in the visible spectrum. Since these three color components are highly correlated and form a non-uniform color space, the perceived difference between two colors cannot be expressed as the distance between two points in the color space. So the RGB model is mainly used as a color space model for hardware devices, such as color monitors and color cameras.

The HSV model is a color space based on the characteristics of human visual perception. Hue (H) represents different colors, such as red, green, and blue; saturation (S) represents the depth of the color, such as dark green and light green; and value (V), the brightness, indicates how light or dark the color is, such as very bright and very dark. The model has two important characteristics. First, the brightness component is independent of the color information of the image. Second, the hue and saturation components are closely linked to the way people perceive color.⁷ Therefore, the HSV model is often used to specify colors for segmentation.
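The separation of hue from brightness can be illustrated with the `colorsys` module of the Python standard library. This is a small sketch, not the article's implementation; the helper name `rgb_to_hsv_8bit` and the three sample colors are chosen for illustration only.

```python
import colorsys


def rgb_to_hsv_8bit(r, g, b):
    """Convert 8-bit RGB to (H in degrees, S in [0, 1], V in [0, 1])."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


# Three greens that differ only in brightness or saturation.
dark_green = rgb_to_hsv_8bit(0, 100, 0)
bright_green = rgb_to_hsv_8bit(0, 255, 0)
pale_green = rgb_to_hsv_8bit(128, 255, 128)
```

All three colors share the same hue while their V or S components differ, which is why a single hue interval can select one ink color across light and dark print.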

So the invoice image should first be converted from RGB space to HSV space.⁸ Then, we use the inRange function to segment the blue image region by adjusting the H, S, and V ranges, which gives a binary image I₁ in which the pixels that satisfy the interval conditions are white, as shown in Figure 5. In our tests, the ranges H: 100-140, S: 30-255, and V: 0-255 gave the best effect.

Figure 5.

An example of the image after first split.
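The first split can be sketched in pure Python as a range test in HSV space. The sketch assumes that the reported intervals use the 8-bit OpenCV convention, in which H is stored as degrees divided by 2 (0 to 179) and S and V span 0 to 255; the article does not state the scale, and the function names here are hypothetical. In OpenCV itself the same test is performed by `cv2.inRange`.

```python
import colorsys


def to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to HSV with H in 0..179 and S, V in 0..255 (assumed scale)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 180) % 180, round(s * 255), round(v * 255)


def in_range_mask(rgb_image, lower=(100, 30, 0), upper=(140, 255, 255)):
    """Return a binary mask: 255 where the HSV pixel lies inside [lower, upper], else 0."""
    mask = []
    for row in rgb_image:
        mask_row = []
        for pixel in row:
            hsv = to_opencv_hsv(*pixel)
            inside = all(lo <= x <= hi for x, lo, hi in zip(hsv, lower, upper))
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask


# Hypothetical 2 x 2 image: white paper, blue ink, black ink, red frame line.
image = [[(255, 255, 255), (20, 40, 200)],
         [(0, 0, 0), (200, 30, 30)]]
mask = in_range_mask(image)
```

Only the blue pixel is kept. White and black fail the hue and saturation bounds, and red fails the hue bound.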

Second split

Since the image erosion operation merges black areas into blocks, it is first necessary to apply a bitwise NOT to I₁ to invert black and white,⁹ as shown in Figure 6. Then, erosion¹⁰ is applied to merge the characters into blocks, giving the image I₂ shown in Figure 7. After that, the region coordinates in the imported Excel file are used to crop the parts other than the password area from the images I and I₂, giving I₃ and I₄, which are shown in Figures 8 and 9, respectively. The bounding rectangle¹¹ function is used to compute, for each contour in I₄, the minimal upright rectangle that encloses it, that is, a rectangle whose sides are parallel to the image boundaries, which gives the boundary information of each eroded block. For visual convenience, the rectangle boundaries are drawn using the rectangle function, as shown in Figure 10. The cropped parts are then split a second time.
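Why inversion has to precede erosion can be seen in a small sketch. Erosion replaces each pixel by the minimum of its neighborhood, so on an image with black characters on a white background the black regions grow and neighboring characters merge. The kernel size (3 x 1 by default) and the function names are assumptions; the article does not report the structuring element it used. OpenCV provides the same operations as `cv2.bitwise_not` and `cv2.erode`.

```python
def invert(binary):
    """Bitwise NOT for an 8-bit binary image: 0 becomes 255 and 255 becomes 0."""
    return [[255 - v for v in row] for row in binary]


def erode(binary, kw=3, kh=1):
    """Erode with a kw x kh rectangle: each pixel becomes the minimum of its neighborhood."""
    h, w = len(binary), len(binary[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Neighborhood clipped at the image border.
            ys = range(max(0, y - kh // 2), min(h, y + kh // 2 + 1))
            xs = range(max(0, x - kw // 2), min(w, x + kw // 2 + 1))
            row.append(min(binary[j][i] for j in ys for i in xs))
        out.append(row)
    return out


# Hypothetical mask from the first split: two character pixels (255) one gap apart.
first_split = [[0, 255, 0, 255, 0, 0, 0]]
inverted = invert(first_split)      # characters are now black (0) on white (255)
eroded = erode(inverted)            # black grows sideways and closes the gap
```

After erosion the two isolated character pixels form one continuous black block, which is what the bounding rectangle step needs.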

Figure 6.

An example of the inverted image.

Figure 7.

An example of the eroded image.

Figure 8.

An example of the cropped enhanced image.

Figure 9.

An example of the cropped eroded image.

Figure 10.

An example of the image with rectangular block.
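The bounding rectangle step can be sketched without OpenCV by labeling connected black blocks and recording their extremes. This is an illustrative stand-in for the contour-based `cv2.boundingRect` call, which returns `(x, y, width, height)`; the function name, the 4-connectivity, and the `(top_y, bottom_y, left_x, right_x)` output order are assumptions made here.

```python
from collections import deque


def bounding_rectangles(binary, foreground=0):
    """Return upright boxes (top_y, bottom_y, left_x, right_x) of 4-connected blocks."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != foreground or seen[y][x]:
                continue
            # Breadth-first search over one connected block.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, bottom, left, right = y, y, x, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and binary[ny][nx] == foreground):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, bottom, left, right))
    return boxes


# Hypothetical eroded image with two black blocks on a white background.
eroded = [
    [255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255, 255],
    [255,   0,   0, 255,   0,   0],
    [255, 255, 255, 255,   0,   0],
]
boxes = bounding_rectangles(eroded)
```

Each box is axis-aligned, so it can be used directly as a crop window on the enhanced image.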

Let the horizontal direction of the invoice be X and the vertical direction be Y. The pixel coordinates of the vertex and the bottom point of each rectangular block are stored as one row of a two-dimensional matrix in the order vertex Y, bottom point Y, vertex X, and bottom point X.

The rows of the two-dimensional matrix are sorted in ascending order of vertex Y, and the vertex Y values are traversed. If the difference between the vertex Y values of two blocks is smaller than k, where k is chosen in the range [5, 30], the blocks are judged to be in the same row. The rectangular blocks in the same row are then sorted in ascending order of vertex X, and the blocks of that row are cut out from left to right. The remaining rows are processed in the same way to extract the image G shown in Figure 11.

Figure 11.

An example of the results after segmentation algorithm.
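The row grouping described above can be sketched as follows. The sketch assumes each block is a tuple `(vertex_y, bottom_y, vertex_x, bottom_x)` and that "interval" means the difference between the vertex Y values of consecutive blocks after sorting; the function name, the default k = 10, and the sample boxes are hypothetical.

```python
def group_into_rows(boxes, k=10):
    """Group boxes (vertex_y, bottom_y, vertex_x, bottom_x) into rows sorted left to right."""
    rows = []
    for box in sorted(boxes, key=lambda b: b[0]):        # ascending vertex Y
        if rows and box[0] - rows[-1][-1][0] < k:        # close to the previous block
            rows[-1].append(box)                         # same row
        else:
            rows.append([box])                           # start a new row
    # Within each row, order the blocks by vertex X.
    return [sorted(row, key=lambda b: b[2]) for row in rows]


# Hypothetical blocks from two text lines, given in arbitrary order.
boxes = [(12, 30, 200, 260), (10, 28, 20, 90), (52, 70, 15, 80), (50, 69, 120, 180)]
rows = group_into_rows(boxes, k=10)
```

Choosing k smaller than the line spacing but larger than the vertical jitter within a line keeps neighboring lines apart, which is consistent with the reported range of 5 to 30 pixels.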

However, two cases require special handling. The first is that a rectangular block contains a smaller rectangular block, as in the second rectangular block in Figure 10. Such small blocks must first be detected and then filtered out. The second is that two lines of text lie in one rectangular block together with the space between them, as shown in Figure 12. Here it is necessary to first determine whether the rectangular block is too large and then split it.

Figure 12.

The special case of the image with rectangular block.
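Both special cases can be handled with simple geometric tests. The sketch below is an assumption-laden illustration: boxes are `(top, bottom, left, right)` tuples, the height limit `max_height` is a hypothetical parameter, and splitting at the vertical midpoint is a placeholder rule, since the article does not say where an oversized block is cut.

```python
def contains(outer, inner):
    """True if inner lies entirely inside outer; boxes are (top, bottom, left, right)."""
    return (outer != inner
            and outer[0] <= inner[0] and inner[1] <= outer[1]
            and outer[2] <= inner[2] and inner[3] <= outer[3])


def drop_nested(boxes):
    """Case 1: remove every box that is fully contained in another box."""
    return [b for b in boxes if not any(contains(o, b) for o in boxes)]


def split_tall(box, max_height):
    """Case 2: split a box taller than max_height at its vertical midpoint (assumed rule)."""
    top, bottom, left, right = box
    if bottom - top + 1 <= max_height:
        return [box]
    mid = (top + bottom) // 2
    return [(top, mid, left, right), (mid + 1, bottom, left, right)]


# Hypothetical boxes: the second one sits inside the first.
boxes = [(10, 40, 10, 100), (20, 30, 20, 50), (10, 40, 120, 180)]
filtered = drop_nested(boxes)
halves = split_tall((10, 49, 0, 100), max_height=25)
```

A more careful split would look for the blank band between the two text lines inside the block rather than cutting at the midpoint.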

Conclusion and future work

This article proposed a secondary segmentation algorithm based on image enhancement. The algorithm achieves effective extraction of machine-printed characters. Through the secondary segmentation of the enhanced invoice image, the pixel information of the machine-printed characters is obtained, and small image blocks of the machine-printed characters are extracted. Automating the extraction of invoice information is conducive to the promotion and practical application of intelligent invoice reimbursement systems, which have broad application prospects.

This article is only a preliminary study of information extraction from invoice images. In future work, we will continue in-depth studies to achieve better results and more convenient operation. It is also necessary to increase the number of experimental samples and conduct large-scale tests to further improve the effectiveness of the algorithm.

Finally, it should be added that the information on a VAT invoice that is used for reimbursement is the information about the purchaser, the seller, the purchase details, and the total amount.

Footnotes

Acknowledgements

Conceptualization, Heng Dong and Ying Jiang; Methodology, Yu Wang and Guan Gui; Software, Yaping Fan; Validation, Ying Jiang and Guan Gui; Writing-Original Draft Preparation, Ying Jiang; Writing-Review & Editing, Heng Dong and Guan Gui; Supervision, Guan Gui.

Handling Editor: Gianluigi Ferrari

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions, National Natural Science Foundation of China (61701258), Jiangsu Specially Appointed Professor Grant (RK002STP16001), “Summit of the Six Top Talents” Program of Jiangsu (No. XYDXX-010), Innovation and Entrepreneurship of Jiangsu High-level Talent Grant (CZ0010617002), NUPTSF (No. XK0010915026), and 1311 Talent Plan of Nanjing University of Posts and Telecommunications.

ORCID iD

Guan Gui

References

1. Zhu, Tang, et al. Tax-control network invoice machine management platform based on socket. In: Proceedings of 2016 2nd IEEE international conference on computer and communications (ICCC), Chengdu, China, 14–17 October 2016, pp.2343–2348. New York: IEEE.

2. Cai, Tan, Cai. The VAT tax burden warning model and modification based on CTAIS system data. In: Proceedings of 2011 2nd international conference on artificial intelligence, management science and electronic commerce (AIMSEC), Dengleng, China, 8–10 August 2011, pp.2653–2656. New York: IEEE.

3. Zhang. Digital image enhancement system based on MATLAB GUI. In: Proceedings of 2017 8th IEEE international conference on software engineering and service science (ICSESS), Beijing, China, 24–26 November 2017, pp.296–299. New York: IEEE.

4. Kumari, Thomas, Sahoo. Single image fog removal using gamma transformation and median filtering. In: Proceedings of 2014 annual IEEE India conference (INDICON), Pune, India, 11–13 December 2014, pp.1–5. New York: IEEE.

5. Abdenova. Scaling of input-output data and number of conditionality of a matrix of Grama. In: Proceedings of 2012 IEEE 11th international conference on actual problems of electronics instrument engineering (APEIE), Novosibirsk, 2–4 October 2012, pp.16–18. New York: IEEE.

6. Song, Antonelli, Fung TWK, et al. Developing and assessing MATLAB exercises for active concept learning. IEEE T Educ. Epub ahead of print 19 March 2018. DOI: 10.1109/TE.2018.2811406.

7. Chmelar, Benkrid. Efficiency of HSV over RGB Gaussian mixture model for fire detection. In: Proceedings of 2014 24th international conference Radioelektronika, Bratislava, 15–16 April 2014, pp.1–4. New York: IEEE.

8. Indriani, Kusuma, Sari, et al. Tomatoes classification using K-NN based on GLCM and HSV color space. In: Proceedings of 2017 international conference on innovative and creative information technology (ICITech), Salatiga, Indonesia, 2–4 November 2017, pp.1–6. New York: IEEE.

9. Mercado, Ishii, Ahn. Deep-sea image enhancement using multi-scale retinex with reverse color loss for autonomous underwater vehicles. In: Proceedings of OCEANS 2017 - Anchorage, Anchorage, AK, 18–21 September 2017, pp.1–6. New York: IEEE.

10. Palekar, Parab, Parikh, et al. Real time license plate detection using openCV and tesseract. In: Proceedings of 2017 international conference on communication and signal processing (ICCSP), Chennai, India, 6–8 April 2017, pp.2111–2115. New York: IEEE.

11. Sharma, Kumar, Yadav, et al. Air-swipe gesture recognition using OpenCV in Android devices. In: Proceedings of 2017 international conference on algorithms, methodology, models and applications in emerging technologies (ICAMMAET), Chennai, India, 16–18 February 2017, pp.1–6. New York: IEEE.