State measurement of isolating switch using cost fusion and smoothness prior based stereo matching

Abstract

To better monitor the state of isolating switches, an efficient binocular vision-based state measurement system is proposed in this article. Two optimal cameras are selected as the vision sensors of our inspection system. Firstly, stereo calibration and distortion rectification are performed on the acquired image pair. Secondly, to recover the three-dimensional information of the switch, we propose a semi-global stereo matching method that fuses data- and structure-driven cost volumes and then optimizes the raw disparity map with a weighted- and edge discriminated-smoothness prior. Gradient content is incorporated into the weight to suppress the small-weight-accumulation problem in weak-textured regions. In addition, a Hough transform with feature constraints is implemented to remove chaotic lines and extract the center line of the switch arm. Finally, based on the center line and the corresponding disparity map of the switch arm, the triangulation principle is used to calculate the true angle between the switch arm and the insulator, so that it can be determined whether the isolating switch is fully closed. The experimental results demonstrate that the proposed stereo matching method achieves good performance on the Middlebury v.3 data set and on switch images, and that the system can precisely measure the state of switches.

Keywords

Isolating switch measurement, binocular vision, semi-global stereo matching, Hough transform, triangulation principle

Introduction

In a power system, monitoring the state of the substation equipment is a significant task.¹ The isolating switch is the most frequently used device in high-voltage switchgear.^2,3 Although the operating principle and structure of the isolating switch are relatively simple, it is relatively prone to failure because of its heavy use. Most switches operate outdoors, so the equipment components are inevitably oxidized. The aging of the transmission parts leads to a deviated rotation angle and incomplete closure of the isolating switch. Whether the switch is completely closed has a great influence on the safety of a substation. Therefore, to ensure the safety and reliability of substation operation, it is especially important to correctly identify the state of the isolating switch.

Traditional inspections mainly rely on manual observation by telescope and subjective judgment. These methods are time-consuming and susceptible to subjective influence. In recent years, real-time inspection technologies for substations have been rapidly developed.⁴ To improve the efficiency of state monitoring for isolating switches, vision-based intelligent inspection technologies^5–7 have become a new trend, which greatly improves the accuracy and reduces the labor costs. In general, to effectively detect the switch state, the insulators must first be located, followed by the state determination of the switch.^8–12

Zhang et al.¹³ proposed a simple but efficient isolating switch state recognition method, which determines the state of the switch by directly performing a Hough transform on the switch boundary. Lu et al.¹⁴ employed a method based on image geometry to monitor the substation switch. The angle of the arm is calculated by the cosine theorem after the contour of the arm image is obtained, and whether the switch is closed is then determined. Bin et al.⁶ introduced a switch detection method based on image recognition technology. The approach performs well under image rotation and uneven illumination, mainly by extracting and matching scale-invariant feature transform (SIFT) features of the isolating switch. In addition, infrared thermography has also gained much attention in electrical preventive maintenance due to its high precision and sensitive imaging characteristics.¹⁵ Ullah et al.¹⁶ used a multilayered perceptron to classify the thermal conditions of components, providing an initial prevention mechanism for power substations by infrared thermography. Further work based on deep learning and infrared images has also been carried out for defect analysis in high-voltage equipment.¹⁷

However, the above approaches are all based on two-dimensional (2D) image content, in which the angle between the switch arm and the insulator cannot uniquely reflect the true angle in 3D space, especially when the camera angle changes. Moreover, when an infrared device is used to monitor the closing of the switch, it is impossible to accurately determine whether the closure is completely successful at the moment the switch is closed. Instead, in this article we employ binocular vision technology to reliably detect the state of the isolating switch according to 3D depth information.

Binocular stereo vision technology has an extensive range of applications in computer vision, such as robotic autonomous navigation, aerospace and remote sensing, and industrial automation systems.^18–20 A binocular stereo vision system mainly includes the following parts: binocular image acquisition, camera calibration,^21–23 image correction,²⁴ stereo matching,^25–27 and 3D reconstruction.^28,29 Dense stereo matching is the key issue of stereo vision, and its purpose is to find the corresponding points in image pairs and obtain the disparity map for recovering 3D information. The stereo matching procedure primarily includes four steps: matching cost calculation, cost aggregation, disparity calculation, and disparity refinement. Many challenging issues remain in stereo matching algorithms, such as radiometric distortions and poor performance in weak-textured and disparity discontinuous regions.

In this article, aiming to precisely measure the state of isolating switch in 3D space, we mainly focus on proposing an effective semi-global stereo matching method by two strategies: the first one is constructing the matching cost by fusing data- and structure-driven cost volumes and the second is optimizing the raw disparity map with weighted- and edge discriminated-smoothness constraint. In particular, the weight is combined with gradient content such that the small-weight-accumulation problem in weak-textured regions can be suppressed to a certain extent. Based on our proposed semi-global stereo matching method, we also present a complete binocular vision system that can be employed to calculate the true angle between the switch arm and insulator, which consists of stereo calibration and rectification, switch detection, center line extraction of switch arm, stereo matching, and depth calculation. Firstly, stereo calibration is performed to derive internal and external parameters of cameras. After image rectification, the complete switch inspection is conducted followed by proposed feature constrained-Hough transform³⁰ for removing the chaotic lines and extracting center line of the switch arm. Moreover, the proposed stereo matching method is implemented from which only the stable disparities are utilized in later 3D coordinate mapping for guaranteeing the accuracy of angle measurement. Finally, the true angle between the switch arm and insulator is obtained by the principle of triangulation and whether or not the isolating switch is fully closed can be identified. The experimental results illustrated that the proposed method not only obtains higher measurement accuracy but also contributes to the substation safety.

The following section outlines the related work of stereo matching. Following that, the methodologies of the proposed methods are presented in the third section. In the fourth section, the performance of the proposed method is tested on Middlebury v.3 data set as well as the switch images. The fifth section concludes the article.

Related work

A detailed classification of stereo matching algorithms is provided by Scharstein and Szeliski,³¹ in which stereo matching algorithms are divided into local and global methods. Generally, global methods³² can be formulated as an energy minimization problem. The energy function is composed of a data term and a smoothness term, the latter explicitly penalizing disparity discontinuity between neighboring pixels. Dynamic programming (DP),³³ graph cut,³⁴ and belief propagation³⁵ are well-known global optimizers, which can obtain comprehensive disparity characterization at the expense of substantial computational complexity.

Different from global methods, local ones are performed by calculating the matching cost between patches in image pairs, followed by aggregation of computed cost volumes over local support windows. Patches are compared by dissimilarity measures, which can be divided into two groups. The first group is traditional dissimilarity measures including sum of absolute differences, normalized cross-correlation, rank and census transform. The second group is learning-based measures, such as discriminative dictionary learning (DDL)³⁶ and convolutional neural networks.³⁷ Due to better generalization ability for stereo, in this work, cost computation is based on DDL.³² However, the DDL tends not to preserve fine details due to the loss of constraint by structural information such as gradients. Hence, in this article, we present a data- and structure-driven cost volume fusion strategy to further improve the robustness of cost volume such that the shortcoming of the single data-driven cost computation can be neutralized.

Besides, cost aggregation is another vital step in local methods. Whether the aggregation method is variable support window (VSM)-based or adaptive support weight (ASW)-based, the primary aim is to enforce the implicit smoothness assumption that pixels within the same support window share a similar depth. In VSM-based methods, multiple, fixed, and adaptive windows are employed to prevent them from crossing depth discontinuous regions.^38–40 However, a rectangular window is not always rigorous enough at discontinuity regions, and the computation within adaptive windows can be time-consuming. ASW-based methods assign preferential weights to each pixel within a predefined window. Normally, color dissimilarity and distance proximity are chosen for determining the weight. The disadvantage of ASW-based methods is that matching ambiguity is unavoidable in low-textured regions, where the small-weight-accumulation problem exists.

In the work by Hirschmuller,⁴¹ the semi-global matching (SGM) stereo method is proposed, which works in three steps: first, the pixel-wise matching cost is computed; then, the cost volume is aggregated along 8/16 1D search passes with a local smoothness constraint between neighboring pixels; finally, post-processing is employed to remove outliers and retain only consistent estimates. However, SGM is sensitive to the penalty parameters, as the smoothness terms need to be large enough to produce a smooth disparity map and small enough to maintain fine structures at discontinuities. In addition, SGM may result in ambiguous discontinuities around foreground objects in a textureless background. Numerous variants of SGM have also been developed to obtain significant performance gains. Mei et al.⁴² performed cost aggregation in a dynamic cross-based window and optimized the disparity map in a scanline optimization framework. Liu et al.⁴³ proposed a stereo framework and formulated a joint second-order smoothness prior, avoiding the blending of foreground and background and retaining the disparity discontinuities of strongly textured regions. Liu et al.⁴⁴ presented a 3D aggregation strategy based on an adaptive support window, which achieves good performance in slanted regions. Chuang et al.⁴⁵ incorporated a penalty tuning method and edge content in gradual SGM cost aggregation, balancing the smoothness and fine structures of the disparity result. Yao et al.⁴⁶ extended SGM to a more global matching to overcome the limitation of local scanline optimization.

However, in the SGM-based methods mentioned above, the selection of optimal disparities from the full set of candidates introduces uncertainty in the aggregation passes and in the final depth recovery ability. To better handle slanted planes, we propose a weighted- and edge discriminated-smoothness constraint to efficiently optimize the raw disparity result. The penalties at edge regions are automatically determined based on a confidence measure for penalizing larger disparity changes. The weight combining appearance and gradient content can suppress the small-weight-accumulation problem to a certain extent. In this article, we focus on applying a binocular stereo vision system to state measurement and improving the matching performance in weak-textured and disparity discontinuous regions. The contributions of this article are as follows: (1) feature constraints are enforced on the Hough transform for obtaining the center line of the switch arm; (2) data- and structure-driven cost volumes are fused to construct the matching cost, improving the matching accuracy; and (3) in the cost aggregation process, a weighted- and edge discriminated-smoothness constraint is proposed to optimize the raw disparity map. The smoothness constraint is determined automatically according to the image confidence.

Methodology

Overview of procedure

An overview of the proposed method is illustrated in Figure 1. Our system can be divided into three parts: camera calibration and rectification, stereo matching, and center line extraction of switch arm. To start, the first part mainly includes stereo calibration and distortion rectification for camera parameters and subsequently horizontal dissimilarity searching. The second part includes semi-global stereo matching based on data- and structure-driven cost volume fusion and weighted- and edge discriminated-smoothness prior. The third part is for center line extraction of switch arm and we mainly focus on exploiting feature constraints-based Hough transform to remove chaotic lines and extract center line of the switch arm. Ultimately, the vector of the center line in 3D space can be obtained by combining the disparity map and the positions of center line. Consequently, we can get the angle between the arm and the insulator so as to determine whether or not the isolating switch is fully closed.

Figure 1.

Flowchart of the proposed measurement system. Our system can be divided into three parts in detail: camera calibration and rectification, stereo matching, and center line extraction of switch arm.

Matching cost computation

Matching cost measurement is used to compute initial cost volume, which is the basis of the whole stereo matching algorithm. Cost volume is a 3D matrix with each element corresponding to the cost of a relative pixel at a certain disparity level. Considering better generalization ability for stereo, we adopt the DDL³⁶ method, which belongs to data-driven approaches and utilizes sparse representation to represent each patch over learned discriminative dictionary, to measure the similarity between image patches. The matching cost value $C_{DDL} (p, d)$ obtained from DDL method for pixel p at disparity value d is denoted as

C_{DDL} (p, d) = - s (P_{l} (x, y), P_{r} (x - d, y))

where $s (P_{l} (x, y), P_{r} (x - d, y))$ indicates the similarity score between input patches $P_{l} (x, y)$ and $P_{r} (x - d, y)$ , the negative sign means the similarity score is converted to the matching cost.

Nonetheless, the DDL method tends not to preserve fine details due to loss of constraint by structural information such as gradients. Therefore, in this article, we present a data- and structure-driven cost volume fusion strategy to further improve the robustness of cost volume such that the shortcoming of the single data-driven cost can be neutralized. The fused matching cost measure is defined as follows

C (p, d) = λ_{1} min (C_{GRAD} (p, d), τ_{GRAD}) + (1 - λ_{1}) C_{DDL} (p, d)

where τ_GRAD is the truncating value, coefficient λ₁ balances the impact of each measure. $C_{GRAD} (p, d)$ is the gradient-based cost measurement, which is exploited as follows

C_{GRAD} (p, d) = | \sqrt{{(\nabla_{x} G_{L} (p))}^{2} + {(\nabla_{y} G_{L} (p))}^{2}} - \sqrt{{(\nabla_{x} G_{R} (p_{d}))}^{2} + {(\nabla_{y} G_{R} (p_{d}))}^{2}} |

where ∇_x and ∇_y are the derivative operators in the x and y directions, respectively. The grayscale image $G (\cdot)$ is obtained by averaging the RGB channels of the input color image. p_d is the pixel in the right image corresponding to pixel $p (x, y)$ in the left image and satisfies $p_{d} = p (x - d, y)$. In our implementation, we set the empirical values $τ_{GRAD} = 2$ and $λ_{1} = 0.3$.
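The fusion above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the DDL cost volume `c_ddl` is taken as a precomputed input, gradients are approximated with central differences via `np.gradient`, and pixels whose match would fall outside the right image simply receive the truncation value. The function names are hypothetical and not taken from the article.

```python
import numpy as np

def gradient_magnitude(gray):
    """Gradient magnitude of a grayscale image (central differences)."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.sqrt(gx ** 2 + gy ** 2)

def fused_cost(left_rgb, right_rgb, c_ddl, tau_grad=2.0, lam1=0.3):
    """Fuse a truncated gradient cost with a precomputed DDL cost volume.

    c_ddl has shape (H, W, D): cost of pixel (y, x) at disparity d.
    """
    g_left = gradient_magnitude(left_rgb.mean(axis=2))    # grayscale = RGB average
    g_right = gradient_magnitude(right_rgb.mean(axis=2))
    h, w, n_disp = c_ddl.shape
    # Assumption: matches falling outside the right image get the truncation cost.
    c_grad = np.full((h, w, n_disp), tau_grad, dtype=np.float64)
    for d in range(min(n_disp, w)):
        # p_d = (x - d, y): compare left column x with right column x - d.
        c_grad[:, d:, d] = np.abs(g_left[:, d:] - g_right[:, : w - d])
    return lam1 * np.minimum(c_grad, tau_grad) + (1.0 - lam1) * c_ddl
```

With identical left and right images and a zero DDL volume, the fused cost at disparity 0 is zero everywhere, which is a quick sanity check for the indexing.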

Weighted and edge discriminated cost aggregation

Edge weighted tree structure

Global method regards the matching task as the optimization of energy function, which is defined by

E (D) = \sum_{p \in I} m (p, d_{p}) + \sum_{(p, q) \in N} s (d_{p}, d_{q})

s (d_{p}, d_{q}) = \begin{cases} 0, & | d_{p} - d_{q} | = 0 \\ P_{1}, & 0 < | d_{p} - d_{q} | \leq d_{step} \\ P_{2}, & \text{otherwise} \end{cases}

where D represents the disparity map, and pixel p belongs to image I and is assigned disparity $d_{p}$. The first term is the data term, while the second is the regularization function enforced on neighboring pixels p and q in accordance with the predefined set N. $P_{1}$ is used to penalize small jumps in disparity. For disparity jumps greater than one pixel, where disparity borders are more likely to occur, a larger penalty $P_{2}$ is generally adopted with $P_{2} > P_{1}$.
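To make the energy concrete, the following sketch evaluates E(D) for a given integer disparity map. It assumes that the neighborhood set N is the 4-connected grid with each pixel pair counted once; the article does not fix N at this point, so this choice and the function names are illustrative only.

```python
import numpy as np

def smoothness_penalty(dp, dq, p1, p2, d_step=1):
    """Piecewise penalty s(d_p, d_q): 0, P1 for small jumps, P2 otherwise."""
    diff = np.abs(dp - dq)
    return np.where(diff == 0, 0.0, np.where(diff <= d_step, p1, p2))

def energy(cost_volume, disparity, p1, p2, d_step=1):
    """Data term plus pairwise smoothness term over a 4-connected grid (assumption)."""
    h, w, _ = cost_volume.shape
    rows, cols = np.indices((h, w))
    data = cost_volume[rows, cols, disparity].sum()
    horiz = smoothness_penalty(disparity[:, 1:], disparity[:, :-1], p1, p2, d_step).sum()
    vert = smoothness_penalty(disparity[1:, :], disparity[:-1, :], p1, p2, d_step).sum()
    return float(data + horiz + vert)
```

For example, a row with disparities 0, 1, 3 contributes one P1 (jump of 1) and one P2 (jump of 2) to the smoothness term.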

For simplicity and computational efficiency, edge weighted horizontal tree structure is adopted for enforcing the connectivity of adjacent pixels such that the energy optimization can be efficiently implemented via DP. The horizontal tree is constructed as shown in Figure 2. The tree is rooted on pixel p, and the horizontal tree is established by extending the node p in the vertical direction, then a batch of horizontal expanding is performed. A complete cost aggregation process combines forward and backward passes in the horizontal phase.

Figure 2.

Edge weighted tree structure: (a) horizontal tree structure, (b) the horizontal phase, and (c) the vertical phase. p is the root node. Different edges correspond to different weights. The whole forward–backward aggregation is illustrated in the dashed rectangle.

In the edge weighted tree structure, the edges connecting adjacent nodes are generally assigned weights determined by color difference.⁴⁷ However, in weak-textured regions, the color difference would be quite small. Many unreliable high weights then accumulate along a long path, resulting in the small-weight-accumulation drawback.⁴⁸ In this article, we propose an enhanced weight function based on color and gradient content. The edge image for weight calculation is obtained by the random forest method.⁴⁹ The proposed edge weight is defined as follows

W_{p, q} (I) = \begin{cases} \exp \left( - \frac{\| I_{p} - I_{q} \| + \max \left( \frac{β | E_{p} - E_{q} |}{E_{m}}, T_{w} \right)}{σ} \right), & \| I_{p} - I_{q} \| \leq T_{w} \\ \exp \left( - \frac{\| I_{p} - I_{q} \|}{σ} \right), & \text{else} \end{cases}

where $\| I_{p} - I_{q} \|$ is the maximum color difference computed separately under the RGB three channels between adjacent pixels p and q. I and E denote the reference image and edge image, respectively. $E_{m}$ is the maximum pixel value in the edge image. σ is a user-defined parameter that adjusts the smoothness degree and β is used to normalize the interval of pixel values. $T_{w}$ is the truncating value and the empirical parameters are set in our implementation as follows: $σ = 0.1$, $β = 20$, and $T_{w} = 2$. When color difference between adjacent pixels is less than $T_{w}$, the weight only considering the color content is close to 1 and should be reduced to alleviate the problem of small-weight-accumulation. The degree of reduction of the weight is determined according to the gradient difference, and the larger the difference, the lower the weight. By combining the two kinds of information, the weight can be adaptively adjusted such that the matching accuracy will be increased in weak-textured and disparity discontinuous regions.
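A direct transcription of the weight function for a single pair of adjacent pixels might look as follows. The intensity scale of I and E is not stated in the text, so the defaults below simply reuse the reported parameter values; the function name and argument layout are hypothetical.

```python
import numpy as np

def edge_weight(i_p, i_q, e_p, e_q, e_max, sigma=0.1, beta=20.0, t_w=2.0):
    """Edge weight between adjacent pixels p and q.

    i_p, i_q: RGB triples of the reference image; e_p, e_q: edge-image values;
    e_max: maximum value of the edge image.
    """
    # Maximum absolute difference over the three color channels.
    color_diff = np.max(np.abs(np.asarray(i_p, float) - np.asarray(i_q, float)))
    if color_diff <= t_w:
        # Weak-texture case: lower the weight using the edge (gradient) difference.
        grad_term = max(beta * abs(e_p - e_q) / e_max, t_w)
        return float(np.exp(-(color_diff + grad_term) / sigma))
    return float(np.exp(-color_diff / sigma))
```

In the weak-texture branch, a larger edge difference yields a smaller weight, which is the behavior the paragraph above describes.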

Confidence-based penalty tuning

How to automatically assign appropriate penalty variables ( $P_{1}, P_{2}$ ) is vital to the cost aggregation performance. A small penalty causes sharp boundaries, but many outliers would appear. On the other hand, a large penalty would reduce mismatching in smooth regions but results in over-smooth object boundaries, losing details of the object. To enhance the robustness of the parameters, an adaptive penalty tuning strategy is employed in the proposed method.

Normally, the initial disparity maps are obtained by the winner-takes-all (WTA) strategy; subsequently, a left–right consistency check is performed to pick confident points from the initial disparity maps. The confidence ratio $C_{rp}$ of pixel p is defined by

C_{rp} = m_{sp} / m_{fp}, p \in n_{cp}

where $m_{fp}$ and $m_{sp}$ represent the smallest and second smallest cost values, respectively, and $n_{cp}$ is the number of confident points. The greater the confidence ratio, the more likely the pixel is matched correctly. Afterward, the penalty $P_{1}$ is obtained by averaging the sum of the cost values $m_{fp}$ and $m_{sp}$ over the confident points

P_{1} = \frac{1}{n_{c p}} \sum_{p \in n_{c p}} (m_{s p} + m_{f p})

P_{2} = (P_{1} \cdot | | I_{p} - I_{q} | | + P_{1}) / 2

where $\| I_{p} - I_{q} \|$ indicates the absolute intensity difference. $P_{2}$ is a larger penalty that accounts for all larger disparity changes occurring at disparity boundaries.
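The penalty tuning can be sketched as below, assuming the cost volume and a boolean mask of pixels that passed the left–right check are already available. The helper names are hypothetical, and the handling of a zero smallest cost (ratio set to infinity) is an assumption, not something the article specifies.

```python
import numpy as np

def confidence_and_penalty(cost_volume, confident_mask):
    """Return the per-pixel confidence ratio and the adaptive penalty P1.

    cost_volume: (H, W, D) matching costs; confident_mask: (H, W) booleans.
    """
    sorted_costs = np.sort(cost_volume.astype(np.float64), axis=2)
    m_f = sorted_costs[..., 0]  # smallest cost per pixel
    m_s = sorted_costs[..., 1]  # second smallest cost per pixel
    # Assumption: a zero smallest cost gives an infinite confidence ratio.
    ratio = np.divide(m_s, m_f, out=np.full_like(m_s, np.inf), where=m_f != 0)
    p1 = float((m_s + m_f)[confident_mask].mean())
    return ratio, p1

def penalty_p2(p1, intensity_diff):
    """P2 = (P1 * ||I_p - I_q|| + P1) / 2 for one pair of adjacent pixels."""
    return (p1 * intensity_diff + p1) / 2.0
```

Because P1 is an average over confident pixels only, a few badly matched pixels do not distort the penalty.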

Edge discriminated cost aggregation

After constructing the horizontal tree, the cost volume is first aggregated from the leaves to the root node located in the same column along horizontal passes. Then, the cost volume is aggregated from the subsidiary nodes to the root node along vertical passes. Therefore, the energy function in equation (4) can be optimized as follows

E (D) = \sum_{p \in I} m (p, d_{p}) + W_{p, q} (I) \left( \sum_{q \in v (p)} P_{1} \big|_{s (d_{p}, d_{q}),\ 0 < d_{p} - d_{q} \leq d_{step}} + \sum_{q \in v (p)} P_{2} \big|_{s (d_{p}, d_{q}),\ (d_{p} - d_{q} > d_{step}) \cap (p \in E_{p})} \right)

where $P_{1}$ is a lower penalty for adapting to slanted or curved planes that penalizes only the small step change $d_{step} = 1$. For depth discontinuities located at image edges, a larger penalty $P_{2}$ is used to penalize all other larger disparity jumps. Specifically, when the disparity jump does not exceed $d_{step}$, the small penalty is assigned, and the aggregated cost value comes from the minimum over three adjacent disparity layers: the upper, current, and lower layers. When pixel p is located at an image edge, where depth discontinuities usually exist, the optimum disparity value is more likely to come from layers other than these three. Thus, we enforce different penalties on different layers to build an appropriate regularization term such that both slanted planes and depth discontinuities can be handled well.

The cost aggregation process is shown in Figure 2, which is divided into two phases: the horizontal phase and the vertical phase. In each phase, the forward and backward propagation are employed. For the forward pass, the cost volume is renewed recursively from the leftmost pixel of a scanline. The accumulated cost values are preserved in array $C_{F}$. Similarly, the backward pass calculates the costs successively from the rightmost pixel and the accumulated cost values are stored in $C_{B}$. Finally, the aggregated cost value $C_{Fin} (p, d_{p})$ for pixel p at disparity $d_{p}$ is calculated by summing up the two reverse passes followed by subtracting the current cost

C_{Fin} (p, d_{p}) = C_{F} (p, d_{p}) + C_{B} (p, d_{p}) - m (p, d_{p})
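The forward and backward combination can be illustrated on a single scanline. The recursion below is the standard SGM-style update with fixed penalties; it is an assumption used for illustration and omits the edge weights $W_{p, q}$ and the edge-dependent choice of penalty described above. Only the final line, which adds the two passes and subtracts the raw cost, follows the formula directly.

```python
import numpy as np

def scan_pass(cost_row, p1, p2):
    """Left-to-right accumulation along one scanline; cost_row has shape (W, D)."""
    w, n_disp = cost_row.shape
    acc = np.empty((w, n_disp), dtype=np.float64)
    acc[0] = cost_row[0]
    for x in range(1, w):
        prev = acc[x - 1]
        best_prev = prev.min()
        up = np.concatenate(([np.inf], prev[:-1])) + p1   # from disparity d - 1
        down = np.concatenate((prev[1:], [np.inf])) + p1  # from disparity d + 1
        far = np.full(n_disp, best_prev + p2)             # any larger jump
        # Subtracting best_prev keeps the accumulated values bounded.
        acc[x] = cost_row[x] + np.minimum.reduce([prev, up, down, far]) - best_prev
    return acc

def aggregate_row(cost_row, p1, p2):
    """C_Fin = C_F + C_B - m for one scanline."""
    cost_row = np.asarray(cost_row, dtype=np.float64)
    forward = scan_pass(cost_row, p1, p2)
    backward = scan_pass(cost_row[::-1], p1, p2)[::-1]
    return forward + backward - cost_row
```

Subtracting the raw cost once is needed because both passes include the cost of the current pixel.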

Disparity computation and refinement

After the cost volumes are regularized, the raw disparity map is determined by selecting, for each pixel, the disparity with the minimum aggregated cost using the WTA strategy

D (p) = \underset{d \in disp}{\arg\min} \, C_{Fin} (p, d)

where disp denotes the set of disparity candidates, bounded by the maximum disparity. The disparity map obtained at this stage still has outliers in occluded regions. Therefore, a weighted median filter is adopted to correct unstable pixels to a great extent.
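As a minimal sketch, the WTA selection and a left–right consistency check of the kind mentioned earlier can be written as follows. The tolerance `tol` and the treatment of matches that fall outside the image (marked inconsistent) are assumptions; the weighted median filter is not reproduced here.

```python
import numpy as np

def wta_disparity(cost_volume):
    """Pick, per pixel, the disparity index with the minimum aggregated cost."""
    return np.argmin(cost_volume, axis=2)

def left_right_consistent(disp_left, disp_right, tol=1):
    """Boolean mask of left-image pixels whose disparity agrees with the right map."""
    h, w = disp_left.shape
    cols = np.arange(w)[None, :] - disp_left  # matching column x - d in the right image
    inside = cols >= 0
    rows = np.arange(h)[:, None]
    matched = disp_right[rows, np.clip(cols, 0, w - 1)]
    return inside & (np.abs(disp_left - matched) <= tol)
```

Pixels that fail the check are typical candidates for the occlusion outliers that the refinement step then corrects.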

Center line extraction of switch arm

The rough location of the isolating switch is obtained with the support vector machine (SVM) algorithm⁵⁰ combined with the histogram of oriented gradients (HOG) feature vector of the switch; this rough location is not the focus of this article. Instead, we assume that the rough location has been completed and focus on exploiting a feature constraint-based Hough transform to obtain the center line of the switch arm. The Hough transform is an approach for detecting lines, circles, and other shapes. Its main principle is that the image is mapped from the original image coordinate space into a parameter space, in which shapes such as lines and circles are then fitted. A line in the image space corresponds to a point in the Hough space, and a point in the image space corresponds to a line in the Hough space. Firstly, the contours of the switch image from the rough location are detected by the random forest method.⁴⁹ After that, the Hough transform algorithm is adopted to search for the straight lines of the arm edges on the contour image. However, there are chaotic lines in the image background and small segments also exist, resulting in failed center line extraction. Therefore, a novel method leveraging a feature constraints-based Hough transform is proposed for removing the chaotic lines and extracting the center line of the switch arm. The specific steps are as follows:

Step 1: As the isolating switch has been identified by the SVM method, the switch arm occupies the close-up view of the identified isolating switch. Assume that the end-to-end coordinates of the lines detected by the Hough transform are stored in array $A_{N}$

A_{N} = \{(x_{1}, y_{1}), (x'_{1}, y'_{1})\}, \{(x_{2}, y_{2}), (x'_{2}, y'_{2})\}, \dots, \{(x_{m}, y_{m}), (x'_{m}, y'_{m})\}, \dots, \{(x_{N}, y_{N}), (x'_{N}, y'_{N})\}, 1 \leq m \leq N

where N is the total number of lines, and $(x, y)$ and $(x', y')$ denote the starting and ending coordinates of each line, respectively.

The slope $k_{m} = (y'_{m} - y_{m}) / (x'_{m} - x_{m})$, the angle $α_{m} = \arctan (k_{m})$, and the length $l_{m} = \sqrt{(x'_{m} - x_{m})^{2} + (y'_{m} - y_{m})^{2}}$ of each candidate line are obtained from its end-to-end coordinates.

Step 2: A line detected by the Hough transform may be divided into several segments, so it is necessary to merge the segments by means of an integration strategy. If the minimum pixel distance between any two parallel lines belonging to one long line is smaller than the integration threshold $T_{b}$, these two parallel lines are merged into one line. To eliminate the adverse influence of vertical lines in the background, we delete the lines whose angle is larger than $T_{v}$ degrees, since the purpose of this article is to detect whether or not the isolating switch is fully closed. Here, $T_{b} = wid / 2$ and $T_{v} = 80$, where wid is the width of the image.

Step 3: To eliminate the adverse influence of small and cluttered lines, the length threshold of the candidate lines is set to $T_{Ω}$, that is, each line must be composed of at least $T_{Ω}$ pixels. We select the first m′ lines in terms of high probabilities such that the two edges of the switch arm can be screened out accurately. Then, the corresponding information of the selected lines is stored in array ${P_{o}}_{m}$. Values of $T_{Ω} = 50$ and $m' = 5$ are assumed here

{P_{o}}_{m} = \{\{(x_{1}, y_{1}), (x'_{1}, y'_{1}), α_{1}, l_{1}\}, \{(x_{2}, y_{2}), (x'_{2}, y'_{2}), α_{2}, l_{2}\}, \dots, \{(x_{m'}, y_{m'}), (x'_{m'}, y'_{m'}), α_{m'}, l_{m'}\}\}

Step 4: Based on the characteristics of isolating switch arm whose two edges are parallel, parallel constraint is enforced, namely, the angle difference between two lines is calculated. The angle threshold is set to w = 2° and the parallel constraint is considered to be satisfied when the angle difference is less than the threshold. If the parallel constraint is satisfied, two lines with the smallest angle difference are selected as the detected edges of switch arm. Otherwise, the longest line is selected as the detected edge.

Step 5: After accurately detecting the two edges of the arm, the disparity map is required to obtain the angle between the switch arm and the insulator. However, the disparity value is unstable at the arm edges. If the result of edge detection is slightly shifted or the disparity values of the edge are not accurately calculated, the 3D information of the lines will be wrong. Consequently, it is indispensable to extract the center line, which lies in a disparity continuous region. The end-to-end coordinates of the center line of the switch arm are as follows

\begin{cases} x_{c} = \frac{x_{i1} + x_{i2}}{2}, y_{c} = \frac{y_{i1} + y_{i2}}{2} \\ x'_{c} = \frac{x'_{i1} + x'_{i2}}{2}, y'_{c} = \frac{y'_{i1} + y'_{i2}}{2} \end{cases}

where $(x_{c}, y_{c})$ and $(x_{c}^{'}, y_{c}^{'})$ denote the end-to-end coordinates of middle line, respectively. $\{(x_{i 1}, y_{i 1}), (x_{i 1}^{'}, y_{i 1}^{'})\}$ and $\{(x_{i 2}, y_{i 2}), (x_{i 2}^{'}, y_{i 2}^{'})\}$ represent two detected edges of switch arm. Note that if only the longest line is detected, the center line extraction is not necessary.
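Steps 1 to 5 can be summarized in a short sketch that operates on already detected segments. Several simplifications are assumptions of this example rather than the article's procedure: the segment merging of Step 2 is omitted, segment length stands in for the pixel count compared with $T_{Ω}$, the "first m′ lines" are taken to be the m′ longest ones, and the angle test uses the absolute value of the angle. All names are hypothetical.

```python
import math

def _angle_deg(seg):
    (x, y), (x2, y2) = seg
    if x2 == x:
        return 90.0  # vertical line: slope undefined
    return math.degrees(math.atan((y2 - y) / (x2 - x)))

def _length(seg):
    (x, y), (x2, y2) = seg
    return math.hypot(x2 - x, y2 - y)

def center_line(segments, t_v=80.0, t_len=50.0, m_keep=5, w_deg=2.0):
    """Return the center line of the two most parallel long segments,
    the longest segment if no parallel pair exists, or None."""
    cands = []
    for seg in segments:
        seg = tuple(sorted(seg))  # order endpoints left to right
        ang, length = _angle_deg(seg), _length(seg)
        if abs(ang) <= t_v and length >= t_len:
            cands.append((length, ang, seg))
    if not cands:
        return None
    cands.sort(key=lambda c: c[0], reverse=True)
    cands = cands[:m_keep]
    best = None
    for i in range(len(cands)):
        for j in range(i + 1, len(cands)):
            diff = abs(cands[i][1] - cands[j][1])
            if diff < w_deg and (best is None or diff < best[0]):
                best = (diff, cands[i][2], cands[j][2])
    if best is None:
        return cands[0][2]  # fall back to the longest line
    (a, a2), (b, b2) = best[1], best[2]
    return (((a[0] + b[0]) / 2, (a[1] + b[1]) / 2),
            ((a2[0] + b2[0]) / 2, (a2[1] + b2[1]) / 2))
```

In practice the input segments would come from a probabilistic Hough transform applied to the contour image; here they are plain coordinate pairs so that the filtering logic can be checked in isolation.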

Angle measurement of switch arm in 3D space

A stereo imaging system includes binocular image acquisition, camera calibration, image rectification, and 3D information recovery. With reference to Figure 3, the imaging planes of the binocular cameras are coplanar and the distance from the imaging planes to the optical centers ($O_{1}$ and $O_{2}$) is f. The objective point $P (u^{c}, v^{c}, z^{c})$ in 3D space is mapped to the points $P_{l} (u_{l}, v_{l})$ and $P_{r} (u_{r}, v_{r})$ in the left and right views, respectively. The baseline distance between the two cameras is B.

Figure 3.

Projection of 3D objective point $P (u^{c}, v^{c}, z^{c})$ onto binocular cameras. $O_{1}$ and $O_{2}$ are the center coordinates of the left and right image, B is the baseline distance, and f is the focal length. The objective point $P (u^{c}, v^{c}, z^{c})$ is projected to the left and right image points as $P_{l}$, $P_{r}$.

After acquiring the images, the intrinsic, extrinsic, and distortion parameters of the cameras are obtained based on the calibration method proposed by Zhang.⁵¹ To reduce the complexity of the matching problem, the epipolar constraint⁵² is exploited so that corresponding points differ only by a horizontal offset. Once the image pairs have been rectified, the points $P_{l}$ and $P_{r}$ satisfy $u_{r} = u_{l} - d, v_{r} = v_{l}$, where d is the disparity value between the two points. Assuming that the left and right images are on the same plane, the following formula can be obtained from the triangular geometry

\begin{cases} u_{l} = f \frac{u^{c}}{z^{c}}, v_{l} = f \frac{v^{c}}{z^{c}} \\ u_{r} = f \frac{u^{c} - B}{z^{c}}, v_{r} = f \frac{v^{c}}{z^{c}} \end{cases}

where $(u^{c}, v^{c}, z^{c})$ is the 3D coordinate of P, B is the baseline distance, and f denotes the focal length of the two cameras. The disparity $d = u_{l} - u_{r} = f B / z^{c}$ is defined as the horizontal position difference of corresponding points in the two images.

Therefore, the 3D coordinate of point P is recovered as follows

\begin{cases} u^{c} = \frac{B \cdot u_{l}}{d} \\ v^{c} = \frac{B \cdot v_{l}}{d} \\ z^{c} = \frac{B \cdot f}{d} \end{cases}
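The recovery formulas translate directly into code. The sketch below assumes that the image coordinates $(u_{l}, v_{l})$ are measured relative to the image center, as the projection equations imply, and that f and d share the same pixel units; the function name is hypothetical.

```python
def pixel_to_3d(u_l, v_l, d, baseline, focal):
    """Map a left-image point with disparity d to 3D camera coordinates.

    Assumes u_l and v_l are offsets from the image center, with focal and d in pixels.
    """
    if d <= 0:
        raise ValueError("disparity must be positive to triangulate")
    return (baseline * u_l / d, baseline * v_l / d, baseline * focal / d)
```

For instance, with f = 1000 pixels and B = 0.1, a point at (0.5, 0.2, 2.0) projects to $u_{l} = 250$, $v_{l} = 100$, and $u_{r} = 200$, so d = 50, and the function returns the original coordinates.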

For isolating switches, when they are in the “closed” condition, the measured angle between the switch arm and insulator will lie in a certain range. If not, it may indicate that the switch is not fully closed. The position of the middle line and the corresponding depth information are combined, so that the end-to-end coordinates of the middle line can be mapped into 3D space, that is, $(X_{3Dc}, Y_{3Dc}, Z_{3Dc}), (X'_{3Dc}, Y'_{3Dc}, Z'_{3Dc})$. Take the horizontal vector $b$ as the reference, where the vector satisfies $| b | = 1$. The angle between the center line and the horizontal vector is calculated as

\theta = \arccos \frac{a \cdot b}{| a | | b |}

where $a = (X_{3 D c} - X_{3 D c}^{'}, Y_{3 D c} - Y_{3 D c}^{'}, Z_{3 D c} - Z_{3 D c}^{'})$ denotes the vector of the center line of the switch arm, $\arccos (\cdot)$ is the inverse cosine function, and θ is the angle between the horizontal direction and the switch arm.
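The angle formula can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, the horizontal reference is assumed to be the camera x axis (1, 0, 0), and the end points are synthetic values built from the 20.8 cm arm length reported later in the measurement analysis.

```python
import math


def arm_angle_deg(p, p_prime, b=(1.0, 0.0, 0.0)):
    """Angle in degrees between a = p - p_prime and the reference vector b."""
    a = tuple(x - y for x, y in zip(p, p_prime))
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("both vectors must be non-zero")
    # Clamp so that rounding error cannot push the argument outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_theta))


# Synthetic end points (cm): an arm of length 20.8 tilted by 5 degrees.
tilt = math.radians(5.0)
p_prime = (3.0, 2.0, 150.0)
p = (p_prime[0] + 20.8 * math.cos(tilt), p_prime[1] + 20.8 * math.sin(tilt), p_prime[2])
angle = arm_angle_deg(p, p_prime)
```

Because the order of the two end points flips the sign of a, swapping them returns the supplementary angle; a practical implementation would fix an end-point ordering convention.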

Experimental results

In this section, the performance of the proposed stereo matching method is evaluated on images from the Middlebury stereo benchmark⁵³ and on images of isolating switches. Seven state-of-the-art methods are chosen for comparison. In addition, the center line extraction of the switch arm, which combines the Hough transform with prior constraints, is also evaluated.

Evaluation on Middlebury v.3 data set

To evaluate the proposed stereo matching algorithm, seven other algorithms are compared on the challenging and representative Middlebury v.3 data set. The seven methods are weighted cost propagation with smoothness prior (WCPSP),⁵⁴ iterative color-depth (MSTCD2),⁵⁵ DDL,³⁶ iterative guided filter (IGF),⁵⁶ improvement of stereo matching (ISM),⁵⁷ pyramid stereo matching network (PSMNet),⁵⁸ and second-order semi-global matching (SGM-SO).⁵⁹ In particular, the WCPSP method enforces local smoothness during weighted cost aggregation on a horizontal tree structure, and the smoothness constraints contribute to a smooth disparity map, especially on curved surfaces. The SGM-SO method uses a second-order smoothness constraint based on angle direction priors to improve the matching accuracy in weak-textured regions, and then applies slanted plane iterative optimization to refine the disparity maps.

The detailed evaluation results on 15 Middlebury v.3 image pairs in non-occluded regions are shown in Table 1, together with the average error rate of each method. Our method achieves the lowest average error rate among the evaluated methods (9.30%). PSMNet does not work well on this data set: its average mismatching rate reaches 19.16%, about twice that of the proposed method. DDL is significantly better than WCPSP owing to its dictionary learning-based matching cost. However, by combining data- and structure-driven cost volume fusion with the weighted- and edge discriminated-smoothness prior, our method achieves a lower average error than DDL, with more accurate disparity maps and more reliable depth boundaries.

Table 1.

Mismatching rate (%) on 15 Middlebury v.3 stereo pairs by eight algorithms.^a

Stereo pairs	WCPSP	MSTCD2	DDL	IGF	ISM	PSMNet	SGM-SO	Proposed
Adirondack	8.06	11.42	2.44	6.62	6.88	11.03	3.80	2.54
ArtL	10.24	9.99	9.22	9.56	10.05	23.39	12.18	8.14
Jadeplant	25.12	16.43	12.48	16.21	16.91	37.25	16.91	11.54
Motorcycle	4.84	3.87	4.88	4.82	4.96	8.00	6.06	4.83
MotorcycleE	6.92	7.58	4.24	4.51	4.74	10.14	5.80	4.19
Piano	17.86	19.29	9.92	14.68	14.47	17.54	11.18	8.69
PianoL	22.18	27.16	18.60	21.09	21.35	29.46	19.93	15.59
Pipes	12.33	9.61	8.50	10.25	10.53	25.56	11.58	9.17
Playroom	13.54	16.12	9.18	11.42	11.24	18.29	16.44	9.73
Playtable	31.13	33.54	11.57	28.63	28.68	10.50	9.06	11.24
PlaytableP	5.33	10.64	6.37	11.70	11.71	10.16	8.10	5.85
Recycle	3.42	6.44	4.17	5.85	6.20	6.95	4.10	3.68
Shelves	26.43	28.83	25.38	27.28	25.42	40.38	33.00	24.80
Teddy	2.53	5.06	3.82	3.86	3.91	8.24	5.68	3.25
Vintage	24.95	36.61	17.68	21.61	21.02	30.42	20.46	16.25
Average error	14.33	16.17	9.90	13.21	13.20	19.16	12.28	9.30

^a Error threshold is 2.0 pixel and the non-occluded regions are evaluated.

WCPSP: weighted cost propagation with smoothness prior; DDL: discriminative dictionary learning; IGF: iterative guided filter; SGM-SO: second-order semi-global matching; PSMNet: pyramid stereo matching network; MSTCD2: iterative color-depth.
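For readers who want to reproduce the metric reported in Table 1, the following Python sketch computes a mismatching rate as the percentage of non-occluded pixels whose disparity error exceeds the threshold. The function name and the tiny arrays are hypothetical, and the official Middlebury evaluation may treat invalid pixels and subpixel rounding differently.

```python
def mismatch_rate(disparity, ground_truth, non_occluded, threshold=2.0):
    """Percentage of evaluated pixels whose absolute disparity error exceeds threshold."""
    bad = 0
    total = 0
    for d_row, gt_row, mask_row in zip(disparity, ground_truth, non_occluded):
        for d, gt, valid in zip(d_row, gt_row, mask_row):
            if not valid:
                continue  # occluded pixels are excluded from the evaluation
            total += 1
            if abs(d - gt) > threshold:
                bad += 1
    if total == 0:
        raise ValueError("the mask selects no pixels")
    return 100.0 * bad / total


# Made-up 2 x 3 example: two of the five non-occluded pixels are off by more than 2 px.
estimated = [[10.0, 12.5, 30.0], [8.0, 8.0, 8.0]]
truth = [[10.5, 10.0, 30.0], [8.0, 11.0, 0.0]]
mask = [[True, True, True], [True, True, False]]
rate = mismatch_rate(estimated, truth, mask)
```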

For qualitative analysis, Figure 4 presents the disparity maps of Adirondack, ArtL, MotorcycleE, Piano, Pipes, and Playroom. The disparity maps of the proposed method are clearly close to the ground truth. For MotorcycleE, the disparity map calculated by our method is the best, especially along the edges of the motorcycle. Although the IGF and ISM methods perform comparably to each other, they are not very effective in weak-textured areas because of the limitations of local methods. For Piano, WCPSP and the proposed method obtain fine and smooth results on the piano and stool because of the smoothness constraint. However, WCPSP produces a large number of mismatched pixels in the top right corner, possibly because its matching cost function is not robust to illumination changes. It can be concluded that the proposed method gives the best results among these methods, especially in weak-textured and boundary regions.

Figure 4.

Disparity maps of the Middlebury data sets: (a) left image, (b) ground truth, (c) WCPSP, (d) MSTCD2, (e) DDL, (f) IGF, (g) ISM, (h) PSMNet, (i) SGM-SO, and (j) the proposed method. The data below the image are the mismatching rate (%). From the first row to end, the image pairs are ArtL, Jadeplant, MotorcycleE, Piano, and Shelves, respectively. WCPSP: weighted cost propagation with smoothness prior; DDL: discriminative dictionary learning; IGF: iterative guided filter; SGM-SO: second-order semi-global matching; PSMNet: pyramid stereo matching network; MSTCD2: iterative color-depth.

Center line extraction of switch arm

A pivotal operation in calculating the switch angle is the extraction of the center line of the switch arm. Since the disparity values at the arm edges, which are located at depth discontinuities, are unstable, it is necessary to use the more stable disparity values along the center line of the arm.

In the proposed method, random forest-based edge detection and the Hough transform combined with feature constraints are adopted. Figure 5 presents the results of line detection with different configurations. Images with low light, blurred details, or complex backgrounds are selected for line detection. To demonstrate the effectiveness of the proposed method, four experimental settings for detecting the edges of the switch arm are compared. The first row of Figure 5(a) and (c) shows that, when the constraints are not enforced, the random forest method obtains more accurate edges than the Canny operator, as indicated at the lower edge of the switch arm. Consequently, the center line in Figure 5(d) is clearly more stable than that in Figure 5(b). In the second and third rows, the backgrounds are more complex and chaotic lines occur. In this case, when the constraints are added, the edges of the switch arm are successfully detected, which reveals the positive effect of combining the proposed feature constraints with random forest-based edge detection.

Figure 5.

Results of center line detection with four kinds of experimental settings: (a) Canny, no feature constraint; (b) Canny, feature constraint; (c) random forest, no feature constraint; and (d) random forest, feature constraint. The red lines are the detected lines without constraints, while green lines are detected with constraints. If parallel lines do not exist, the longest line is chosen. The yellow line is the center line extracted by the proposed algorithm.
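The selection rule stated in the caption (use a pair of parallel lines when one exists, otherwise fall back to the longest line) can be illustrated with the Python sketch below. It is not the authors' feature-constrained Hough transform: the function names, the angular tolerance, and the sample segments are assumptions, the segments are presumed to come from a prior line detector, and endpoint matching by sorting assumes the arm is not close to vertical in the image.

```python
import math


def _ordered(seg):
    """Sort the two end points (by x, then y) so that paired segments align."""
    return tuple(sorted(seg))


def _length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def _orientation(seg):
    """Undirected orientation of a segment in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi


def center_line(segments, max_angle_diff_deg=3.0):
    """Mid-line of the longest near-parallel pair, else the longest segment."""
    segs = [_ordered(s) for s in segments]
    best_pair, best_len = None, 0.0
    for i in range(len(segs)):
        for j in range(i + 1, len(segs)):
            diff = abs(_orientation(segs[i]) - _orientation(segs[j]))
            diff = min(diff, math.pi - diff)  # orientations wrap around at pi
            total = _length(segs[i]) + _length(segs[j])
            if math.degrees(diff) <= max_angle_diff_deg and total > best_len:
                best_pair, best_len = (segs[i], segs[j]), total
    if best_pair is None:
        return max(segs, key=_length)  # fallback: no parallel pair was found
    (a1, a2), (b1, b2) = best_pair
    return (((a1[0] + b1[0]) / 2.0, (a1[1] + b1[1]) / 2.0),
            ((a2[0] + b2[0]) / 2.0, (a2[1] + b2[1]) / 2.0))


# Hypothetical detections: upper and lower arm edges plus one background line.
lines = [((100, 200), (400, 190)), ((100, 230), (400, 220)), ((50, 50), (60, 300))]
center = center_line(lines)
```

A real system would add further constraints (for example, a plausible spacing between the two edges) so that unrelated parallel background lines are not paired.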

Switch images

Stereo calibration

To recover the 3D information of the object, we adopt Zhang’s calibration method⁵¹ to obtain the intrinsic and extrinsic parameters of the cameras. Calibration accuracy is a crucial factor in binocular stereo vision because it affects the accuracy of the subsequent 3D reconstruction. The intrinsic and extrinsic parameters obtained from the binocular camera calibration are presented in Tables 2 and 3, respectively. f_x and f_y are the horizontal and vertical focal lengths, $(u_{0}, v_{0})$ are the center coordinates in the image coordinate system, k₁ and k₂ are the radial distortion parameters, and p₁ and p₂ are the tangential distortion parameters.

Table 2.

Intrinsic parameters of camera.

Parameters	f_x	f_y	u ₀	v ₀	k ₁	k ₂	p ₁	p ₂
Left camera	1034.38	1035.31	393.30	294.97	$- 5.69 \times 10^{- 1}$	3.73	$- 5.52 \times 10^{- 3}$	$- 2.83 \times 10^{- 3}$
Right camera	1036.16	1036.16	427.45	276.88	$- 5.06 \times 10^{- 1}$	0.71	$- 3.18 \times 10^{- 3}$	$5.21 \times 10^{- 5}$

Table 3.

Extrinsic parameters of camera.

Parameters	R	T
	$[\begin{matrix} 0.9990 & - 0.0160 & - 0.0457 \\ 0.0165 & 1.0000 & 0.0109 \\ 0.0455 & - 0.0117 & 0.9990 \end{matrix}]$	$[\begin{matrix} - 94.66 \\ - 0.262 \\ - 0.661 \end{matrix}]$

In the extrinsic parameters, R and T describe the pose relationship between the two cameras: R is the rotation matrix and T is the translation vector. Ideally, when the cameras are placed horizontally and in parallel, R is an identity matrix, which means there is no relative rotation; the calibrated R in Table 3 is close to this case. The magnitude of the first component of T is the baseline distance, $B = 9.466 cm$ in this article.
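As a quick sanity check on Table 3, the Python sketch below derives the baseline length and the overall relative rotation angle from the calibrated R and T. The variable names are illustrative, and T is assumed to be expressed in millimetres, which is inferred from the stated baseline of 9.466 cm rather than given explicitly.

```python
import math

# Values copied from Table 3; T is assumed to be in millimetres.
R = [[0.9990, -0.0160, -0.0457],
     [0.0165, 1.0000, 0.0109],
     [0.0455, -0.0117, 0.9990]]
T = [-94.66, -0.262, -0.661]

# Baseline: Euclidean length of the translation between the two cameras.
baseline_mm = math.sqrt(sum(t * t for t in T))

# Rotation angle from the trace of R: trace(R) = 1 + 2 * cos(angle).
trace = R[0][0] + R[1][1] + R[2][2]
cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
rotation_deg = math.degrees(math.acos(cos_angle))
```

Under these assumptions the full translation length is about 94.66 mm, essentially equal to its first component, and the relative rotation is roughly 2.6°, which is consistent with two cameras mounted nearly in parallel.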

Measurement analysis

Because the code of some compared methods, such as IGF and ISM, is not available, their quantitative results in the “Evaluation on Middlebury v.3 data set” section are taken directly from the Middlebury homepage. In this section, we use four stereo matching methods (i.e. WCPSP, MSTCD2, DDL, and SGM-SO) to analyze the measurement performance in comparison with our method. These techniques can be used to detect the angle of an isolating switch in the power system. Generally, when the angle is 0° or within a preset threshold, the switch arm is regarded as fully closed and operating properly. Because the accuracy of the angle detection is chiefly affected by angle and distance, the stereo images are captured at different distances and angles using binocular industrial cameras.

Figure 6 presents the image pairs captured at three angles (0° (fully closed), 5°, and 11°) and three distances (1 m, 1.5 m, and 2 m). WCPSP, MSTCD2, DDL, SGM-SO, and the proposed method are employed to generate the disparity map for each stereo pair. Figure 7 shows the disparity maps of these five stereo matching methods at a distance of 1.5 m for the three angles (0°, 5°, and 11°). The matching cost function of WCPSP relies only on the linear accumulation of the traditional truncated absolute difference (AD) and gradient content, which is sensitive to patterned regions and image noise and is inferior to a learning-based dissimilarity measure. Visually, Figure 7(a) shows obvious mismatched points in the leftmost and background regions. In the nonlocal MSTCD2 method, a minimum spanning tree structure is built for each pixel, and competitive performance against the guided image filter in both efficiency and accuracy has been reported for this method. However, because the minimum spanning tree structure is no longer unique in low-textured regions, erroneous disparity values are obtained there, which is consistent with the black holes in Figure 7(b). In addition, good performance on slanted surfaces is hard to guarantee because cost aggregation is performed on independent cost slices. In contrast, our method performs cost aggregation based on an edge discrimination strategy and incorporates the cost slices adaptively, so that slanted or curved surfaces are handled properly. The SGM-SO method shows a “false obesity” phenomenon between the insulators, mainly because of excessive slanted plane iterative optimization. Moreover, compared with DDL, our method shows fewer scanline effects and better distinction of object edges, especially the edges of the switch arm. It can be concluded that even on these challenging data, our method retains the best disparity accuracy.

Figure 6.

Binocular images of the isolating switch at different angles and distances. Columns from left to right: low (0°), medium (5°), and high (11°) angle. Rows from top to bottom: short (1 m), medium (1.5 m), and long (2 m) distance. (a)-(c) are images captured at 1 m distance with the three angles. (d)-(f) are images captured at 1.5 m distance with the three angles. (g)-(i) are images captured at 2 m distance with the three angles.

Figure 7.

Disparity maps of five methods at 1.5 m distance: (a) WCPSP, (b) MSTCD2, (c) DDL, (d) SGM-SO, and (e) the proposed method. Rows from top to bottom represent 0°, 5°, and 11°, respectively. WCPSP: weighted cost propagation with smoothness prior; DDL: discriminative dictionary learning; IGF: iterative guided filter; SGM-SO: second-order semi-global matching; MSTCD2: iterative color-depth.

In addition, the quantitative evaluation covers two aspects: (i) the length of the switch arm at different distances and angles and (ii) the accuracy of the angle at different distances and angles. The calculation of the arm’s length and angle is based on the 3D coordinate transformation shown in formula (17). Table 4 compares the length of the switch arm calculated in 3D space by the five stereo matching methods in nine situations of distance and angle. In the actual measurement, the two outermost screws on the arm are regarded as the end points of the arm, and the true length is 20.8 cm. As can be seen from the table, our method estimates an accurate and smooth disparity map, and its results are closer to the true length than those of the other methods. For WCPSP, the results are abnormal in conditions 1.5/5, 1.5/11, and 2/11 (139.69, 131.15, and 54.01 cm) because outliers in the disparity map cause huge disparity jumps between adjacent pixels. Table 5 lists the switch arm angles calculated by the five stereo matching methods. At an angle of 0° and a distance of 1 m, WCPSP and SGM-SO give 0.95° and 0.60°, respectively, whereas our method gives 0.09°, which reduces the angle error by a factor of roughly 7 to 10 compared with these two methods.

Table 4.

Length calculation of the switch arm under different distances (cm) and angles (°).

Distance/angle	1/0	1/5	1/11	1.5/0	1.5/5	1.5/11	2/0	2/5	2/11
WCPSP	20.73	20.64	20.62	20.53	139.69	131.15	20.05	20.66	54.01
MSTCD2	20.35	20.22	20.18	20.27	20.19	21.34	19.92	19.48	19.34
DDL	20.47	20.64	20.42	20.53	20.67	20.16	19.59	19.52	20.3
SGM-SO	20.38	20.57	20.29	20.63	20.40	19.98	19.78	19.56	20.18
Proposed	20.92	20.83	20.81	20.83	20.70	20.45	20.95	20.66	20.74

WCPSP: weighted cost propagation with smoothness prior; DDL: discriminative dictionary learning; SGM-SO: second-order semi-global matching; MSTCD2: iterative color-depth.

Table 5.

Angle calculation of isolating switch under different distances and angles.

Distance/angle	1/0	1/5	1/11	1.5/0	1.5/5	1.5/11	2/0	2/5	2/11
WCPSP	0.95	5.07	11.30	0.37	14.76	13.23	1.10	6.16	8.78
MSTCD2	1.28	7.23	12.08	0.37	7.14	11.68	0.32	7.86	9.88
DDL	0.97	5.07	12.30	0.43	6.01	11.83	2.36	7.35	11.20
SGM-SO	0.60	6.26	12.08	0.68	7.14	11.68	1.38	7.86	12.02
Proposed	0.09	5.07	11.29	0.37	4.90	11.83	1.10	6.16	11.20

WCPSP: weighted cost propagation with smoothness prior; DDL: discriminative dictionary learning; SGM-SO: second-order semi-global matching; MSTCD2: iterative color-depth.

To demonstrate the accuracy of each method more intuitively, absolute errors are calculated. The absolute error Δε is defined as follows

\Delta \varepsilon = | \varepsilon_{m} - \varepsilon_{c} |

where $\varepsilon_{m}$ is the true value and $\varepsilon_{c}$ is the calculated value.

The absolute errors computed from Tables 4 and 5 are plotted in Figure 8(a) and (b), respectively. As can be seen from Figure 8, WCPSP has the worst performance. As the distance increases, the absolute errors become less stable and tend to grow, which is apparent from the upward trend of the curves of the five methods. Nevertheless, our method keeps the errors of length and angle within 0.35 cm and 1.16°, respectively, and shows the most stable performance.
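The stated error bounds can be checked directly against the “Proposed” rows of Tables 4 and 5. The short Python sketch below applies the absolute error definition to those rows; the variable and function names are illustrative, and the true values are the 20.8 cm arm length and the nominal angles of 0°, 5°, and 11°.

```python
TRUE_LENGTH_CM = 20.8
TRUE_ANGLES_DEG = [0, 5, 11, 0, 5, 11, 0, 5, 11]

# "Proposed" rows of Tables 4 and 5, ordered 1/0, 1/5, 1/11, 1.5/0, ..., 2/11.
proposed_length = [20.92, 20.83, 20.81, 20.83, 20.70, 20.45, 20.95, 20.66, 20.74]
proposed_angle = [0.09, 5.07, 11.29, 0.37, 4.90, 11.83, 1.10, 6.16, 11.20]


def absolute_errors(calculated, true_values):
    """Element-wise |true value - calculated value|."""
    return [abs(t - c) for c, t in zip(calculated, true_values)]


length_err = absolute_errors(proposed_length, [TRUE_LENGTH_CM] * 9)
angle_err = absolute_errors(proposed_angle, TRUE_ANGLES_DEG)
print(round(max(length_err), 2), round(max(angle_err), 2))  # prints "0.35 1.16"
```

The largest length error occurs at 1.5 m and 11°, and the largest angle error at 2 m and 5°.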

Figure 8.

Absolute error comparison among five methods: (a) absolute error comparison of switch arm’s length and (b) absolute error comparison of switch angle. WCPSP: weighted cost propagation with smoothness prior; DDL: discriminative dictionary learning; SGM-SO: second-order semi-global matching; MSTCD2: iterative color-depth.

Conclusion

To deal with the safety hazard caused by switch disconnection in the power system, an efficient binocular stereo vision-based state measurement system is proposed in this article. The system comprises camera calibration and rectification, an improved SGM algorithm, and center line extraction of the switch arm. In particular, data- and structure-driven cost volume fusion and a weighted- and edge discriminated-smoothness prior are proposed for SGM-based stereo matching. The cost volume fusion strategy compensates for the shortcomings of a single data-driven cost and improves the robustness of the dissimilarity measure. Meanwhile, the small-weight-accumulation problem in weak-textured regions is suppressed effectively by the proposed edge weighted tree structure, and the discriminated-smoothness prior constructs appropriate regularization terms by enforcing different penalties on different cost layers, so that both slanted planes and disparity discontinuities are handled well. For line detection, after the isolating switch is roughly located with an SVM classifier and the HOG feature vector of the switch, the random forest method is employed to detect the contours in the switch image because of its stability in combination with the Hough transform. Then, the feature-constrained Hough transform is adopted to remove chaotic background lines and extract the center line of the switch arm. Within the binocular stereo vision framework, the disparity map obtained by semi-global stereo matching and the center line of the switch arm are combined, so that whether the switch is fully closed can be determined based on the triangulation principle. Accurate measurement also depends on the quality of the stereo calibration, for which we adopt Zhang’s calibration method to calculate the intrinsic and extrinsic parameters of the cameras.

Experimental results on the Middlebury v.3 data set show that the proposed matching method achieves more accurate disparity maps than the other advanced methods evaluated. Moreover, the quantitative measurement analysis on switch images shows that our system keeps the measurement errors of length and angle within 0.35 cm and 1.16°, respectively, and gives the most stable performance.

For the convenience of the experiments and the validation, the current work is carried out only in a laboratory environment, and its robustness in real-world scenarios needs to be verified in future research. Moreover, the dynamic processing of visible image sequences captured during the closing process, and their joint processing with infrared images, can be explored for more effective continuous state monitoring of isolating switches.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Key R&D Program of China (no. 2018YFC0406900), the Key Research and Development Program of Jiangsu Province (no. BE2018092), and the State Grid Jiangsu Electric Power Co., Ltd Science and Technology Project (no. J2019104).

ORCID iD

Jinxin Xu

References

1. Pylvanainen, Nousiainen, Verho. Studies to utilize loading guides and ANN for oil-immersed distribution transformer condition monitoring. IEEE Trans Power Deliver 2007; 22(1): 201–207.

2. Maraaba, Al-Hamouz, Al-Duwaish. Estimation of high voltage insulator contamination using a combined image processing and artificial neural networks. In: 2014 power engineering & optimization conference, PEOCO 2014, Langkawi, Malaysia, 24–25 March 2014, pp. 214–219. Piscataway, NJ, USA: IEEE.

3. Pernebayeva, Bagheri, James. High voltage insulator surface evaluation using image processing. In: 2017 international symposium on electrical insulating material, ISEIM 2017, Toyohashi, Japan, 11–15 September 2017, pp. 520–523. Piscataway, NJ, USA: IEEE.

4. Xie, Liu, Sforna, et al. On-line physical security monitoring of power substations. Int Trans Electr Energy Syst 2016; 26(6): 1148–1170.

5. Liao. A robust insulator detection algorithm based on local features and spatial orders for aerial images. IEEE Geosci Remote Sens Lett 2015; 12(5): 963–967.

6. Bin, Hao, Wenguang. Study on the method of switch state detection based on image recognition in substation sequence control. In: 2014 international conference on power system technology, POWERCON 2014, Chengdu, China, 20–22 October 2014, pp. 2504–2510. Piscataway, NJ, USA: IEEE.

7. Jin, Zhang. Contamination grades recognition of ceramic insulators using fused features of infrared and ultraviolet images. Energies 2015; 8(2): 837–858.

8. Junzhou, Xueyi. Research of isolator’s status image recognition based on Gabor wavelet and SVM. In: 2010 3rd international conference on advanced computer theory and engineering, ICACTE 2010, Chengdu, China, 20–22 August 2010, pp. 65–68. Piscataway, NJ, USA: IEEE.

9. Liu, Yong, Liu, et al. The method of insulator recognition based on deep learning. In: 2016 4th international conference on applied robotics for the power industry, CARPI 2016, Jinan, China, 11–13 October 2016, pp. 1–5. Piscataway, NJ, USA: IEEE.

10. Xia, Tie, Shi, et al. Insulator recognition based on mathematical morphology and Bayesian segmentation. In: 2017 10th international congress on image and signal processing, biomedical engineering and informatics, CISP-BMEI 2017, Shanghai, China, 14–16 October 2017, pp. 1–5. Piscataway, NJ, USA: IEEE.

11. Zuo, Qian, et al. An insulator defect detection algorithm based on computer vision. In: IEEE international conference on information and automation, ICIA 2017, Macau, China, 18–20 July 2017, pp. 361–365. Piscataway, NJ, USA: IEEE.

12. Potnuru, Bhima. Curvelet transform based statistical pattern recognition system for condition monitoring of power distribution line insulators. Innov Electron Commun Eng 2018; 7: 311–317.

13. Zhang, et al. The automatic identification method of switch state. IJSSST 2016; 17(25): 21.1–21.4.

14. Hui, Zhang, et al. A condition monitoring algorithm based on image geometric analysis for substation switch. In: International conference on intelligent computing and Internet of Things, Harbin, China, 17–18 January 2015, pp. 72–76. Piscataway, NJ, USA: IEEE.

15. Jadin, Taib. Recent progress in diagnosing the reliability of electrical equipment by using infrared thermography. Infrared Phys Technol 2012; 55: 236–245.

16. Ullah, Yang, Khan, et al. Predictive maintenance of power substation equipment by infrared thermography using a machine-learning approach. Energies 2017; 10(12): 1987.

17. Ullah, Khan, Yang, et al. Deep learning image-based defect detection in high voltage electrical equipment. Energies 2020; 13(2): 392.

18. Yang, Wang, Zhang, et al. Analysis on location accuracy for binocular stereo vision system. IEEE Photonics J 2018; 10(1): 7800316.

19. Zhifeng, Zhigang, et al. 3D pose estimation of large and complicated workpieces based on binocular stereo vision. Appl Optics 2017; 56(24): 6822.

20. Wang, Liu, Wang, et al. Research situation and development trend of the binocular stereo vision system. In: 2017 1st international conference on materials science, energy technology, power engineering, MEP 2017, Hangzhou, China, 15–16 April 2017, pp. 020220-1–020220-5. AIP Publishing.

21. Weimin, Siyu, Hui. High-precision method of binocular camera calibration with a distortion model. Appl Optics 2017; 56(8): 2368–2377.

22. Sun, Yang, et al. Camera calibration and its application of binocular stereo vision based on artificial neural network. In: 2016 9th international congress on image and signal processing, biomedical engineering and informatics, CISP-BMEI 2016, Datong, China, 15–17 October 2016, pp. 761–765. Piscataway, NJ, USA: IEEE.

23. Liu, Huo, Zhou, et al. A novel camera calibration method for binocular vision based on improved RBF neural network. In: 2017 2nd CCF Chinese conference on computer vision, CCCV 2017, Tianjin, China, 11–14 October 2017, pp. 439–448. Berlin, Germany: Springer.

24. Jing, Yin, et al. Calibration and stereo image correction for binocular vision system with divergent optical axes arrangement. In: 2013 6th international congress on image and signal processing, CISP 2013, Hangzhou, China, 16–18 December 2013, pp. 729–734. Piscataway, NJ, USA: IEEE.

25. Jie, Jianxun. Dense stereo matching based on region growing. Robot 2017; 39(2): 182–188.

26. Taniai, Matsushita, Sato, et al. Continuous 3D label stereo matching using local expansion moves. IEEE Trans Pattern Anal 2018; 40(11): 2725–2739.

27. Zheng, et al. Confidence-based iterative efficient large-scale stereo matching. Cogent Eng 2018; 5(1): 1427676.

28. Nie, Zhang, et al. Three dimensional point cloud hole repairing strategy for binocular stereo reconstruction. In: 2017 IEEE international conference on robotics and biomimetics, ROBIO 2017, Macau, China, 5–8 December 2017, pp. 2456–2461. Piscataway, NJ, USA: IEEE.

29. Johnson-Roberson, Bryson, Friedman, et al. High-resolution underwater robotic vision-based mapping and three-dimensional reconstruction for archaeology. J Field Robot 2017; 34(4): 625–643.

30. Fokkinga. The Hough transform. J Funct Program 2011; 21(2): 129–133.

31. Scharstein, Szeliski. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int J Comput Vision 2002; 47(1–3): 7–42.

32. Yao, Zhang, Xue, et al. AGO: accelerating global optimization for accurate stereo matching. MultiMedia Model 2018; 10701: 67–80.

33. Zhu, Gao. Stereo matching algorithm with guided filter and modified dynamic programming. Multimedia Tools Appl 2015; 76(1): 1–18.

34. Guo, Han. Graph cut stereo matching based on region consistency. J Northwest Polytech Univ 2017; 35(1): 160–163.

35. Wang, Tang, et al. A combined back and foreground-based stereo matching algorithm using belief propagation and self-adapting dissimilarity measure. Int J Pattern Recognit Artif Intell 2018; 32(6): 18.

36. Yin, Zhu, Yuan, et al. Sparse representation over discriminative dictionary for stereo matching. Pattern Recognit 2017; 71: 278–289.

37. Zbontar, Lecun. Computing the stereo matching cost with a convolutional neural network. In: 2015 IEEE international conference on computer vision and pattern recognition, CVPR 2015, Boston, MA, USA, 8–10 June 2015, pp. 1592–1599. Piscataway, NJ, USA: IEEE.

38. Zhang, Lafruit. Cross-based local stereo matching using orthogonal integral images. IEEE Trans Circ Syst Video 2009; 19(7): 1073–1079.

39. Hong, Kim. A local stereo matching algorithm based on weighted guided image filtering for improving the generation of depth range images. Displays 2017; 49: 80–87.

40. Chen, et al. Efficient methods using slanted support windows for slanted surfaces. IET Comput Vis 2016; 10(5): 384–391.

41. Hirschmuller. Stereo processing by semiglobal matching and mutual information. IEEE Trans Pattern Anal 2007; 30(2): 328–341.

42. Mei, Sun, Zhou, et al. On building an accurate stereo matching system on graphics hardware. In: 2011 IEEE international conference on computer vision, ICCV 2011, Barcelona, Spain, 6–13 November 2011, pp. 467–474. Piscataway, NJ, USA: IEEE.

43. Liu, Wang. 3D entity-based stereo matching with ground control points and joint second-order smoothness prior. Vis Comput 2015; 31(9): 1253–1269.

44. Liu, Peng, Qiao. Window-based three-dimensional aggregation for stereo matching. IEEE Signal Process Lett 2016; 23(7): 1008–1012.

45. Chuang, Ting, Jaw. Dense stereo matching with edge-constrained penalty tuning. IEEE Geosci Remote Sens Lett 2018; 15(5): 664–668.

46. Yao, Zhang, Xue, et al. As-global-as-possible stereo matching with adaptive smoothness prior. IET Image Process 2019; 13(1): 98–107.

47. Yang. Stereo matching using tree filtering. IEEE Trans Pattern Anal 2015; 37(4): 834–846.

48. Zhou, et al. Texture discrimination-enforced matching cost computation and smoothness-weighted cost regularization for stereo matching. J Electron Imaging 2019; 28(5): 053025.

49. Dollár, Zitnick. Structured forests for fast edge detection. In: 2013 IEEE international conference on computer vision, ICCV 2013, Sydney, NSW, Australia, 1–8 December 2013, pp. 1841–1848. Piscataway, NJ, USA: IEEE.

50. Tribak, Moughyt, Zaz, et al. Remote QR code recognition based on HOG and SVM classifiers. In: International conference on informatics and computing, ICIC 2016, Mataram, Indonesia, 28–29 October 2016, pp. 137–141. Piscataway, NJ, USA: IEEE.

51. Zhang. A flexible new technique for camera calibration. IEEE Trans Pattern Anal 2000; 22(11): 1330–1334.

52. Wang, Huang, et al. Invalid-point removal based on epipolar constraint in the structured-light method. Opt Laser Eng 2018; 105: 173–181.

53. Scharstein, Szeliski. Middlebury stereo evaluation - ver. 2, http://vision.middlebury.edu/stereo/eval (2002, accessed 22 December 2019).

54. Yang. Local smoothness enforced cost volume regularization for fast stereo correspondence. IEEE Signal Process Lett 2015; 22(9): 1429–1433.

55. Yao, Zhang, Xue, et al. Iterative color-depth MST cost aggregation for stereo matching. In: 2016 IEEE international conference on multimedia and expo, ICME 2016, Seattle, WA, 11–15 July 2016, pp. 1–5. Piscataway, NJ, USA: IEEE.

56. Hamzah, Ibrahim, Abu Hassan. Stereo matching algorithm based on per pixel difference adjustment, iterative guided filter and graph segmentation. J Vis Commun Image R 2017; 42: 145–160.

57. Hamzah, Kadmin, Hamid, et al. Improvement of stereo matching algorithm for 3D surface reconstruction. Signal Process Image 2018; 65: 165–172.

58. Chang, Chen. Pyramid stereo matching network. In: 2018 31st IEEE/CVF conference on computer vision and pattern recognition, CVPR 2018, Salt Lake City, UT, 18–23 June 2018, pp. 5410–5418. Piscataway, NJ, USA: IEEE.

59. Liu, et al. Second-order semi-global stereo matching algorithm based on slanted plane iterative optimization. IEEE Access 2018; 6: 61735–61747.