Abstract
Aiming at the limited fitting conditions of existing 3D circle feature extraction algorithms, a 3D circle feature extraction algorithm based on nonparametric estimation is proposed. Firstly, kd-tree is used to establish topological relationship and define the distance threshold. Edge points are extracted by K-nearest neighbor search. Then use DBSCAN (Density-Based Spatial Clustering of Applications with Noise) to cluster edge points for calculate the maximum distance between edge points in the cluster. Finally, the optimal interval of the maximum distance value is calculated using nonparametric estimation. The center point is obtained by extracting the circular hole feature points. The degree of dispersion from the edge point in the cluster to the center point is analyzed to determine whether the cluster is a three-dimensional circle feature. The algorithm is verified using the grid assembly printed parts (arch circular hole) and rectangular flange (plane circular hole) as the test workpiece. The results show that the algorithm can effectively extract three-dimensional round hole features.
Introduction
Reverse engineering has become increasingly popular in industrial inspection in recent years. The point cloud data of workpiece is usually collected using laser scanning, and the non-contact detection of the workpiece is completed through data processing.1–3 The 3D circle feature in the point cloud model is an essential geometric characteristic with substantial research implications for model reconstruction and reverse engineering. As a result, the point cloud model circle feature extraction has both theoretical and practical utility.4–8
Some researchers extract model features using local neighborhood characteristics and geometric properties of point clouds to perform feature extraction in point cloud models. Shi and Wang 9 constructed boundary point set removal and merging criteria as well as candidate feature point set smoothing criteria to identify feature points from boundary point sets of each segment. However, the recognition process of the method is more difficult for locations with weak feature strength. Xue et al. 10 used the least square fitting method to recognize the feature of the three-view slices of the point cloud model. When the point cloud distribution was uneven, the detection accuracy of the point cloud model will be affected. Demarsin et al. 11 used the first order segmentation to extract candidate feature points, and recovered the sharp feature lines by visually processed. But this approach cannot operate effectively in distributed point cloud models. Wang et al. 12 segmented point cloud data using a clustering technique and extracted point cloud feature information in each subregion with defined bounds. However, the feature line extraction is still broken or partial in the point cloud model with additional curve feature information. For multiparameter K-means clustering, Wang et al. 13 presented an adaptive point cloud reduction technique. In the closed model, the approach has a considerable simplification effect. But the computation of characteristic parameters was complicated. Deep learning had been used to extract edges and recognize holes in point clouds by Tabib et al, 14 Qi et al, 15 Wang et al, 16 Bazazian et al, 17 Zhang et al. 18 When processing large-scale point cloud data, this approach required high computer performance.
This paper offers a 3D circle feature extraction approach based on nonparametric estimation to address the concerns of the aforementioned point cloud feature extraction accuracy and constrained fitting conditions. Distance density analysis is used to extract the edge point set, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is used to achieve edge point clustering. When the number of clusters to be formed is unknown, DBSCAN can recognize clusters of arbitrary shape in noisy environments, and the order of data input has little effect on the results. Nonparametric estimation is used to provide the ideal interval of the maximum distance value, and feature points are extracted for circular feature identification. The detection process is outlined in Figure 1.

3D round hole feature detection process.
Point cloud model preprocessing algorithm
The preprocessing of point cloud model is a key step in detection. The purpose is to simplify the point cloud data and improve the computational efficiency and detection accuracy on the premise of maintaining the geometric characteristics of the point cloud. The topological relationship is established to realize the data indexing and denoising of 3D point cloud.
Establishment of topological relationships
3D point cloud is irregularly distributed in space and has no positional relationship. In the process of hole edge extraction and recognition, it is necessary to create point cloud data index and point cloud topology relationship to improve the processing efficiency of point cloud model. The kd-tree approach is utilized to build the spatial topological relationship of point clouds in this article, and K-Nearest Neighbor (KNN) is used to find the k-nearest neighbor. 19
Extraction of point cloud edges
When the line laser is scanning and modeling, there are typically noise spots due to interference from the surroundings. To prepare for future point cloud feature extraction, point cloud noise reduction is required during the preprocessing stage. In this study, radius filtering and statistical filtering are employed to deal with discrete point cloud noise.
The primary goal of point cloud edge extraction is to eliminate unnecessary points and save calculation time. The point set P = {pi | pi
Calculate the density of a point cloud
After searching for the nearest k points of all points with KNN, compute the distance D i between point p i and its nearest k points, and then average them all to obtain the point cloud density ρ(p i ). This is shown in (1):
Set the distance threshold
The distance threshold n is calculated using (2)
where u is the threshold coefficient. To adjust the threshold n,
(1) Set the threshold coefficient μ and calculate the distance threshold n.
(2) Perform edge recognition and effect display.
(3) Using the edge extraction effect as a guide, adjust the threshold coefficient μ to produce the ideal threshold n with the best extraction effect.
Extraction of point cloud edges
Find each point p i and its nearest k points, compute the distance D i between point p i and its nearest k points, and find a value larger than the threshold n in D i (i = 1,2…,k). The edge point q i (x,y,z) corresponding to the distance D j is designated as the distance point D j and saved in a new point set E(q i ). The interior points are allocated a value of 1, while the edge points are assigned a value of 0. This is shown in (3):
Update the threshold coefficient
After traversing all the points and identifying the edge points, the edge points are visualized, and the effect of edge extraction is analyzed. If the edge extraction is not obvious, return to Step (2) and adjust the value of the threshold coefficient μ.
Cluster analysis of point clouds
After acquiring the edge dataset E(q i ), the DBSCAN technique is used to perform clustering analysis on the data to efficiently categorize edge points and decrease the computation of circular feature extraction E(q i ). 20
In the DBSCAN algorithm, there are two important parameters: epsilon and Minpts, where epsilon is the neighborhood radius when defining the density, and Minpts is the minimum number of data points within the radius of the sphere. The algorithm principle is as follows:
The smiley face point cloud model was used for verification. Set epsilon to 1 and Minpts to 4. The algorithm diagram is shown in Figure 2(a). As seen in Figure 2(b), the points with different results are marked with different colors, and those without colors are discrete points.

DBSCAN clustering algorithm: (a) schematic diagram of algorithm principle and (b) clustering results.
Point cloud circle feature recognition and extraction algorithm
Firstly, the maximum distance feature in the cluster is calculated after clustering. Secondly, the distribution function of maximum distance feature is optimized using nonparametric estimation and the feature histogram is established. The hole feature is extracted and the circular center position is calculated. Finally, the rationality of detecting circular hole features is determined by the degree of dispersion from the edge point to the center position.
Circle feature density histogram interval calculation
In this research, a maximum distance distribution technique based on nonparametric estimation 21 is suggested to separate point cloud edge shape features and noise points for circular feature detection and extraction, taking into account the complicated porous structure.
After clustering the edge of point cloud E(q i ), the Cluster C = {C l | C l , l = 1,2,3,···, m} is produced, where p i (C l ) is the coordinate of a point in C l . Calculate the greatest point-to-point Euclidean distance in C l using (4):
Continue to identify the two points with the greatest distance among the remaining points by storing p i (C l ) and p j (C l ) in sets A and B, respectively. f(Ai, Bi) → MaxD i is the equivalent relationship between the points and the maximum distance. The diameter is the maximum point distance, and f(Ai, Bi) → MaxD i may be used to locate the circular feature’s stable point.
To evaluate the degree of data density, the density histogram usually has to assume parameters. 16 This work uses the probability density function to show the degree of sparseness of point cloud data and nonparametric function estimation to determine the interval of the probability density function to better identify feature points from nonfeature points.
The error function Err(x) is defined as the optimization criterion for nonparametric estimation to make the fitted probability function more accurate, as demonstrated in (5):
where y is the actual value, and
Err(x) is represented by the function deviation Bias and variance Var, where Bias is the discrepancy between the fitted function and the real function, and Var is the degree of dispersion of the fitted values since f(x) and ε cannot be calculated. As shown in (7):
After setting the optimization standard, estimate the density histogram with the nonparametric function to obtain the ideal interval width h*. The steps of the algorithm are as follows:
(1) First, as shown in Figure 3, the distance between two points in cluster C l is computed and shown as a distance density histogram.
(2) The interval is set to B
k
, interval width to h, probability function to v
k
, and related probability density function to
(3) Because each interval and distance data have a one-to-one relationship, the probability function
(4) The actual distance distribution function f(x) exists and is determined. Then, the expectation of f(x) is itself. Therefore, the expectation and variance of
(5) According to the median theorem shown in equation (12), the variance is integrated to calculate the

Diagram of density histogram.
According to Taylor series expansion formula (12), as follows:
Substituting into equation (10), as follows:
According to the generalized integral median theorem, it can be calculated as:
Integrating the squared bias of equation (16) yields the
Calculate AMISE (Asymptotic mean integrated squared error) by combining formulas (13) and (17) into formula (7), and when Err(x) is the minimum, get the best solution h* of parameter h:
(6) The type of distance histogram distribution is determined. After analysis, the histogram of the distance between two points in cluster C l is an approximately normal distribution. Then, f ∼ N(μ,σ 2 ), where μ, σ are determined according to the actual distribution of the data, and the ideal interval h* is derived, as indicated in (12):
Circle feature recognition and extraction
A maximum distance feature density histogram is created once the aforementioned approach is executed, as shown in Figure 4.

Distance feature histogram.
Set a threshold T based on the sparseness of feature points, extract feature points, and record the number n of feature points. The feature estimation has the most stability when T = 0.5, according to the experimental effect study. The cluster C l ’s center point O l (x l , y l , z l ) is thus represented in (21):
Where N is The number of points in cluster C l , and the extracted feature point is p i (MaxD). This is determined if Cluster C l is a 3D circular feature based on the discrete degree σ of the distance from each point in Cluster C l to O l (x l , y l , z l ).
As indicated in (22), the distance probability density histogram can be μ and σ.
Assuming that Cluster C
l
is a circle feature, the radius
The judgment condition of whether it is a circle feature is shown in (24):
When J = 1, the cluster C l is found to be a circular feature cluster by the formula. The feature points are those that fulfill the maximum distance interval h*.
Randomly select a point cloud cluster of a hole. Figure 5(a) is the effect diagram after denoising, edge point extraction and point cloud clustering. It can be seen that there are still many discrete points around the hole. Figure 5(b) shows the distribution of circular hole feature points extracted from the maximum distance feature density histogram optimized by nonparametric estimation. The distribution of red dots is relatively uniform. The result indicates that the feature points of circular hole can be accurately extracted and increase the accuracy of circular hole discrimination.

Distribution diagram of feature points: (a) feature points distribution and (b) optimized feature points distribution.
Comparison and analysis of tests
Construction of test platform
A binocular laser scanner, high-precision motion slide, liftable screw, motion motor control module, and optical precision damping vibration isolation platform comprise the 3D point cloud scanning detection system employed in this research, as shown in Figure 6. To realize overall workpiece scanning, a fixture is used to fix the grid assembly on the placing table, the height of the z-axis is set, the x-axis is scanned repeatedly with the dual-line laser scanner, and the y-axis of the placing table is quantitatively moved. The entire scanning system is fixed on an optical precision damping vibration isolation platform to reduce the influence of the slight ground vibration on the scanning results.

3D point cloud scanning detection system. 1 – Lifting screw. 2 – Binocular laser scanner. 3 – Workpiece placement table to be tested. 4 – High-precision motion slide. 5 – Optical precision damping vibration isolation platform. 6 – Sports console.
Tables 1 and 2 illustrate the speed of the sliding table, positive direction of movement, movement distance, and fixed parameters of the line laser.
Fixed parameters of binocular laser.
Parameters of mobile slide.
Point cloud model
The surface depth information of the part is measured using double line laser sensor. The 3D coordinates of each point on the workpiece surface can be obtained through the registration of the slide encoder information of x-axis, y-axis. The point cloud dataset of the parts is constructed by registration and splicing, and 3D reconstruction is completed. The scanning scheme of point cloud model is as follows:
(1) Completely scan the analyte in the positive
(2) Calculate the coordinates of all scanned data points using the high-precision sliding table coordinate system, filter out any redundant points that have no information, and acquire coordinate information for all points;
(3) The
Test design
A comparative experiment of 3D circle feature extraction is planned to validate the 3D circle feature recognition and extraction technique proposed in this paper. First, the edge of the point cloud is extracted for the point cloud model of grid assembly and plane rectangular flange. Second, a circular hole was randomly selected, a density histogram was constructed using different interval widths, and the feature probability density function was fitted to verify the influence of the density histogram interval width h on the feature extraction accuracy and to prove the necessity of nonparametric estimation optimization. And last, the extracted circle features are tested and compared using the traditional circle feature recognition technique (the Least Squares and Random Sampling Consensus (RANSAC) 22 ) and the algorithm presented in this research.
Result analysis
Figure 7(a) shows the point cloud model of grid assembly, which has a total of 1,981,594 points. Figure 7(b) shows the related edge point cloud model, which has a total of 125,785 points after edge extraction.

Grid assembly point cloud model edge extraction renderings: (a) before edge extraction and (b) after edge extraction.
The edge points are clustered to obtain 102 hole categories, as shown in Figure 8:

Renderings of hole clustering of grid assembly.
The other piece to be tested is a rectangular flange, with a total of 1,017,564 points in the point cloud model. By using a distance density based edge point extraction algorithm, a total of 12,800 points were extracted for subsequent calculations after denoising, as shown in Figure 9.

Plane flange point cloud model and edge extraction.
The edge points are clustered to obtain five hole categories, as shown in Figure 10.

Renderings of hole clustering of rectangular flange.
Randomly select a cluster as shown in Figure 11, the ideal histogram interval h* is 0.15 mm according to formula (21). And AMISE calculates Err(x) using formula (19). Calculate the density function error and create a density histogram for data comparison analysis using the interval widths h = 0.01, h = 0.3, and h = 0.6 mm, respectively. As shown in Figure 12(a)–(d), the AMISE of the corresponding characteristic density functions of h* = 0.15, h = 0.01, and h = 0.3 mm, and h = 0.6 mm are 0.0491, 0.5348, 0.0729, and 0.2292 mm2.

Randomly extracting round hole clusters.

Nonparametric estimation optimization comparison: (a) h* = 0.15, AMISE = 0.0491, (b) h = 0.01, AMISE = 0.5348, (c) h = 0.3, AMISE = 0.0729, and (d) h = 0.6, AMISE = 0.2292.
The error Err(x) of the feature probability density function under nonparametric estimation is smaller in AMISE, and the greater the distance h* = 0.15 mm, the larger the Err(x). It can be shown that the density histogram’s interval width h has an impact on the probability density function’s correctness, which in turn has an impact on feature point selection. There are 187 random circular holes in total, with the fraction of extracted feature points falling within the P (0.25 < X < 0.75) probability interval. The probability of noise in the retrieved feature points changes as the interval width h increases, as shown in Table 3. If it is large, the fraction of feature points exceeds the probability interval P, resulting in a larger function deviation and affecting the center of the circle calculation accuracy. The genuine circle center should be near h = 0.3 and h* = 0.15, according to the data in Figure 12 and the change trend of the circle center coordinates in Table 3. The z-axis coordinate data displays a declining trend due to the growing proportion of noise points involved in the calculation between the point cloud layers.
Effects of different interval widths h on the histogram and the calculation results of the center of the circle.
The choice of the interval width of the density histogram is particularly crucial to the accuracy of fitting the probability density function. A nonparametric estimation method is used to obtain the ideal interval width h*, which reduced function fitting error and improved the accuracy of 3D circular feature extraction. The circular hole recognition results of the point cloud model are plotted in Figure 13, including grid assembly and plane flanges.

The circular hole recognition results: (a) grid assembly and (b) plane flanges.
The Least Square method, RANSAC algorithm and the algorithm proposed in the paper are used for algorithm comparison test, and the results are depicted in Figure 14. Figure 14(a) and (d) demonstrate the fitting effect and position of the center of a circle in 3D space using the Least Square method. In the case of uneven edge density distribution, there are significant errors in circle fitting and center calculation. Using RANSAC algorithm, the fitting effect and position of the circle center are shown in Figure 14(b) and (e). It can be seen that due to the structure of the circular hole on the arch surface, the normal vector has a large deviation between the fitting circle and the actual circle, resulting in inaccurate identification. The proposed algorithm is used for processing. The optimal maximum distance interval is selected through nonparametric estimation. The optimal circle feature points are extracted and the center of the circle is calculated, as illustrated in Figure 14(c) and (f). The discrete degree of the distance from the edge point to the center by the three methods is 1.3354, 0.3925, and 0.0417, respectively. The precision of center coordinates obtained by this method is obviously better than that of the Least Square method and RANSAC algorithm, and 3D circular hole feature can be accurately identified and extracted. Compared with the Least Square method and RANSAC algorithm, the method can reduce the noise of edge data, so as to obtain better feature extraction effect. Therefore, the proposed algorithm in this paper has better accuracy in the case of uneven distribution of point clouds and noise points in practical applications.

Comparison of circle feature recognition results: (a and d) circle fitted by Least Square method and display of calculated circle center in 3D space, (b and e) circle fitted by RANSAC algorithm and calculated center of circle are displayed in 3D space, and (c and f) circle feature points extracted by paper algorithm and display of calculated circle center in 3D space.
Finally, the accuracy of the algorithm is verified by comparing with the measured value of micrometer plug gage. Several groups of data are randomly selected to test the diameter measurement error. The results are shown in Tables 4 and 5. The measured values of micrometer plug gage and the calculated values of the algorithm in this paper are given in the tables. The last three columns of data are the xyz coordinates of the circle center by the algorithm. Compared with the measured value of the plug gage, the average measurement error of arch round hole of the algorithm in this paper is 11.5 μm, the maximum measurement error is 26 μm, and the standard deviation is 8.1 μm. The average measurement error of plane circular hole is 11.4 μm, the maximum measurement error is 25.6 μm, and the standard deviation is 7.8 μm. The results show that the proposed algorithm is suitable for both arch and plane circular holes, and it has high precision.
Arch round hole test results.
Plane circular hole test results.
Conclusion
Using nonparametric estimation methods, the paper proposes an algorithm for 3D round hole feature detection based on point cloud feature extraction. The algorithm effectively avoids the calculation deviation caused by traditional algorithm circle fitting. It can be seen from Figure 14 that compared with the Least Squares method and RANSAC algorithm, the center coordinates obtained by this algorithm are more accurate and the extraction of edge data is more effective. Thus, the precision of 3D circular hole feature extraction is improved.
By building an experimental testing platform, algorithm testing was conducted on the grid assembly device printed parts (arch round hole) and rectangular flange (plane round hole). The average error of the diameter detection results of the circular hole on the arch surface is 11.5 μm. The standard deviation is 8.1 μm. The average error of the diameter of flat circular hole is 11.4 μm. The standard deviation is 7.8 μm. The results indicate that the proposed algorithm can effectively extract 3D circular hole features from point clouds with dispersed circular holes on arch surfaces and planar surfaces. The research results are expected to provide a certain theoretical basis and technical support for automatic detection of 3D circular hole feature extraction.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors would like to acknowledge financial support provided by the Scientific and Technological Project of Songjiang District in Shanghai (20SJKJGG08C) and Jiangsu Provincial Engineering Research Center of Robot and Intelligent Manufacturing Equipment Open Subject (SDGC2254).
Data availability statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
