Abstract
This article presents a fast and robust plane segmentation approach for RGB-D type sensor, which detects plane candidates by line segments extracted from 2-D scanline projected from row or column points. It neither requires high computation to calculate local normals for the entire point cloud as most of approaches do nor randomly chooses plane candidates such as RANSAC-like approaches. First, a line extraction algorithm is utilized to extract line segments. Second, the plane candidates are detected by estimating local normal of points lying on line segments. Finally, the plane having most inlier is recursively extracted from the plane candidates as the result plane. Experiments were conducted with different data sets and the segmentation performances were evaluated quantitatively and qualitatively. We demonstrated the efficiency and robustness of our proposed approach, especially in the none plane scenario, the approach needs little computational cost.
Introduction
An indoor scenario usually consists of many 3-D planar surfaces, such as walls, tabletops, ceilings, and floors, which would benefit a wide range of robotic applications, especially tabletop objection detection, 1,2 planar simultaneous localization and mapping (SLAM), 3 –5 and semantic mapping. 6 In manipulation application, the plane is usually detected to obtain background of workspace. While in navigation, the plane feature is more compact and contains more structure information, compared to other types of feature, such as point, line, arc, and so on.
The availability of inexpensive RGB-D sensor (Microsoft Kinect, PrimeSense, and ASUS XtionPro) has made available dense 3-D point clouds, which were previously only accessible using much more expensive sensors like time-of-flight cameras or scanning 3-D laser range finders. There has been much interest in plane segmentation using RGB-D sensor beforehand. Beyond the existing plane extraction approaches, 7 –9 we present a new plane extraction method based on line primitive. Considering the organized property of point cloud from RGB-D sensor, we utilize line extraction algorithm efficiently on row or column cloud points. Lines can be treated as the subset of the plane features. Thus, by estimating the normal of the point lying on line segment, we can find the plane candidates and then extract planar regions by a recursive extraction framework.
The rest of this article is organized as follows: The “Related work” section discusses previous related work to our research. The “Plane segmentation approach” section presents the proposed novel plane segmentation approach based on line primitives. Experimental results and a comparison of three typical approaches with the proposed approach are presented in the “Experiment” section. The final section gives the conclusion.
Related work
Several approaches to plane segmentation of point cloud data have been proposed in literature. Rabbani 7 et al. classified segmentation algorithms into three main varieties: edge-based segmentation, surface-based segmentation, and scanline-based segmentation. These three main varieties differ in the method or criterion used to measure the similarity between a given set of points for making the grouping decisions.
In recent years, the inexpensive RGB-D sensor can obtain both RGB and depth images at a frame rate up to 30 Hz in VGA resolution. The organized property of depth image provides a chance to address the plane segmentation problem beyond traditional approaches. Approaches that exploit organized point cloud data have been developed. 9 –11
The RANSAC-like 8 procedures are widely used by the open source Point Cloud Library. 12 These approaches can segment plane accurately and shows efficiency while applied on unorganized data or data that are downsampled from organized point cloud, but is much slower than approaches that can exploit organized data. Some RANSAC-based approaches have been extended to exploit organized structure, such as the approach of Biswas and Veloso, 13 which enables real-time performance. Hulik et al. 14 proposed a tiled RANSAC for the ground plane search by taking into account only small areas of the scene in depth image. Because of its small random sample search, the tiled RANSAC can be used in real-time systems, which can reach a speed of multiple frames per second. However, RANSAC tends to oversimplify complex planar structures, for example, multiple small steps were often merged into one sloped plane. 15
Rabbani et al. 7 presented a region-growing-based approach using smoothness constraint, which finds smoothly connected areas in point clouds by utilizing local surface normals and point connectivity which can be enforced using either k-nearest or fixed distance neighbors. It shows effectiveness on unstructured point cloud. Similar approaches are presented by Weingarten et al., 16 Poppinga et al., 10 and Brenneke et al. 17
Organized point cloud or range image segmentation has also been well investigated in various approaches. Poppinga et al. 10 developed a region-growing-based approach that selects a random point and incrementally updates the centroid and covariance matrix as points are added. Holz and Behnke 11 extended this approach by precomputing surface normals for the cloud and incrementally updating the plane’s normal equation, which further reduces the computational cost. Trevor et al. 9 proposed an approach that does not utilize seed points for region growing but instead processes each point sequentially and does plane fitting after the image has been fully segmented. Both tabletop object segmentation and indoor mapping show the efficiency and accuracy of the algorithm. Salas-Moreno et al. 5 presented a similar approach to detect planes using connected component labeling. 18
Guan et al. 19 treat the range image as a Markov random field and solve the association problem using graph-based global energy minimization, which encapsulates both appearance cues from the RGB (color) channels and shape cues from the D (depth) channel. This approach suggests significant segmentation quality at genuine plane edges and plane intersections and also automatically fills in missing depth information. Matsumoto et al. 20 proposed a piecewise plane fitting approach based on normal adaptive segmentation and graph component labeling to reduce noise and holes in depth map captured with an RGB-D camera. It can work in real time by utilizing highly parallel processing capabilities of graphics processing unit (GPUs).
The most similar approaches to ours are the scanline-based approaches. 21 –23 In range images, each row can be considered a scanline, which can be treated independently of other scanlines in the first stage. This kind of approach detects line segments from the scanlines and then groups the adjacent lines with similar properties to form planar segments. In contrast to scanline-based approaches, our proposed approach utilizes line segments on a few scanlines to detect plane candidates, instead of line segments from all the scanlines. Moreover, it neither requires to compute normals for the entire point cloud, which is high computational cost, nor it randomly chooses plane candidates like RANSAC-like approaches do.
Plane segmentation approach
Figure 1 shows the flowchart of the proposed plane segmentation approach. The input is the organized point cloud. Several rows or columns of points are projected to get 2-D scanlines. Then, a line regression segmentation algorithm is applied to extract line segments on each scanline. Plane candidates are detected by the local normals of points lying on line segments. Finally, a recursive extraction framework that recursively extracts the plane having most inlier is designed.

Flowchart of the plane segmentation approach.
Line extraction
We select several rows of points in organized point cloud at a fixed interval. Each set of points in a row is treated as a 2-D laser scan for line segmentation. Optionally, column points cloud also be used as scanlines for line segmentation to achieve better plane segmentation performance. Here, we only discuss the case of row points, case of column points follows the similar procedure.
As shown in Figure 2, a 2-D scan point S can be obtained from point P by equations (1) and (2). To reduce the computational cost, we use equations (3) to calculate Sx providing known camera intrinsic parameters

Scan point projection. Coordinate oxy is the image frame, and p(u, v) is a point on the selected row of ranging image. OXY is camera frame or world frame, and P(x, y, z) is a point in organized point cloud. OsXsYs is the scan frame, and S(x, y) is a 2-D projected scan point. f is the focal length.
The camera intrinsic parameters of a pinhole camera model
where fx and fy represent the focal length on x-axis and y-axis in terms of pixels; cx and cy represent the principal point. Figure 3 shows an example of a 2-D scan projected from middle row points of an organized point cloud.

Example of 2-D scan projected from one row in organized point cloud: selected row is the middle blue straight line in (a) and the projected scan is the black dots in (b).
As in the study of Arras and Siegwart 24 and Siegwart et al., 25 a line regression algorithm is adopted for line segmentation. An outline of the algorithm is given in Algorithm 1. We take into account only the z-axis noise of points in organized point cloud for line segmentation, using the empirical axial noise model derived by Nguyen et al. 26 As the accurate line segmentation is not essential for the proposed approach, we omit the noise produced by the surface angle to avoid the unnecessary computational cost. Then, the noise model can be expressed as
Line regression.
For efficiency, we roughly segment line region, without any refinement step to find accurate line segments and line parameter. Each line segment contains the line region inlier for the consequent plane candidates detection.
Plane candidate detection
Planes are parameterized by
Local normal of a point is computed by performing principal component analysis (PCA).
5
A normal smoothing region is applied to search neighborhood points, such as a 20 × 20 square region centering on the selected point. The valid points are selected in the normal smoothing region as inlier. All the inlier are normalized by subtracting its mean c, and then a covariance matrix Σ and its corresponding eigendecomposition are computed. The eigenvector with minimum eigenvalue λmin becomes the local normal
An example of plane candidate detection is shown in Figure 4, where the selected rows are shown in Figure 4(a) as colored straight lines, line segments, and the plane candidates are shown in Figure 4(b) as straight lines and arrows.

Example of plane candidate detection. (a) Scanlines. (b) Line segments and plane candidates. Colored straight lines are line segments. Square regions are normal smoothing regions (20 × 20 pixels). Black arrows represent local normals of the middle point on each line segment.
Recursive plane extraction
We recursively extract the result plane with the most inlier in the point cloud as shown in Algorithm 2. In each loop, the plane candidates are checked if there is enough inlier Ns within the corresponding normal smoothing region, and candidates with
Recursive plane extraction.
Experiment
Our plane segmentation approach has been applied to several typical indoor scenarios from TUM RGB-D data set, 27 including tabletop, room floor, office corner, and complicated scene having no plane, using organized point clouds in both VGA and QVGA resolution. The tabletop scenario is widely involved in manipulation task. The room floor that contains rich structure information is the general scenario in planar SLAM application. The office corner is a scenario that consists of only planar region, while the complex scene consists of few planar region. The organized point cloud of the selected scene is constructed from range image and stored as a pcd file.
As shown in Figure 5, the proposed approach is compared with three other approaches from PCL, 12 namely, region growing segmentation (RGS), 7 RANSAC segmentation (RANSAC), and organized multiplane segmentation (OMPS). 9 All the test data are in VGA resolution, except for Figure 5(f1) to (f4), of which the input data is in QVGA resolution for the proposed approach.

Plane segmentation results of different approaches. (a1)–(a4) are four indoor scenarios. (b1)–(b4), (c1)–(c4), (d1)–(d4), (e1)–(e4), and (f1)–(f4) are segmentation results of RANSAC, RGS, OMPS, proposed, and proposed (in QVGA resolution), respectively. RGS: region growing segmentation; OMPS: organized multiplane segmentation.
In the experiment, for data in VGA resolution, the proposed approach utilized 14 row scanlines and 21 column scanlines at 30 pixels interval, line segmentation sliding window size Nf = 11, minimum line inlier Nl = 30, normal smoothing region 20 × 20 pixels, normal smoothing threshold Ns thresh = 240, point-plane distance threshold dτ = 0.02 m, point neighbor threshold dn = 0.03 m, and minimum plane inlier Nξ thresh = 5000. While for data in QVGA resolution, the proposed approach utilized 11 row scanlines and 15 column scanlines at 20 pixels interval, line segmentation sliding window size Nf = 7, minimum line inlier Nl = 23, and minimum plane inlier Nξ thresh = 1000. For RANSAC, a point-plane distance threshold 0.02 m, maximum iterations number 100, and minimum inlier number 40,000 were applied. For RGS, we utilized 100 nearest neighbors for normal estimation, 20 neighbors for local connectivity check, 3.0° for surface smoothness check, and minimum cluster size 5000. For OMPS, we utilized normal smoothing size 20, angular threshold 3.0°, distance threshold 0.02 m, and minimum inlier 5000.
Runtime result is evaluated by the average time of performing 20 plane segmentation for each organized point cloud. Also a qualitative and quantitative discussion is presented for the proposed approach. All the experiments were executed within a single thread on a laptop equipped with an Intel i7 Quad Core CPU at 2.9 GHz and 4 GB memory.
Runtime
Table 1 presents the running times for the proposed plane segmentation approach, as well as the individual steps of line extraction and plane extraction. The running time mainly depends on the complexity of the scenario. For scenarios with few planar regions, few lines are extracted and few plane candidates are detected to perform subsequent recursive extraction, while scenarios with many planar regions are opposite. Furthermore, the plane extraction procedure takes most of the total computational cost because of the recursive extraction framework.
Approximate running times for line extraction and plane extraction.a
aInput organized point cloud is in VGA resolution.
Table 2 shows the running times when different number of points are selected from each line segment. The running times increased while more points are selected, but the segmentation quantity and quality did not improve.
Approximate running times with respect to number of points selected in each line segment.
Table 3 shows the running times with respect to input point clouds in different resolutions. The proposed approach is capable of real-time implementation for point cloud in QVGA resolution, meanwhile remains the same segmentation performance as in Figure 5(f1) to (f4).
Approximate running times with respect to different data resolutions.
Table 4 compares the running times of the proposed approach with three other approaches. The proposed approach took less computational cost than RGS and RANSAC, while OMPS prevailed. Actually, RANSAC and RGS require downsampling the data to achieve reasonable performance, and thus, they consume much more computational cost using data in VGA resolution. The proposed approach needs more computation to process structured scenario, such as room corner in Figure 5(e3), but this problem can be solved by selecting fewer rows for plane candidates detection or removing reduplicated plane candidates before recursive plane extraction, while the same segmentation performance is achieved. The proposed approach takes little time for highly clustered scene, which shows its advantage over other approaches to deal with complex indoor scenario.
Comparison of running times of four approaches.
RGS: region growing segmentation; OMPS: organized multiplane segmentation.
Quantity and quality
As is shown in Figure 5(e1) to (e4), the proposed approach can correctly achieve planar segments compared to other approaches. Quantitatively, it tends to be oversegmented like RANSAC because of the recursive extraction framework. As in the corner scenario in Figure 5(e3), the floor plane is divided into two segments by a vertical plane that was extracted first. This oversegment problem inheres in the recursive extraction criterion, but can be solved by implementing a refine step to detect and merge two adjacent coplanar regions. Qualitatively, the proposed approach can achieve good plane parameters and well planar regions. However, some segmented planes do not have a good region boundary and contain outliers, e.g. the light blue plane in Figure 5(e1) which is the table surface but contains outliers on the plane boundary, the light green plane in Figure 5(e3) which is the wall but contains floor points. This is due to limitation of using only the point-plane distance threshold and neighbor distance threshold for inlier search, without computing local normals for surface smoothing checking like region-growing approach does.
Conclusion
A segmentation approach for dividing organized point cloud into a set of planar regions has been presented. The approach utilizes the line segments to detect the plane candidates and a recursive extraction framework to extract the planar regions. Compared to other approaches related to organized point cloud segmentation, it is free of computing the local normal for all the points. Unlike RANSAC, the plane candidate is not randomly chosen, instead under a line region assumption. We showed the efficiency and robustness of the proposed approach by applying it to various indoor scenarios and comparing it with three other typical approaches. The proposed approach tends to be oversegment planar regions and get false positive point inlier because of the recursive extraction criterion. Future work can add a final refine step to merge adjacent coplanar regions, remove reduplicated plane candidates and design a mechanism to self-adjust the number of selected scanlines to balance the computational cost and the segmentation performance.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by the Major Subject of Beijing Science and Technology(D141100003614002).
