Sage Journals: Discover world-class research

Abstract

Mobile robots frequently navigate through corridor environments. When implementing graph Simultaneous Localization and Mapping (SLAM) in a corridor environment, it is important to construct a precise pose graph. However, scan matching often fails owing to the lack of features, resulting in false edges. To address the false-edge issue, we propose the Validation Grid Map Network (VGM-Net), a Convolutional Neural Network (CNN) based approach that enhances the robustness of scan matching validation. We propose the Validation Grid Map (VGM), based on a Gaussian model, as the input data for VGM-Net. In our approach, the VGM provides a grid-based Gaussian representation of raw light detection and ranging (LiDAR) data, capturing the spatial clustering and key geometric features that are essential for reliable scan matching. This representation preserves critical structural information and enables efficient feature extraction within the CNN framework, which improves discrimination between valid and erroneous scan matches. The VGM transforms previous and current scan data into a grid map based on a Gaussian model and is computed by combining the hit ratio with the Gaussian-modeled grid map. VGM-Net classifies scan matching failures with high accuracy by learning from examples of scan matching failures and successes. We evaluated the rejection accuracy for false edges using an open dataset and scan data measured in a practical corridor environment. The experimental results demonstrate the robustness of the proposed VGM-Net approach.

Keywords

Graph SLAM, scan matching, validation, CNN, LiDAR

Introduction

Indoor mobile robots are now being employed in diverse settings, such as warehouses, industries, and ports, while autonomous vehicles face similar navigation challenges in outdoor traffic environments.¹ Maps play a crucial role in facilitating autonomous navigation in both domains, which necessitates the use of Simultaneous Localization and Mapping (SLAM) for generating maps. There are studies^2,3 that update maps when the environment has changed over the long term. Other research^4,5 focuses on tracking and mapping in the environment with numerous dynamic obstacles. The purpose of SLAM is to obtain accurate maps that are representative of the environment.

The SLAM methods^6–10 are widely used in robotics and computer vision for simultaneously estimating the location of a robot or vehicle¹¹ and constructing a map of its surroundings, particularly in places that are unfamiliar or unknown. In recent times, researchers have investigated SLAM extensively by using graph-based methodologies.^12–14 Recent advances in multisensor fusion have significantly improved SLAM robustness, particularly through tightly coupled approaches such as LIO-SAM¹⁵ and LVI-SAM,¹⁶ which integrate light detection and ranging (LiDAR), visual, and inertial sensors to achieve superior performance in challenging environments.

However, in indoor corridor environments where distinctive features are sparse, conventional scan matching often yields false edges due to the accumulation of failed matches, or missing edges caused by local minima. When such failed or missing edges are stored in the front-end, the result of Graph SLAM degrades. To overcome these challenges, we propose a robust front-end validation procedure that incorporates the Validation Grid Map (VGM) method. The VGM is a grid-based Gaussian representation of 2D LiDAR data that focuses on spatial clustering and preserves critical geometric information. Odometry data are required to construct the front-end.

To obtain accurate odometry data, LiDAR scan matching is mainly used. Accurate odometry data help construct a robust front-end. By optimizing the front-end graph with back-end methods, it is possible to obtain accurate maps.

The Normal Distributions Transform (NDT)^17,18 and Iterative Closest Point (ICP)^19–21 methods have been investigated for obtaining precise LiDAR odometry data. To obtain accurate odometry data, a Gaussian model was used in conjunction with the NDT method to represent registered scan data. This modeling approach aimed to maximize the score function along the desired direction. While these Gaussian-based methods have proven effective for odometry estimation, they do not directly address the challenge of validating scan matching results in feature-sparse environments.

The ICP algorithm searches for point pairs by calculating the closest distance and, subsequently, calculates odometry data by minimizing the objective function along the optimal direction. The approach of rejecting outlier points at the nearest point was adopted to realize robust odometry, as described in Jo and Moon.²² The generalized ICP algorithm, which was proposed in Segal et al.,²³ was demonstrated to be robust owing to its utilization of a point-to-plane approach. TrICP²⁴ utilized rearranged point pairs in ascending order in the point matching stage to achieve robustness. In recent years, Kiss-ICP, which includes a motion compensation method and a subsampling scheme,²⁵ has been adapted to various real-world environments.

However, scan matching does not always succeed. Handling outlier constraints is crucial when optimizing the factor graph. Without a validation process during the construction of the factor graph, the optimization can fail. During the construction of the front-end of Graph SLAM, the presence of false-positive (FP) constraints leads to a distorted map in the back-end of Graph SLAM. The switchable constraint approach²⁶ modifies the topology of the factor graph to obtain robustness against outlier edges. The Max mixture model²⁷ ensures robustness by characterizing outlier edges as non-Gaussian behavior. GraphTinker²⁸ introduced the GTk intermediate layer to eliminate outlier constraints. The SubMap matching approach²⁹ validates matches twice, using a dynamic object removal method and a unique factor graph structure that accounts for initialization bias, which enhances the robustness of Graph SLAM. The Graph-Cut RANdom SAmple Consensus method³⁰ identifies and eliminates outliers with low computational cost and limited complexity.

Validation of scan matching results, one form of edge validation, involves determining the outcome of scan matching and, subsequently, organizing it into edges. Although the scan matching validation system is simple, it performs consistently, especially when used with a laser sensor. To validate scan matching, the ICP covariance matrix was computed using the point-as-landmarks approach.³¹ In another study, the performance of scan matching was analyzed by focusing on the features of point clouds.³² That study examined the effects of many factors on the performance of scan matching, including overlap ratio, distance, angle, and point noise. Recent comprehensive surveys on 3D LiDAR SLAM³³ have highlighted the critical importance of robust validation mechanisms in maintaining system reliability across diverse environmental conditions. In recent years, deep-learning-based SLAM has increasingly attracted researchers' attention.^34–36 Several researchers have approached the scan matching validation challenge in graph SLAM as a classification problem.

In He et al.,³⁷ a Convolutional Neural Network (CNN) was used to address the classification problem with a notable level of accuracy. The CNN has demonstrated remarkable efficacy in many applications. However, it is challenging to use laser sensor data with a CNN because scan points are unordered. In Jeong and Lee,³⁸ this unordered-data problem was addressed by training on plot images of the scan matching results. Subsequently, a validation process was executed to assess the effectiveness of this approach. As a result, it was possible to store LiDAR registration data and transformed scan data as RGB image data, a straightforward and precise approach.

Recent advances in deep learning-based loop closure detection³⁹ have demonstrated the effectiveness of neural networks in handling challenging environmental conditions, including illumination changes and dynamic objects, which traditional methods struggle to address. However, RGB images of scan data are distorted when SLAM is executed in a corridor environment. The limitations of this approach are its dependence on the plot's viewpoint and its sensitivity to parameter settings. When the image data are unstable because they were obtained under varying environmental conditions, network optimization is challenging, and the efficacy of scan matching validation is adversely affected.

Furthermore, emerging graph neural network-based approaches⁴⁰ have shown promising results in addressing spatial consistency verification for SLAM applications, particularly in complex indoor environments where traditional feature-based methods often fail. To realize robust Graph SLAM, reliable preprocessing of scan data is imperative, even during operation in corridor environments.

To enhance the robustness of graph SLAM, we propose the VGM-Net to validate the scan matching results. Existing validation methods face significant challenges in corridor environments, particularly when dealing with feature-sparse conditions and measurement uncertainties. Traditional RGB-based approaches suffer from viewpoint dependency and parameter sensitivity, while conventional grid-based methods often lose critical geometric information through binary discretization. Our approach addresses these fundamental limitations through a novel probabilistic framework that preserves the inherent uncertainty characteristics of LiDAR measurements.

Unlike the previous method,³⁸ which converts LiDAR data into RGB images and suffers from viewpoint dependency in corridor environments, we introduce a novel VGM based on a Gaussian model that preserves the probabilistic nature of LiDAR measurements and spatial clustering information. Our VGM transforms raw 2D laser scan data into a grid-based Gaussian representation that effectively captures essential geometric features while reducing data dimensionality through a specialized convolutional layer for enhanced computational efficiency. The main contributions include (1) a novel Gaussian-based grid representation that handles LiDAR measurement uncertainty, (2) a CNN architecture designed for probabilistic grid data, and (3) experimental validation in challenging corridor environments. While this study focuses on indoor mobile robot applications, the proposed validation approach is applicable to autonomous driving systems, where accurate scan matching validation is equally critical for reliable localization and safe navigation in complex traffic environments.

The remainder of the study is structured as follows. The “Graph SLAM using VGM-Net” and “Validation scheme with VGM and VGM-NET” sections explain the application of VGM-Net to Graph SLAM, the proposed VGM-Net, and its input data, the VGM. In the “Experiments and results” section, the proposed validation system is compared with an existing neural network-based validation system through a series of experiments. Finally, the “Conclusion” section reviews and summarizes the results.

Graph SLAM using VGM-Net

Graph SLAM without a validation method tends to construct an inaccurate pose graph, and a false pose graph hinders optimization in the back-end. The scan matching validation method distinguishes between successes and failures in scan registration by comparing current scan data with previous scan data.

Failures of scan matching are caused by error function problems and local minima. Scan matching failures due to error functions result in no matching of points. In Xie et al.,²⁸ the robustness of SLAM was enhanced by determining scan matching failures caused by the error function problem. Scan matching failure by a local minimum represents a state where a subset of points is not matched. Local failures due to local minima lead to a situation in which the back-end is not optimized, and the mapping process fails with the accumulation of errors.

Figure 1 illustrates the comprehensive framework of the proposed VGM-Net validation system integrated with Graph SLAM. The framework operates through a sequential pipeline that begins with LiDAR sensor data acquisition, where consecutive scan frames are captured for processing. The raw scan data undergo scan matching to establish point correspondences between the previous and current frames, generating matched point pairs that serve as input for the subsequent validation process.

Figure 1.

Graph SLAM framework with VGM-Net.

The VGM generation stage transforms the matched point data into a VGM representation. Unlike conventional approaches, the VGM preserves the probabilistic characteristics of LiDAR measurements by encoding spatial clustering information and geometric features through Gaussian distributions. This probabilistic representation effectively captures the inherent uncertainty in sensor measurements, which is particularly crucial in feature-sparse corridor environments.

The generated VGM serves as input to VGM-Net, a specially designed CNN architecture that performs binary classification to determine scan matching validity. The VGM-Net leverages the probabilistic information encoded in the VGM to distinguish between successful and failed scan matching results with high accuracy. The network identifies not only complete matching failures but also detects subtle local failures that conventional validation methods often miss.

Based on the VGM-Net classification result, the framework follows one of two distinct paths. Valid scan matching results are integrated into the Graph SLAM front-end as reliable edges, contributing to robust pose graph construction and subsequent back-end optimization. Conversely, invalid matches are immediately discarded, preventing the accumulation of erroneous constraints that would otherwise degrade mapping accuracy. This selective validation mechanism enhances the overall robustness of the Graph SLAM system, particularly in challenging indoor environments where traditional scan matching methods frequently fail.
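The accept-or-discard step described above can be sketched as a filter over candidate edges. This is a minimal illustration and not the authors' implementation: `EdgeCandidate`, `accept_valid_edges`, and `toy_validator` are hypothetical names, and the validator is a stand-in for the trained VGM-Net classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class EdgeCandidate:
    """Hypothetical container for one scan matching result."""
    source: int                            # index of the previous pose node
    target: int                            # index of the current pose node
    transform: Tuple[float, float, float]  # relative pose (dx, dy, dtheta)


def accept_valid_edges(
    candidates: List[EdgeCandidate],
    is_valid: Callable[[EdgeCandidate], bool],
) -> List[EdgeCandidate]:
    """Keep only candidates that the validator classifies as successful matches."""
    return [edge for edge in candidates if is_valid(edge)]


# Stand-in validator for illustration only: a real system would call the
# trained classifier here instead of checking the translation magnitude.
def toy_validator(edge: EdgeCandidate) -> bool:
    dx, dy, _ = edge.transform
    return (dx * dx + dy * dy) ** 0.5 < 1.0


candidates = [
    EdgeCandidate(0, 1, (0.10, 0.02, 0.01)),
    EdgeCandidate(1, 2, (5.00, 0.00, 0.00)),  # implausible jump, rejected by the toy rule
]
front_end_edges = accept_valid_edges(candidates, toy_validator)
```

Only the edges returned by the filter would be added to the pose graph; rejected candidates never reach the back-end optimizer.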

Validation scheme with VGM and VGM-NET

Validation grid map

In this section, we describe how to transform the continuous scan data into a Gaussian distribution in the grid. This transformation enables the preservation of spatial clustering and the inherent uncertainty of LiDAR measurements, providing a more informative representation than traditional occupancy models. The Occupancy Grid Map (OGM)⁴¹ defines the state of each grid cell from the scan data, and each state is determined by thresholding a probability value.

In contrast, our VGM approach does not use an occupancy threshold; it differs from traditional methods through its probabilistic foundation. The advantage of VGM over OGM lies in its probabilistic representation of spatial uncertainty. While OGM uses binary or discrete probability values that may lose critical geometric information, VGM preserves the continuous distribution of LiDAR measurements. In corridor environments, scan matching failures often occur due to (1) insufficient geometric features causing ambiguous correspondences, (2) noise in LiDAR measurements creating uncertainty in point locations, and (3) repetitive structures leading to multiple local minima.

The VGM addresses these issues by encoding measurement uncertainty through Gaussian parameters, preserving spatial clustering information that indicates feature density, and maintaining geometric relationships between consecutive scans. The mathematical distinction is significant. For a grid cell containing LiDAR points, OGM assigns a discrete probability based on the ratio of hits to total rays, which essentially quantizes spatial information. In contrast, VGM computes a continuous Gaussian distribution where the mean represents the central tendency of measurements, and the covariance captures the spatial spread. This provides richer information for scan matching validation and better reflects the inherent uncertainty in sensor measurements.

A set of data points $Z = \{z_{1}, z_{2}, z_{3}, \dots, z_{n}\}$ represents the data obtained from LiDAR measurements. The LiDAR data consist of 2D points, denoted $z_{k} = (x_{k}, y_{k})$, in the Cartesian coordinate system. 2D LiDAR data points are typically clustered around obstacles that are close to the sensor. Figure 2 presents an example of the characteristics of a 2D LiDAR sensor. Here, α is the angular resolution of the LiDAR sensor.

Figure 2.

Characteristics of a 2D LiDAR sensor.

Given that $α$ is the angular resolution of the LiDAR sensor, and the ranges of adjacent LiDAR beams measuring the left and right walls are denoted as $a_{1}$, $b_{1}$ and $a_{2}$, $b_{2}$, respectively, we calculate the spacings $d_{1}$ and $d_{2}$ between adjacent scan points using the law of cosines, as expressed in equation (1):

$$d^{2} = a^{2} + b^{2} - 2ab\cos α \tag{1}$$

Because α is constant and $a_{1}$ and $b_{1}$ are smaller than $a_{2}$ and $b_{2}$, respectively, $d_{1}$ is smaller than $d_{2}$. Therefore, the LiDAR scan points are distributed more densely on the left wall. We use this property of 2D LiDAR points, which cluster on nearby objects, to generate the VGM.
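The density argument can be checked numerically with equation (1). The sketch below uses the 0.25° angular resolution reported for the sensor later in the paper; the beam ranges (about 1 m for the near wall and about 4 m for the far wall) are hypothetical values chosen only for illustration, and `beam_spacing` is an illustrative helper name.

```python
import math


def beam_spacing(a: float, b: float, alpha_deg: float) -> float:
    """Distance between the end points of two beams with ranges a and b
    separated by the angle alpha, from the law of cosines in equation (1)."""
    alpha = math.radians(alpha_deg)
    d_squared = a * a + b * b - 2.0 * a * b * math.cos(alpha)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(d_squared, 0.0))


ALPHA_DEG = 0.25  # angular resolution of the sensor

# Hypothetical ranges of two adjacent beams on each wall.
d1 = beam_spacing(1.00, 1.01, ALPHA_DEG)  # near (left) wall
d2 = beam_spacing(4.00, 4.04, ALPHA_DEG)  # far (right) wall

print(f"d1 = {d1:.4f} m, d2 = {d2:.4f} m")
```

With the same angular step, the spacing grows with the beam ranges, so the nearer wall receives more points per unit length.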

Let $M \in \mathbb{R}^{W \times H}$ denote the grid map, where W and H represent the grid dimensions. The VGM generation process transforms continuous LiDAR data into a discrete probabilistic representation through spatial discretization and statistical parameter estimation. We discretize the LiDAR data $z_{k} = (x_{k}, y_{k})$ from continuous space to grid space through the mapping function $(i, j) = (⌊ x_{k} / r ⌋, ⌊ y_{k} / r ⌋)$, where r denotes the grid resolution and $⌊ \cdot ⌋$ represents the floor operation. This discretization partitions the workspace into uniform cells, enabling efficient spatial indexing of measurement data.

Subsequently, we perform point clustering by grouping all LiDAR points that fall within each grid cell. The clustering operation is defined as $C (m_{i}) = \{ z_{k} \in Z \mid ⌊ x_{k} / r ⌋ = i, ⌊ y_{k} / r ⌋ = j \}$, where $C (m_{i})$ represents the set of points clustered within grid cell $m_{i}$, and $k = | C (m_{i}) |$ denotes the cardinality of this set. For each grid cell containing $k > 0$ clustered points, we compute the Gaussian distribution parameters through standard maximum likelihood estimation. The mean vector is calculated as equation (2):

$$μ_{i} = \frac{1}{k} \sum_{z_{k} \in C (m_{i})} z_{k} \tag{2}$$

The covariance matrix, which captures the spatial dispersion of measurements within the cell, is computed as equation (3):

$$Σ_{i} = \frac{1}{k} \sum_{z_{k} \in C (m_{i})} (z_{k} - μ_{i}) (z_{k} - μ_{i})^{T} \tag{3}$$
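The discretization, clustering, and parameter estimation steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' code: `cluster_points` and `mean_and_covariance` are hypothetical helper names, and the three sample points are made up.

```python
import math
from collections import defaultdict


def cluster_points(points, r):
    """Group 2D points by grid index (floor(x / r), floor(y / r))."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(math.floor(x / r), math.floor(y / r))].append((x, y))
    return dict(cells)


def mean_and_covariance(cell_points):
    """Maximum likelihood mean and 2x2 covariance of the points in one cell.

    The covariance divides by k (not k - 1), matching equations (2) and (3).
    """
    k = len(cell_points)
    mx = sum(p[0] for p in cell_points) / k
    my = sum(p[1] for p in cell_points) / k
    sxx = sum((p[0] - mx) ** 2 for p in cell_points) / k
    syy = sum((p[1] - my) ** 2 for p in cell_points) / k
    sxy = sum((p[0] - mx) * (p[1] - my) for p in cell_points) / k
    return (mx, my), ((sxx, sxy), (sxy, syy))


# Made-up scan points in metres, with grid resolution r = 0.05 m.
points = [(0.01, 0.01), (0.03, 0.03), (0.06, 0.01)]
cells = cluster_points(points, r=0.05)
mean, cov = mean_and_covariance(cells[(0, 0)])
```

A cell that holds a single point has a zero covariance matrix, so any later density evaluation needs some form of regularization; how the paper handles that case is not stated.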

These parameters fully specify the Gaussian distribution for each grid cell. In equation (4), $m_{i}$ represents the i-th grid cell, $Z_{1, 2, \dots, k}$ denotes the set of k clustered LiDAR points within that cell, and $N (\cdot | μ_{i}, Σ_{i})$ is the Gaussian distribution characterized by mean $μ_{i}$ and covariance $Σ_{i}$ derived from equations (2) and (3):

$$m_{i} (Z_{1, 2, \dots, k}) \sim N (z | μ_{i}, Σ_{i}) \tag{4}$$

$$p (m_{i} | z_{k}) = \begin{cases} N (x_{c} | μ_{i}, Σ_{i}), & k > 0 \\ 0, & k = 0 \end{cases} \tag{5}$$

To assign a probability value to each grid cell, we evaluate the Gaussian distribution at the geometric center of the cell. The center point of grid cell $(i, j)$ is defined as $x_{c} = (i \cdot r + r / 2, j \cdot r + r / 2)$. As shown in equation (5), $N (x_{c} | μ_{i}, Σ_{i})$ is the multivariate Gaussian probability density function evaluated at the grid center $x_{c}$. It is computed as the normalized exponential of negative one-half the squared Mahalanobis distance between $x_{c}$ and the mean $μ_{i}$, where the distance is defined with the inverse covariance matrix $Σ_{i}^{- 1}$. The normalization constant is determined by the determinant of the covariance matrix to ensure that the probability density integrates to unity over the entire space. For empty cells where no LiDAR points are clustered ($k = 0$), the probability is assigned as zero to indicate the absence of measurements in that spatial region.
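The evaluation at the cell center can be sketched as below. The function names are illustrative, and the small diagonal term `eps` is an assumption added here so that cells with a degenerate (zero) covariance remain invertible; the paper does not say how such cells are treated.

```python
import math


def gaussian_pdf_2d(x, mean, cov):
    """Bivariate Gaussian density at x for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    maha_sq = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * maha_sq) / (2.0 * math.pi * math.sqrt(det))


def cell_center(i, j, r):
    """Geometric center of grid cell (i, j) for resolution r."""
    return (i * r + r / 2.0, j * r + r / 2.0)


def cell_probability(i, j, r, mean, cov, k, eps=1e-6):
    """Value of the piecewise definition in equation (5).

    eps is an assumed regularization term, not taken from the paper.
    """
    if k == 0:
        return 0.0  # empty cell: no measurements
    regularized = ((cov[0][0] + eps, cov[0][1]), (cov[1][0], cov[1][1] + eps))
    return gaussian_pdf_2d(cell_center(i, j, r), mean, regularized)


# Hypothetical cell (0, 0) at r = 0.05 m whose mean sits exactly at the center.
p_centered = cell_probability(0, 0, 0.05, (0.025, 0.025), ((1e-4, 0.0), (0.0, 1e-4)), k=5)
# Same spread, but the mean is shifted toward a corner of the cell.
p_shifted = cell_probability(0, 0, 0.05, (0.045, 0.045), ((1e-4, 0.0), (0.0, 1e-4)), k=5)
```

Because the density is evaluated at the cell center, a cell whose mean lies near the center receives a larger value than one whose points sit near the cell boundary.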

Figure 3(a) shows the initialization of the grid cells. The 2D LiDAR data are shown as black dots, and G denotes the VGM grid cells, which are initially empty. The 2D LiDAR data are discretized and clustered into the grid cells. For the scan data within each cell, the mean and covariance are calculated as shown in equations (2) and (3), so that each cell has a Gaussian distribution. The Gaussian distribution is then evaluated at the center of the corresponding cell, giving the values shown in Figure 3(b). In Figure 3(b), low-probability areas are shown in light gray and high-probability areas in black. For example, G8 is assigned a higher probability and is colored black because its mean is closer to the cell center.

Figure 3.

An example of how grid cells are assigned stochastic values. (a) 2D LiDAR data and empty VGM, (b) Calculated VGM.

We exploit the characteristic property of 2D LiDAR measurements, wherein scan points cluster densely around nearby obstacles, to generate the VGM representation. To emphasize grid cells with higher measurement density, we introduce a normalized hit ratio that quantifies the relative point concentration within each cell. The hit ratio for grid cell $m_{i}$ is defined as equation (6):

$$h_{i} = \frac{k_{i}}{\max_{j} (k_{j})} \tag{6}$$

where $k_{i} = | C (m_{i}) |$ represents the number of points clustered in cell $m_{i}$, and $\max_{j} (k_{j})$ denotes the maximum point count across all grid cells. This normalization ensures that $h_{i} \in [0, 1]$, with cells containing the highest point density receiving a hit ratio of 1, while cells with fewer measurements receive proportionally lower values. The hit ratio serves as an attention mechanism, directing the validation process to focus on grid cells with substantial measurement information.

The final input data used in VGM-Net is calculated through the weighted combination of probability and hit ratio as shown in equation (7):

$$I (i, j) = h_{i} \times p (m_{i} | z_{k}) \tag{7}$$

The VGM input I for VGM-Net is thus calculated as the element-wise product of the hit ratio $h_{i}$ and the grid probability $p (m_{i} | z_{k})$ . This formulation ensures that grid cells with both high measurement density and high Gaussian probability contribute more significantly to the scan matching validation process, while regions with sparse or absent measurements are appropriately suppressed.
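Equations (6) and (7) can be sketched together as follows. The dictionary-based grid, the function names, and the sample counts and probabilities are assumptions made for illustration only.

```python
def hit_ratios(counts):
    """Normalized hit ratio h_i = k_i / max_j k_j for every cell (equation (6))."""
    k_max = max(counts.values(), default=0)
    if k_max == 0:
        return {cell: 0.0 for cell in counts}  # no measurements anywhere
    return {cell: k / k_max for cell, k in counts.items()}


def vgm_input(counts, probabilities):
    """Element-wise product of hit ratio and cell probability (equation (7)).

    Cells without a probability entry are treated as empty (probability 0).
    """
    h = hit_ratios(counts)
    return {cell: h[cell] * probabilities.get(cell, 0.0) for cell in counts}


# Made-up point counts and cell probabilities for three cells.
counts = {(0, 0): 4, (1, 0): 2, (2, 0): 0}
probabilities = {(0, 0): 0.5, (1, 0): 0.8}

h = hit_ratios(counts)
vgm = vgm_input(counts, probabilities)
```

In this toy case the densest cell keeps its probability unchanged, the half-as-dense cell is scaled by 0.5, and the empty cell stays at zero.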

Validation Gaussian grid map-network

The design of VGM-Net addresses fundamental limitations in existing validation approaches through systematic architectural considerations. Traditional RGB-based methods³⁸ suffer from viewpoint dependency and lose critical geometric information during LiDAR-to-image conversion. Geometric verification methods rely on fixed thresholds that perform poorly in feature-sparse environments. Our design process began by identifying the core requirements: preserving probabilistic information, maintaining computational efficiency, and capturing spatial correlation patterns.

The transition from VGM representation to neural network processing requires careful consideration of the data structure and computational efficiency. The VGM provides a two-dimensional grid where each cell contains Gaussian parameters along with hit ratio information. The key insight driving our network design is that successful scan matching should exhibit consistent spatial patterns in the VGM representation, which can be learned through deep feature extraction.

When scan matching succeeds, the Gaussian peaks from previous and current scans align spatially, creating coherent patterns across the grid. Conversely, when scan matching fails, the Gaussian distributions show significant spatial displacement or exhibit low correlation between consecutive frames. This observation led us to design a network that could effectively distinguish these spatial correlation patterns.

The choice of a 1 × 1 convolution layer as the initial processing step is theoretically motivated. This layer operates on each grid cell independently, computing a weighted combination of the Gaussian distributions from previous and current scan data. This operation is mathematically equivalent to creating a Gaussian Mixture Model at each spatial location, where the learned weights capture the relative importance of historical versus current measurements. The 1 × 1 convolution preserves the spatial structure of the VGM while reducing computational complexity by processing each grid cell independently without considering neighboring spatial relationships, which is appropriate for our probabilistic representation where each cell contains complete Gaussian information.

Figure 4 depicts the architecture of VGM-Net. The VGM serves as the input data for the CNN model, while the processed features from the 1 × 1 convolutional layer are subsequently passed through the network. This architecture ensures computational efficiency comparable to conventional scan matching algorithms while achieving improved feature discrimination for robust validation of scan matching results.

Figure 4.

Architecture of VGM-Net.

According to equation (8), the VGM is computed as it passes through the 1 × 1 convolutional layer:

$$p (M^{1}, M^{2}) = w_{1} N (x | M^{1}, Σ^{1}) + w_{2} N (x | M^{2}, Σ^{2}) \tag{8}$$

The parameter weights w in the 1 × 1 convolutional layer yield results similar to those of a Gaussian mixture model because the VGM represents scan data as a Gaussian model. The main difference between equation (8) and a Gaussian mixture model is that the summation of weights does not necessarily add up to 1. The use of a 1 × 1 convolutional layer reduces the two-dimensional space of VGM to a one-dimensional representation. In successful scan matching cases, the values in equation (8) are smaller than in failed scan matching cases because scan data are dispersed over a wide area. The performance of scan matching validation is influenced by variations in data distribution during data transfer through the network.
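The per-cell weighted combination in equation (8) can be illustrated with a single-filter 1 × 1 convolution over two input channels, one for the previous scan and one for the current scan. This is a simplified sketch: the function name, the tiny 2 × 2 grids, and the weight values are hypothetical, and the actual layer described in the paper uses several filters with learned weights.

```python
def conv1x1_two_channels(prev_grid, curr_grid, w1, w2, bias=0.0):
    """Apply one 1 x 1 filter to a two-channel grid.

    Each output cell depends only on the two input values at the same
    location, so the spatial layout of the grid is preserved.
    """
    return [
        [w1 * p + w2 * c + bias for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_grid, curr_grid)
    ]


# Made-up VGM values for the previous and current scans.
prev_vgm = [[0.2, 0.0],
            [0.0, 0.4]]
curr_vgm = [[0.1, 0.0],
            [0.3, 0.0]]

# Hypothetical weights; as noted above, they need not sum to 1.
combined = conv1x1_two_channels(prev_vgm, curr_vgm, w1=0.5, w2=2.0)
```

Cells where both scans are empty stay at zero, while occupied cells mix the two scans according to the filter weights.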

Experiments and results

Experimental setup

To evaluate the performance of our proposed method, we compare the results with the method in Jeong and Lee³⁸ using the Intel dataset⁴² and data from a practical office environment. The method in Jeong and Lee³⁸ was selected as the primary baseline for several methodological reasons. While recent SLAM research has explored various approaches, including scan-matching optimization methods^11,25,43 and robust back-end optimization techniques,⁴⁴ these methods address fundamentally different aspects of the SLAM pipeline and do not provide comparable validation frameworks for direct performance comparison.

To the best of our knowledge, the method in Jeong and Lee³⁸ is the most methodologically relevant existing approach that employs neural network-based validation of scan matching results in graph SLAM construction, which makes it the most appropriate baseline for evaluating deep learning-based front-end validation performance. Furthermore, it employs comparable edge construction strategies based on scan-matching results and operates under similar computational constraints, ensuring a fair comparison. Alternative approaches such as switchable constraints,²⁶ max-mixture models,²⁷ and graph-cut RANSAC³⁰ differ fundamentally in their validation philosophy: they identify outliers during back-end optimization rather than performing front-end validation, making direct accuracy comparison inappropriate. The Intel dataset^12,45 serves as a widely used benchmark that ensures consistent comparison conditions, while our practical office environment data demonstrate real-world applicability across diverse indoor scenarios with varying geometric complexity.

To obtain sensor data for a typical office space, we collected measurements from multiple locations as shown in Figure 5. In a practical office space with a corridor, we obtained data while navigating the mobile robot.

Figure 5.

A practical office environment with hallways for obtaining training data and performance verification.

Figure 6 depicts a reference map of the experimental location. The experimental area measured 68.0 m in length and 14.0 m in width. The results obtained using the proposed method were compared to those reported in He et al.³⁷ to ensure the accuracy of the proposed method. Subsequently, the VGM-Net method was implemented for validating scan matching to demonstrate the robustness of graph SLAM. The LiDAR data of the experimental area were acquired using an NAV350, a 2D LiDAR sensor designed for indoor use. This sensor has a wide aperture angle of 360°, which allows for broad coverage. Moreover, the sensor has a high angular resolution of 0.25°, which supports precise detection and measurement. Furthermore, the maximum measurable distance of this sensor is 250 m, which is adequate for extensive data acquisition.

Figure 6.

Reference map of performance verification environment, including hallways.

The NAV350 LiDAR sensor was installed on a mobile robot (RB1), and laser scan data of the environments were collected, as illustrated in Figure 7. Figure 7(a) shows the NAV350 LiDAR sensor mounted on RB1, and Figure 7(b) presents the scan data obtained from the NAV350 LiDAR sensor during the operation of the mobile robot. The network was trained on a PC equipped with an i9-13900KF CPU and a GeForce 4090 GPU.

Figure 7.

Measuring experimental data with the NAV350. (a) A robot equipped with the NAV350, (b) Example of data measured in an experimental environment with the NAV350.

Results of the VGM

LiDAR scan data were acquired by operating the mobile robot in the selected indoor environment. The scan matching procedure was executed using the measured scan data, resulting in a total of 1,941 scan matching results.²² The scan matching results were transformed using the VGM. The VGM size was set to 400 × 400 cells, and its resolution was set to 0.05 m.

The detailed parameters of VGM-Net are shown in Table 1. The layer names follow the architecture in Figure 4. The detailed parameters of the network layers were set to 1 × 1 convolutional layer with 16 filters, convolutional layer 1 with 32 filters, convolutional layer 2 with 64 filters, and finally convolutional layers 4 and 5 with 128 filters each. The last multilayer perceptron in Figure 4 used 256 weights. In addition, the sigmoid activation function was used, and the network was optimized with the Adam optimizer. Training was performed with a batch size of 128 over 90 epochs.

Table 1.

Detailed parameters of VGM-Net.

Layer	Stride	Filter size	Output shape
Conv(1 × 1)	1	1 × 1	16 × 400 × 400
Conv1	1	3 × 3	32 × 200 × 200
Conv2	1	3 × 3	32 × 100 × 100
Conv3	1	3 × 3	64 × 50 × 50
Conv4	1	3 × 3	128 × 25 × 25
Conv5	1	3 × 3	128 × 12 × 12
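Table 1 lists a stride of 1 for every layer, yet the spatial size halves from one 3 × 3 layer to the next. One reading, which is an assumption here because the downsampling operator is not stated, is that each 3 × 3 convolution preserves the size and is followed by a 2 × 2 pooling step with floor rounding. The short sketch below reproduces the spatial sizes in the table under that assumption; `spatial_sizes` is an illustrative helper name.

```python
def spatial_sizes(input_size, num_downsampling_layers):
    """Spatial size after each assumed 2 x 2 pooling step (floor division)."""
    sizes = [input_size]
    for _ in range(num_downsampling_layers):
        sizes.append(sizes[-1] // 2)
    return sizes


# The 1 x 1 layer keeps the 400 x 400 grid; Conv1 to Conv5 each halve it.
sizes = spatial_sizes(400, 5)
print(sizes)  # [400, 200, 100, 50, 25, 12]
```

The final step from 25 to 12 is consistent with floor rounding of an odd input size.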

Figure 8 and Figure 9 demonstrate the effectiveness of the proposed scan matching method. The actual experimental environments are shown in Figures 8(a) and 9(a), while the corresponding scan matching results are presented in Figures 8(b) and 9(b). In these figures, black dots represent registration point data, and red dots indicate the transformed point data after applying the matching algorithm.

Figure 8.

Scan matching result and VGM image 1. (a) Real experimental data, (b) Scan matching result, and (c and d) VGM images.

Figure 9.

Scan matching result and VGM image 2. (a) Real experimental data, (b) Scan matching result, and (c and d) VGM images.

The VGM-transformed results, displayed in Figures 8(c), (d) and 9(c), (d), visualize the VGM generation process by overlaying previous scan data with current scan data. This representation clearly shows the spatial alignment achieved through the proposed method.

Comparison of accuracy and precision

To evaluate the accuracy and precision of scan matching validation, we selected 410 loop closing edge candidates from 1,301 scan data frames in the office LiDAR dataset and 254 loop closing edge candidates from 3,900 scan data frames in the Intel dataset.

The comparison between the method of Jeong and Lee³⁸ and the proposed VGM-Net on the office LiDAR dataset is shown in Table 2. VGM-Net achieves 92.34% accuracy compared to 85.32% for the baseline method, an improvement of about 7 percentage points in challenging corridor environments.

Table 2.

Comparison of accuracy and precision in the practical office environment.

Methods	Accuracy	Precision
Jeong and Lee³⁸	85.32%	97.1%
VGM-Net	92.34%	98.5%

The performance gain stems from VGM-Net's enhanced representation capabilities. Table 3 reveals the underlying mechanisms behind this improvement. The reduction in false negatives from 50 to 10 cases indicates that VGM-Net successfully preserves valid geometric alignments that are lost in RGB-based approaches. This occurs because the Gaussian representation maintains continuous spatial relationships, whereas RGB conversion introduces discretization artifacts that cause valid matches to be incorrectly rejected.

Table 3.

Detailed comparison results in the practical office environment.

Methods	TP	FN	FP	TN
Jeong and Lee³⁸	347	50	10	3
VGM-Net	347	10	5	48

The reduction in false positives from 10 to 5 cases demonstrates VGM-Net's superior discrimination capability. Analysis of these cases reveals that 80% of the baseline method's false positives occurred in repetitive corridor structures where geometric patterns appear similar but are spatially distinct. VGM-Net's probabilistic clustering effectively distinguishes between these ambiguous cases by preserving the continuous distribution of measurement uncertainty.

The false negative reduction is particularly critical for graph SLAM performance. False negatives represent the removal of valid edge connections, which destabilizes front-end connectivity and leads to optimization failure. The VGM-Net's significant reduction in this error type (from 50 to 10) directly translates to more robust pose graph construction and improved mapping accuracy, as demonstrated in the subsequent mapping experiments.

While both methods achieve identical true positive detection (347 cases), VGM-Net's lower false negative rate indicates superior sensitivity to valid geometric alignments in feature-sparse environments. The slight reduction in true negatives reflects VGM-Net's more conservative approach to edge rejection, prioritizing connectivity preservation over aggressive outlier removal.
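To make the sensitivity claim concrete, the sketch below derives the false negative rate and the sensitivity (recall) from the counts in Table 3. It assumes the usual definitions, FNR = FN / (TP + FN) and sensitivity = TP / (TP + FN); the derived percentages are not reported in the paper and are shown only as an illustration.

```python
# Confusion-matrix counts copied from Table 3 (practical office environment).
TABLE_3 = {
    "Jeong and Lee": {"TP": 347, "FN": 50, "FP": 10, "TN": 3},
    "VGM-Net": {"TP": 347, "FN": 10, "FP": 5, "TN": 48},
}


def sensitivity(counts):
    """Share of positive cases that were detected: TP / (TP + FN)."""
    return counts["TP"] / (counts["TP"] + counts["FN"])


def false_negative_rate(counts):
    """Share of positive cases that were missed: FN / (TP + FN)."""
    return counts["FN"] / (counts["TP"] + counts["FN"])


if __name__ == "__main__":
    for method, counts in TABLE_3.items():
        total = sum(counts.values())  # each row adds up to the 410 candidates
        print(f"{method}: n={total}, "
              f"sensitivity={sensitivity(counts):.1%}, "
              f"FNR={false_negative_rate(counts):.1%}")
```

With these counts the sensitivity rises from about 87.4% for the baseline to about 97.2% for VGM-Net, while the number of true positives stays at 347.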

The comparison between the method of Jeong and Lee³⁸ and the proposed VGM-Net on the Intel dataset is shown in Table 4. VGM-Net achieves 80.71% accuracy compared to 68.77% for the baseline method, an improvement of about 12 percentage points, which exceeds the gain observed in the office environment.

Table 4.

Comparison of accuracy and precision in the Intel dataset.

Methods	Accuracy	Precision
Jeong and Lee³⁸	68.77%	77.11%
VGM-Net	80.71%	83.19%

The Intel dataset presents more challenging conditions compared to practical office environments, including longer trajectory sequences, larger loop closures, and higher cumulative odometry drift. These characteristics make scan matching validation particularly difficult, as geometric patterns become increasingly ambiguous over extended sequences.

Table 5 shows the detailed breakdown of the 254 edges, where VGM-Net reduces false positives from 33 to 9 cases. This 73% reduction is particularly significant given the Intel dataset's propensity for false loop closures due to repetitive indoor structures and accumulated drift. The method of Jeong and Lee³⁸ loses critical geometric nuances in these challenging long-term scenarios, while VGM-Net's continuous Gaussian representation preserves essential spatial relationships even under drift conditions.

Table 5.

Detailed comparison results in the Intel dataset.

Methods	TP	FN	FP	TN
Jeong and Lee³⁸	155	46	33	20
VGM-Net	198	40	9	7

The substantial increase in true positive detection from 155 to 198 cases (28% improvement) reflects that VGM-Net effectively identifies valid loop closures in the Intel dataset's complex environment. This dataset contains numerous legitimate loop closure opportunities that are missed by the method of Jeong and Lee³⁸ due to accumulated geometric distortions, but VGM-Net's probabilistic framework successfully captures these valid alignments by modeling the measurement uncertainty inherent in long-term navigation.

The false negative reduction from 46 to 40 cases, while more modest, is crucial for maintaining graph connectivity in large-scale environments. In datasets like Intel with extensive trajectories, each valid edge connection is essential for global consistency, making VGM-Net's conservative but accurate validation approach particularly valuable for robust large-scale SLAM performance.
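The percentages quoted for the Intel dataset follow directly from the counts in Table 5. The sketch below recomputes them, assuming each change is measured relative to the baseline count; the helper name `relative_change` is illustrative.

```python
# Confusion-matrix counts copied from Table 5 (Intel dataset).
BASELINE = {"TP": 155, "FN": 46, "FP": 33, "TN": 20}
VGM_NET = {"TP": 198, "FN": 40, "FP": 9, "TN": 7}


def relative_change(old, new):
    """Signed change of `new` relative to `old`, as a fraction of `old`."""
    return (new - old) / old


if __name__ == "__main__":
    # Both rows cover the same 254 loop closing edge candidates.
    assert sum(BASELINE.values()) == sum(VGM_NET.values()) == 254
    for key in ("FP", "TP", "FN"):
        change = relative_change(BASELINE[key], VGM_NET[key])
        print(f"{key}: {BASELINE[key]} -> {VGM_NET[key]} ({change:+.1%})")
```

The false positive and true positive changes round to the 73% and 28% figures given in the text; the false negative change is about 13%.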

Comparison of robustness in graph SLAM using VGM-Net

To evaluate the performance of graph SLAM using VGM-Net, we conducted scan matching with LiDAR data from the experimental region and compared the results with those of Jeong and Lee.³⁸ We set up 394 nodes and obtained 1,132 edges between the nodes. When VGM-Net was used to validate the scan matching results, 251 were classified as failures, and the corresponding edges were deleted. In contrast, the method of Jeong and Lee³⁸ classified 302 scan matching results as failures and deleted the corresponding edges. Subsequently, the pose graph was optimized using the method described by Grisetti et al.,¹² and the results are presented in Figure 10.

Figure 10.

Edge validation results with scan matching. (a) Edges validated using the approach proposed by Jeong and Lee,³⁸ and (b) Edges validated using VGM-Net.
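The validation step described above amounts to filtering candidate edges before the pose graph is optimized. The following is a minimal sketch of that filtering, not the authors' implementation: the `Edge` structure, the `validity` score standing in for a classifier's sigmoid output, and the 0.5 threshold are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: int         # index of the first node
    dst: int         # index of the second node
    validity: float  # hypothetical classifier output in [0, 1]


def split_edges(edges, threshold=0.5):
    """Separate edges into (kept, rejected) by comparing validity to a threshold."""
    kept = [e for e in edges if e.validity >= threshold]
    rejected = [e for e in edges if e.validity < threshold]
    return kept, rejected


if __name__ == "__main__":
    # Made-up candidates: two odometry-like edges and two loop closure candidates.
    candidates = [
        Edge(0, 1, 0.98),
        Edge(1, 2, 0.95),
        Edge(0, 2, 0.12),  # scored as a failed scan match
        Edge(2, 0, 0.71),
    ]
    kept, rejected = split_edges(candidates)
    print(f"kept {len(kept)} edges, rejected {len(rejected)}")
    # Only `kept` would be handed to the back-end optimizer.
```

Rejecting at this stage means a failed scan match never becomes a constraint in the graph, so the optimizer does not have to compensate for it.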

The map depicted in Figure 11 was generated with the OGM method.⁴¹ Because dynamic obstacles move from frame to frame, they are not registered repeatedly in the same area of the grid map, so the OGM method handles them efficiently and still yields an accurate map. Figure 11 compares the graph SLAM results obtained with the two scan matching validation methods. Figure 11(a) shows the result of graph SLAM using the method of Jeong and Lee,³⁸ while Figure 11(b) shows the result of graph SLAM using the proposed VGM-Net approach.

Figure 11.

Comparison of results of graph SLAM using scan matching validation on our collected LiDAR data. (a) Graph SLAM results obtained by Jeong and Lee³⁸ and (b) Graph SLAM results obtained using VGM-Net.

In this representation, the gray areas represent unknown regions, white areas represent occupied spaces, and black areas represent unoccupied spaces. In comparison to Figure 6, it is evident that Figure 11(a) exhibits a horizontal error of 0.47 m and vertical error of 0.81 m, whereas Figure 11(b) exhibits a horizontal error of 0.6 m and vertical error of 0.23 m.
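The reason transient obstacles fade from an occupancy grid can be illustrated with a log-odds cell update, a common formulation of occupancy grid mapping. This is a sketch under assumed sensor-model probabilities (0.7 for a hit, 0.4 for a beam passing through the cell); it is not the parameterization used in the experiments.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


L_HIT = logit(0.7)   # assumed evidence added when a beam ends in the cell
L_MISS = logit(0.4)  # assumed evidence added when a beam passes through the cell


def update(log_odds, hit):
    """Add the evidence of one observation to a cell's log-odds."""
    return log_odds + (L_HIT if hit else L_MISS)


def probability(log_odds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))


def fuse(observations):
    """Fuse a sequence of hit/miss observations, starting from p = 0.5."""
    log_odds = 0.0
    for hit in observations:
        log_odds = update(log_odds, hit)
    return probability(log_odds)


if __name__ == "__main__":
    wall = fuse([True] * 10)                  # static wall: hit in every frame
    passerby = fuse([True] + [False] * 9)     # moving obstacle: hit once, then free
    print(f"wall cell:     {wall:.3f}")
    print(f"passerby cell: {passerby:.3f}")
```

A cell struck once by a moving obstacle and then observed as free in later frames ends up with a low occupancy probability, while a static wall cell saturates toward 1.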

Figure 10 shows the edge validation experiments with scan matching. Figure 10(a) shows the 830 edges remaining after validation with the approach of Jeong and Lee,³⁸ and Figure 10(b) shows the 881 edges remaining after validation with VGM-Net (1,132 − 302 and 1,132 − 251, respectively). Because the approach of Jeong and Lee³⁸ removed too many edges compared to VGM-Net, the resulting map was unstable after optimization, as shown in Figure 11.

Figure 12 shows the edges that were deleted after validating the scan matching results, together with enlarged views of the graph SLAM results. The validation approach of Jeong and Lee³⁸ deleted too many edges, resulting in a blurred map in (b) compared to that in (d), whereas VGM-Net, with proper validation, yielded a clear map in (d). The red lines in (a) show that 45 of the 118 edges in this area were eliminated, and 45 edges were deleted following validation in (c). Although (a) deleted more edges than (c), it deleted edges with true values as well. In other words, the number of FP in (a) was higher, and as a consequence the back-end could not identify the optimal pose. Using the endpoints of maps (b) and (d) to evaluate map accuracy, (b) was found to have a 0.47 m map error relative to (d).

Figure 12.

Deleted edges and enlarged view of results in the lounge. (a) Edges deleted using the validation approach of Jeong and Lee,³⁸ (b) Enlarged results of graph SLAM obtained using the approach of Jeong and Lee,³⁸ (c) Edges deleted using VGM-Net validation, (d) Enlarged results of graph SLAM using VGM-Net.

Figure 13 compares the graph SLAM results obtained with the two validation systems on the Intel dataset. The blue dots represent vertices and the red lines indicate edges. In the Intel dataset, 1,402 loop edges were created. The method of Jeong and Lee³⁸ rejects 1,268 loop edges, while VGM-Net rejects 1,338 loop edges.

Figure 13.

Comparison of results of graph SLAM using the validation systems on the Intel dataset. (a) Graph SLAM results obtained by Jeong and Lee³⁸ and (b) Graph SLAM results obtained using VGM-Net.

The areas marked with red circles in Figure 13 indicate regions where the proposed VGM-Net method achieves significantly better performance than the baseline method.³⁸ In these highlighted regions, VGM-Net successfully eliminates false loop closures that were incorrectly accepted by the conventional approach, resulting in a more consistent and accurate map structure.

Figure 14 shows enlarged views of the circled regions from Figure 13, providing a detailed comparison of rejection performance in corridor environments. The conventional method³⁸ fails to properly reject false edges, leading to map distortion and structural inconsistencies as shown in Figure 14(a). In contrast, VGM-Net demonstrates accurate rejection of false edges, preventing map distortion and maintaining geometric consistency as illustrated in Figure 14(b).

Figure 14.

Comparison of rejected edges using each validation system. (a) Validation system of Jeong and Lee,³⁸ (b) VGM-Net validation system.

By rejecting false edges more effectively, VGM-Net improves back-end performance compared with the conventional method.³⁸ When the error is calculated from the wall thickness, Figure 14(a) shows a horizontal error of 1.01 m, while Figure 14(b) shows 0.48 m, demonstrating that our method produces a more accurate map with less distortion.

In this paper, we carried out SLAM experiments in several environments, including typical office environments and the Intel dataset, to verify the proposed method. Unlike methods that apply a robust kernel during back-end optimization,⁴⁴ we detect not only false positive loop closure constraints²⁶ but also false positive edges across all node connections. To train VGM-Net, we created the VGM from the Gaussian model and used it as the input data. The proposed VGM preserves the tendency of 2D LiDAR points to cluster on nearby objects through the hit ratio and mean values stored in the grid. VGM-Net rejects false edges more accurately than the baseline, and a higher rejection accuracy for false edges leads to more robust mapping.

While this study focuses on direct comparison with method³⁸ as the most methodologically relevant baseline, it is essential to contextualize the validation approach within the broader SLAM outlier detection landscape. Alternative robust optimization techniques such as switchable constraints,²⁶ max-mixture models,²⁷ and robust backend kernels⁴⁴ operate during pose graph optimization rather than performing front-end validation, representing fundamentally different validation philosophies that make direct performance comparison methodologically inappropriate. These backend approaches assume that outlier constraints have already entered the pose graph and focus on minimizing their influence during optimization, whereas VGM-Net performs preemptive validation to prevent erroneous edges from corrupting the graph structure. The experimental validation across benchmark and real-world datasets, coupled with detailed performance analysis demonstrating superior false positive rejection, establishes the effectiveness of probabilistic front-end validation for robust graph SLAM in challenging indoor environments.
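The contrast between the two philosophies can be sketched numerically. A robust kernel keeps every edge in the graph and shrinks the influence of large residuals, while front-end validation removes a rejected edge before optimization starts. The sketch uses the Huber kernel as a generic example of a robust kernel (it is not the adaptive kernel of reference 44), and the residual values, the threshold `delta`, and the validity flags are made up for illustration.

```python
def huber_weight(residual, delta=1.0):
    """Reweighting factor of the Huber kernel: 1 inside delta, delta / |r| outside."""
    r = abs(residual)
    return 1.0 if r <= delta else delta / r


def frontend_weight(is_valid):
    """Front-end validation is a hard decision: keep the edge or drop it."""
    return 1.0 if is_valid else 0.0


if __name__ == "__main__":
    # (residual magnitude, hypothetical front-end verdict) for three edges.
    edges = [(0.2, True), (0.8, True), (5.0, False)]
    for residual, is_valid in edges:
        print(f"residual={residual:4.1f}  "
              f"huber={huber_weight(residual):.2f}  "
              f"front-end={frontend_weight(is_valid):.1f}")
```

In this toy case the outlier still contributes with weight 0.2 under the Huber kernel, whereas the front-end decision removes it entirely. The two mechanisms act at different stages and are not mutually exclusive.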

Conclusion

This study proposed VGM-Net to enhance the robustness of graph SLAM through accurate validation of scan matching results using a novel Gaussian-based grid representation of 2D LiDAR data. The VGM effectively preserves spatial clustering characteristics and geometric features without distortion, enabling VGM-Net to distinguish between valid and erroneous scan matches by focusing on high-value grid cells. Experimental results on both the Intel dataset and real indoor office environments demonstrate that the proposed scheme successfully rejects incorrect edges, leading to improved pose graph convergence and more accurate mapping.

However, the current approach has limitations that warrant consideration. The reliance on LiDAR data alone poses challenges in degenerate environments where geometric features are extremely sparse or repetitive, such as long corridors or symmetric structures, where the scan matching stage may produce false positive results that appear geometrically plausible. The method also faces constraints in large-scale environments, where accumulated drift degrades performance, and in dynamic scenarios, where moving objects compromise validation accuracy. Computational costs scale with grid resolution, potentially limiting deployment on resource-constrained platforms.

Future research will focus on developing a multimodal validation system that addresses these limitations by incorporating camera-LiDAR fusion to provide complementary geometric and visual constraints that effectively identify and reject false matches in challenging scenarios, while hierarchical representations and temporal consistency mechanisms will enhance scalability and robustness in large-scale dynamic environments.

Footnotes

ORCID iDs

Jun Hyeong Jo

Chang-bae Moon

Ethical statement

This research is not related to medical research.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

All data relevant to this study are available upon request. Please contact the corresponding author to access the data.

References

1. Liang, Yang, Tan, et al. Enhancing high-speed cruising performance of autonomous vehicles through integrated deep reinforcement learning framework. IEEE Trans Intell Transport Syst 2025; 26: 835–848.

2. Kim, Park, Kim. 1-day learning, 1-year localization: long-term LiDAR localization using scan context image. IEEE Robot Autom Lett 2019; 4: 1948–1955.

3. Wang, Zhang. M2DP: a novel 3D point cloud descriptor and its application in loop closure detection. In: 2016 IEEE/RSJ international conference on intelligent robots and systems (IROS). Daejeon, South Korea: IEEE, 2016, pp.231–237.

4. Chen, Dong, Peng, et al. Continuous occupancy mapping in dynamic environments using particles. IEEE Trans Robot 2024; 40: 64–84.

5. Kim Y-J, Moon. Human target tracking using a 3D laser range finder based on SJPDAF by filtering the laser scanned point clouds. Int J Control Autom Syst 2020; 18: 1561–1571.

6. Murphy, Russell. Rao-Blackwellised particle filtering for dynamic Bayesian networks. In: Doucet, Freitas, Gordon (eds) Sequential Monte Carlo methods in practice. New York, NY: Springer New York, 2001, pp.499–515.

7. Montemerlo, Thrun, Koller, et al. FastSLAM: a factored solution to the simultaneous localization and mapping problem. In: Eighteenth national conference on artificial intelligence. USA: American Association for Artificial Intelligence, 2002, pp.593–598.

8. Smith, Self, Cheeseman. Estimating uncertain spatial relationships in robotics. In: Proceedings. 1987 IEEE international conference on robotics and automation. Raleigh, NC, USA: Institute of Electrical and Electronics Engineers, 1987, pp.850–850.

9. Leonard, Durrant-Whyte, Cox. Dynamic map building for autonomous mobile robot. In: IEEE International workshop on intelligent robots and systems, towards a new frontier of applications. Ibaraki, Japan: IEEE, 1990, pp.89–96.

10. Murphy. Bayesian map learning in dynamic environments. In: Proceedings of the 13th international conference on neural information processing systems, in NIPS’99. Cambridge, MA, USA: MIT Press, 1999, pp.1015–1021.

11. Rui, Feng. FAST-LiDAR-SLAM: a robust and real-time factor graph for urban scenarios with unstable GPS signals. IEEE Trans Intell Transport Syst 2024; 25: 20043–20058.

12. Grisetti, Kümmerle, Stachniss, et al. A tutorial on graph-based SLAM. IEEE Intell Transp Syst Mag 2010; 2: 31–43.

13. Doherty, Rosen, Leonard. Performance guarantees for spectral initialization in rotation averaging and pose-graph SLAM. In: 2022 International conference on robotics and automation (ICRA). Philadelphia, PA, USA: IEEE, 2022, pp.5608–5614.

14. Zhou, Liu, et al. Efficient 2D graph SLAM for sparse sensing. In: 2022 IEEE/RSJ international conference on intelligent robots and systems (IROS). Kyoto, Japan: IEEE, 2022, pp.6404–6411.

15. Shan, Englot, Meyers, et al. LIO-SAM: tightly-coupled lidar inertial odometry via smoothing and mapping. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020: 5135–5142.

16. Shan, Englot, Ratti, et al. LVI-SAM: tightly-coupled lidar-visual-inertial odometry via smoothing and mapping. In: 2021 IEEE international conference on robotics and automation (ICRA). Xi’an, China: IEEE, 2021, pp.5692–5698.

17. Biber, Strasser. The normal distributions transform: a new approach to laser scan matching. In: Proceedings 2003 IEEE/RSJ international conference on intelligent robots and systems (IROS 2003) (cat. No.03CH37453). Las Vegas, Nevada, USA: IEEE, 2003, pp.2743–2748.

18. Kung P-C, Wang C-C, Lin W-C. A normal distribution transform-based radar odometry designed for scanning and automotive radars. In: 2021 IEEE international conference on robotics and automation (ICRA). Xi’an, China: IEEE, 2021, pp.14417–14423.

19. Besl, McKay. A method for registration of 3-D shapes. IEEE Trans Pattern Anal Mach Intell 1992; 14: 239–256.

20. Censi. An ICP variant using a point-to-line metric. In: 2008 IEEE International Conference on Robotics and Automation 2008: 19–25.

21. Pavlov, Wv. Ovchinnikov, Yu. Derbyshev, et al. AA-ICP: iterative closest point with Anderson acceleration. In: 2018 IEEE international conference on robotics and automation (ICRA). Brisbane, QLD: IEEE, 2018, pp.3407–3412.

22. Moon. Development of a practical ICP outlier rejection scheme for graph-based SLAM using a laser range finder. Int J Precis Eng Manuf 2019; 20: 1735–1745.

23. Segal, Haehnel, Thrun. Generalized-ICP. Seattle, WA: Robotics: Science and Systems Foundation, 2009, 435.

24. Chetverikov, Svirko, Stepanov, et al. The trimmed iterative closest point algorithm. In: Object recognition supported by user interaction for service robots. Quebec City, Canada: IEEE Computer Society, 2002, pp.545–548.

25. Vizzo, Guadagnino, Mersch, et al. KISS-ICP: in defense of point-to-point ICP – simple, accurate, and robust registration if done the right way. IEEE Robot Autom Lett 2023; 8: 1029–1036.

26. Sünderhauf, Protzel. Switchable constraints for robust pose graph SLAM. 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems 2012: 1879–1884.

27. Olson, Agarwal. Inference on networks of mixtures for robust robot mapping. Int J Rob Res 2013; 32: 826–840.

28. Xie, Wang, Markham, et al. Graphtinker: outlier rejection and inlier injection for pose graph SLAM. In: 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS). Vancouver, BC: IEEE, 2017, pp.6777–6784.

29. Yang, Zhu, Nian, et al. A robust pose graph approach for city scale LiDAR mapping. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2018: 1175–1182.

30. Raza, Awais Rehman, Qureshi, et al. A simplified approach to visual odometry using graph cut RANSAC. In: 2018 5th international multi-topic ICT conference (IMTIC). Jamshoro: IEEE, 2018, pp.1–7.

31. Censi. An accurate closed-form estimate of ICP’s covariance. In: Proceedings 2007 IEEE International Conference on Robotics and Automation 2007: 3167–3172.

32. Wang, et al. Evaluation of the ICP algorithm in 3D point cloud registration. IEEE Access 2020; 8: 68030–68048.

33. Zou, Wan, et al. A survey of LiDAR-based 3D SLAM in indoor degraded scenarios. In: 2024 6th international conference on frontier technologies of information and computer (ICFTIC). Qingdao, China: IEEE, 2024, pp.856–861.

34. Valente, Joly, De La Fortelle. An LSTM network for real-time odometry estimation. In: 2019 IEEE intelligent vehicles symposium (IV). Paris, France: IEEE, 2019, pp.1434–1440.

35. Spampinato, Bruna, Guarneri, et al. Deep learning localization with 2D range scanner. In: 2021 7th international conference on automation, robotics and applications (ICARA). Prague, Czech Republic: IEEE, 2021, pp.206–210.

36. Zhan, Chen, et al. Deep learning for 2D scan matching and loop closure. In: 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS). Vancouver, BC: IEEE, 2017, pp.763–768.

37. Zhang, Ren, et al. Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016, pp.770–778.

38. Jeong, Lee. CNN-based fault detection of scan matching for accurate SLAM in dynamic environments. Sensors 2023; 23: 2940.

39. Memon, Wang, Hussain. Loop closure detection using supervised and unsupervised deep neural networks for monocular SLAM systems. Rob Auton Syst 2020; 126: 103470.

40. Liu, Zhang, Yan, et al. RGB-D camera and graph neural network-based SLAM for dynamic and low-texture environments. Sci Rep 2025; 15: 28214.

41. Moravec, Elfes. High resolution maps from wide angle sonar. In: Proceedings. 1985 IEEE international conference on robotics and automation. St. Louis, MO, USA: Institute of Electrical and Electronics Engineers, 1985, pp.116–121.

42. Pre-2014 Robotics 2D-Laser Datasets. Accessed 06 November 2025. [Online]. Available: https://www.ipb.uni-bonn.de/datasets/index.html.

43. Wang Y-T, Peng C-C, Ravankar, et al. A single LiDAR-based feature fusion indoor localization algorithm. Sensors 2018; 18: 1294.

44. Chebrolu, Labe, Vysotska, et al. Adaptive robust kernels for non-linear least squares problems. IEEE Robot Autom Lett 2021; 6: 2240–2247.

45. Hahnel, Burgard, Fox, et al. An efficient FastSLAM algorithm for generating maps of large-scale cyclic environments from raw laser range measurements. In: Proceedings 2003 IEEE/RSJ international conference on intelligent robots and systems (IROS 2003) (cat. No.03CH37453). Las Vegas, Nevada, USA: IEEE, 2003, pp.206–211.

Scan matching validation scheme based on the neural network using Gaussian model input to improve robustness of graph SLAM
