Abstract
Simultaneous Localization and Mapping (SLAM) systems typically rely on a prior map constructed during an initial deployment. In real-world environments, however, structural and semantic changes gradually invalidate this map, leading to degraded localization accuracy and, in severe cases, localization failure. This limitation poses a major obstacle to the long-term deployment of mobile robots in dynamic environments. This paper proposes a lifelong mapping framework with multi-view projection fusion (LLMF) that enables efficient map maintenance while preserving a consistent global coordinate system. The framework introduces two key design components. First, a multi-view point cloud projection alignment strategy based on Bird’s-Eye View (BEV) and frontal view (FV) projections is employed to align point cloud maps acquired at different times without re-labeling previously defined operational points. Second, an image-based change detection and map update mechanism is developed, transforming computationally expensive 3D point cloud comparisons into efficient 2D image processing operations. The proposed framework is evaluated through qualitative experiments on the open-source MulRan dataset and quantitative long-term experiments conducted over more than nine months in a real farm environment. Experimental results demonstrate that LLMF maintains localization accuracy while significantly reducing the computational cost of change detection, lowering processing time from several hours to a few minutes. These results indicate that the proposed framework provides a practical and scalable engineering solution for long-term robot operation in changing environments.
Keywords
Introduction
Simultaneous Localization and Mapping (SLAM) is a core technology for autonomous mobile robots, enabling them to localize and navigate in unknown environments while incrementally constructing a spatial representation of their surroundings. SLAM techniques have been widely adopted in robotic autonomy and navigation systems, including multi-robot coordination and inspection tasks.1–3 In recent years, LiDAR–inertial odometry systems such as Fast-LIO 4 and Faster-LIO 5 have significantly improved the accuracy and robustness of real-time pose estimation, while publicly available datasets such as FusionPortable facilitate the development and evaluation of generalized SLAM frameworks. 6
With the continuous development of optimization and estimation methods, including metaheuristic algorithms such as the spiral dynamics algorithm and the harmony search family,7–9 modern SLAM systems have achieved increasingly accurate pose estimation and 3D mapping capabilities. These advances support a wide range of engineering applications such as inspection, monitoring, and autonomous operation.
In practical deployments, a common industrial paradigm is to treat the 3D map constructed during an initial operation as a fixed reference for subsequent tasks. Recent research has explored robust map alignment and lifelong mapping strategies to maintain localization accuracy over extended deployments, including open-set object map alignment approaches, 10 lifelong LiDAR–IMU SLAM frameworks with autonomous map updating, 11 and multi-session LiDAR mapping systems designed for long-term operation. 12
This paradigm is particularly prevalent in Light Detection and Ranging (LiDAR) robotic systems, where three-dimensional sensing enables accurate perception and mapping of the surrounding environment. Public datasets such as SemanticKITTI have facilitated the development and evaluation of LiDAR-based perception and mapping methods. 13 In such systems, localization is commonly performed using scan-to-map matching, where real-time LiDAR scans are aligned with a previously constructed global map to estimate the robot pose in three-dimensional space. LiDAR-based perception and mapping techniques have been widely applied in infrastructure inspection, object detection, and scene understanding tasks,14–16 while SLAM and odometry frameworks such as tracked robot odometry, active SLAM strategies, and gravity-constrained SLAM systems provide robust localization capabilities for mobile robots.17–19 While effective in static or slowly changing environments, this approach exhibits fundamental limitations when deployed over long time horizons in real-world settings.
Real environments are inherently dynamic. Object relocation, structural modifications, and seasonal or operational changes progressively invalidate the prior map, leading to increasing discrepancies between the reference map and the actual environment.20–22 As these discrepancies accumulate, localization accuracy degrades and may ultimately result in localization failure. A straightforward response is to reconstruct the map; however, this solution introduces a critical engineering drawback: the newly reconstructed map is generally inconsistent with the original coordinate system. As a consequence, previously defined semantic annotations, inspection points, and operational locations must be re-labeled, resulting in substantial time overhead and additional economic cost.23,24 These limitations significantly constrain the practical value of mobile robots in long-term and large-scale deployments.25–27 Addressing this challenge has given rise to the research problem commonly referred to as lifelong mapping, defined as the continuous maintenance and adaptive updating of maps over extended operational periods in dynamic environments.
In this work, we propose a lifelong mapping framework with multi-view projection fusion, termed LLMF, aimed at enabling efficient map maintenance while preserving a consistent global coordinate system. Rather than relying on direct and computationally expensive 3D point cloud comparisons, the proposed framework adopts a global-to-local strategy in which point clouds acquired at different times are projected into complementary two-dimensional views. Specifically, Bird’s-Eye View (BEV) and frontal view (FV) projections are jointly exploited to align point cloud maps across different operational periods without altering the original map coordinates. Environmental changes are then identified through image-based change detection, allowing map updates to be performed efficiently while avoiding the re-labeling of existing operational points.
The main contributions of this work are summarized as follows: (1) A lifelong mapping framework is developed by extending traditional SLAM with a dedicated map maintenance mechanism that preserves global coordinate consistency across long-term deployments. (2) A multi-view point cloud projection and alignment strategy is introduced, combining BEV and FV projections with image-based change detection to transform computationally intensive 3D operations into efficient 2D image processing. (3) The proposed framework is validated through qualitative experiments on the open-source MulRan dataset 28 and long-term quantitative evaluations conducted over nine months in a real farm environment, comprising 32 operational runs, demonstrating both localization robustness and computational efficiency.
Related work
Long-term localization and mapping in changing environments has attracted significant attention from both industry and academia. Existing approaches can be broadly categorized according to their map representation, update strategy, and change detection mechanism.
Early industrial solutions have primarily focused on sub-map management and replacement strategies. For example, Gaussian Robotics proposed a long-term localization and mapping method for 2D environments in which the global map is decomposed into multiple sub-maps; when environmental changes are detected, outdated sub-maps are replaced while attempting to preserve associated information. 29 While effective in structured 2D settings, such approaches are difficult to extend to large-scale 3D environments.
In LiDAR-based 3D mapping, Kim et al. introduced Long-term Mapper (LT-Mapper). 30 This method first eliminates highly dynamic objects in the environment, 31 then employs Scan Context descriptors 32 to achieve trajectory alignment across multiple sessions, and finally performs point-wise change detection through kd-tree-based set difference operations. 33
Baidu Apollo proposed a joint vehicle localization framework 34 that combines LiDAR–inertial odometry with global matching and detects environmental changes based on grid-level statistics such as point density and height variance. Walcott et al. developed Dynamic Pose Graph SLAM (DPG-SLAM), 26 incorporating environmental dynamics directly into the pose graph formulation by selectively removing inactive scans and adding new observations. Other studies have explored point cloud registration and change detection by exploiting density variations in urban scenes 35 or by leveraging prior Building Information Modeling (BIM) data to align and update multi-session maps. 23 Lifelong Localization (LiLoc) 36 retains factor graph information from the prior map and distributes historical constraints during subsequent operations to support lifelong localization.
Compared with existing 3D lifelong localization and mapping methods, the proposed LLMF differs in its core design. The key differences and technical advantages are as follows: (1) LT-Mapper adopts a combined strategy of “LiDAR keyframe matching + kd-tree 3D point cloud differencing.” Its change detection performance depends entirely on high-precision localization and matching results, and accumulated matching errors are prone to causing map ghosting (i.e., the ghosting effect shown in Figure 1). Moreover, its local update mechanism undermines the consistency of the global coordinate system, invalidating the original semantic annotations; (2) BIM-SLAM assists multi-session map alignment through pre-defined Building Information Modeling (BIM). It not only cannot adapt to dynamic scenarios without BIM support, such as farmland and temporary work areas, but also relies heavily on offline models for computation, which severely limits its real-time applicability in engineering scenarios; (3) LiLoc focuses solely on localization consistency and achieves short-term high-precision localization by retaining historical constraints through factor graphs. However, it lacks an active map update mechanism, so it cannot cope with the long-term evolution of environmental structures and struggles to support continuous robot operation in dynamic scenarios. In contrast, LLMF adopts a BEV+FV multi-view projection fusion strategy that transforms computationally intensive 3D point cloud processing into efficient 2D image processing, relieving the efficiency bottleneck of direct 3D point cloud processing. Meanwhile, by enforcing a fixed global coordinate constraint, it avoids the coordinate drift and consistency issues caused by local updates.
Throughout the process, there is no need for re-annotating semantic information or relying on pre-defined models, making it more in line with the core requirements of practical engineering deployment for efficiency, stability, and versatility. The comparison of key characteristics of related methods is shown in Table 1.

In the farm environment, the LT-Mapper algorithm exhibited a map ghosting issue (the three red boxes in the figure indicate areas where map ghosting occurred after the map was updated).
Comparison of lifelong localization and mapping methods.
Despite these advances, two fundamental challenges remain in lifelong mapping for 3D robotic systems. First, accurately aligning the coordinate systems of point cloud maps acquired at different times remains non-trivial, particularly when large-scale environmental changes occur. Second, efficient and reliable change detection over large-scale 3D point clouds is computationally demanding. Most existing methods adopt a local-to-global update strategy, incrementally updating the global map through frame-level alignment or local region comparison. This strategy is sensitive to local matching errors and may lead to accumulated drift or inconsistencies in the global coordinate system. In practice, such errors can cause shifts in previously labeled operational points, as illustrated by the ghosting phenomenon observed in LT-Mapper updates (Figure 1).
An alternative strategy is to perform global point cloud comparison to avoid local alignment errors; however, direct 3D matching methods, such as Iterative Closest Point (ICP), 37 incur prohibitively high computational costs when applied to large-scale maps. Although grid-based partitioning and multi-threading techniques have been proposed to accelerate change detection,38,39 their computational demands remain incompatible with long-term, resource-constrained robotic operation.
To alleviate these limitations, several studies have explored the use of image-based alignment and change detection techniques. Classical feature-based methods, such as Scale-Invariant Feature Transform (SIFT), 40 Speeded Up Robust Features (SURF),41,42 and Oriented FAST and Rotated BRIEF (ORB), 43 have demonstrated robust performance in image matching and are widely adopted in visual SLAM systems.44,45 More recently, deep learning-based change detection models have achieved strong performance in remote sensing applications by identifying differences between temporally separated images, including methods based on multi-teacher knowledge distillation, 46 bi-temporal adapter networks, 47 and general feature interaction architectures. 48
Motivated by these developments, this work leverages image-based alignment and change detection techniques to address the computational and robustness limitations of 3D point cloud processing, extending their application to lifelong map maintenance in large-scale robotic environments.
This section presents the proposed lifelong mapping framework with multi-view projection fusion (LLMF). The overall architecture of the framework is illustrated in Figure 2. LLMF is designed as a modular system composed of four tightly integrated components: (i) a global localization and mapping module, (ii) a multi-view point cloud projection alignment module, (iii) an image-based point cloud change detection module, and (iv) a map update module. In the proposed framework, the global map constructed during the robot’s initial operation is treated as a persistent reference map for subsequent long-term deployment. During each new operation, the robot performs global localization and mapping using this prior map as a reference. Upon completion of the task, both the historical map and the newly constructed map are jointly processed by the multi-view projection module, where 3D point clouds are projected into complementary two-dimensional representations. These projected images are then passed to the change detection module, which identifies regions corresponding to environmental changes through image comparison. Based on the resulting change maps, the map update module selectively removes obsolete point cloud data and integrates newly observed structures into the prior map. Crucially, this update process preserves the original global coordinate system, ensuring that existing semantic annotations, inspection points, and operational locations remain valid. Through this pipeline, LLMF enables efficient and consistent long-term map maintenance in dynamic environments.
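To make the projection step concrete, the following is a minimal sketch of a BEV rasterization. It assumes the map is an (N, 3) NumPy array and uses a hypothetical cell size `resolution`; the function name is illustrative. The sketch keeps the maximum normalized height per cell as a single-channel image and omits the color encoding used by the framework.

```python
import numpy as np

def bev_height_image(points, resolution):
    """Rasterize an (N, 3) point cloud onto the x-y plane.

    Returns a 2D array holding the maximum normalized height per cell,
    plus the grid origin (minimum corner of the bounding box).
    Empty cells stay at 0, the same value as the lowest point.
    """
    mins = points.min(axis=0)
    maxs = points.max(axis=0)

    # Image size follows from the bounding-box extent and the cell size.
    width = int(np.floor((maxs[0] - mins[0]) / resolution)) + 1
    height = int(np.floor((maxs[1] - mins[1]) / resolution)) + 1

    # Cell index of every point (column from x, row from y).
    cols = np.floor((points[:, 0] - mins[0]) / resolution).astype(int)
    rows = np.floor((points[:, 1] - mins[1]) / resolution).astype(int)

    # Normalize heights to [0, 1]; guard against a perfectly flat cloud.
    z_span = max(maxs[2] - mins[2], 1e-9)
    z_norm = (points[:, 2] - mins[2]) / z_span

    image = np.zeros((height, width))
    # Unbuffered maximum so several points in one cell are all considered.
    np.maximum.at(image, (rows, cols), z_norm)
    return image, mins
```

A frontal view could be produced by the same function after permuting the point columns so that a horizontal axis and the height axis span the image; which plane the framework actually uses is not assumed here.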

The system framework of LLMF.
Accurate estimation of the robot’s global pose during motion is a prerequisite for reliable localization and task execution in large-scale environments.49,50 Conventional global localization approaches typically rely on direct frame-to-map matching against a prior global map. In dynamic environments, however, this strategy becomes increasingly fragile, as discrepancies between the current scene and the reference map can lead to degraded accuracy or localization failure. In this framework, global localization and mapping are implemented using a robust hybrid strategy consistent with established LiDAR-based SLAM systems,19,51 which combines multiple complementary estimation stages. First, scan-to-scan matching between consecutive LiDAR frames is performed to estimate the local motion of the robot. Second, scan-to-map matching is used to anchor the local estimates to the global reference map and recover the global pose. Finally, factor graph optimization integrates these constraints to produce a globally consistent and accurate pose estimate. Based on the optimized trajectory, a point cloud map of the current environment is incrementally constructed. This newly generated map serves as the input for the subsequent multi-view projection, change detection, and map update processes described in the following subsections.
Point cloud multi-view projection alignment module
Let the global 3D point cloud map be denoted as the set
To establish a mapping between the 3D point cloud and a 2D image representation, a 3D bounding box is first defined using these extrema. The width
Given a grid resolution
To improve robustness against variations in scene scale, height-related features are normalized to the interval
Similarly, let
These normalized features are encoded into a color image using the HSV color space. Specifically, the hue channel is determined by
Using this formulation, Bird’s-Eye View (BEV) projections are generated by projecting onto the
Let the reference projection image and the current projection image be denoted by
Eliminating the scale factor yields the non-homogeneous form
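For reference, the standard projective relation between two images has the form below. The symbols are assumptions introduced for illustration and may differ from the paper's notation: $\mathbf{H}$ with entries $h_{ij}$ is the homography, $(u, v)$ is a pixel in the current projection image, $(u', v')$ is its counterpart in the reference projection image, and $s$ is the scale factor.

```latex
s \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix}
= \begin{bmatrix}
h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & h_{33}
\end{bmatrix}
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix},
\qquad
u' = \frac{h_{11}u + h_{12}v + h_{13}}{h_{31}u + h_{32}v + h_{33}},
\quad
v' = \frac{h_{21}u + h_{22}v + h_{23}}{h_{31}u + h_{32}v + h_{33}}.
```

Because $\mathbf{H}$ is defined only up to scale, it has eight degrees of freedom. Each correspondence contributes two equations, so at least four correspondences (no three collinear) are needed to determine it.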
ORB features are extracted from
To ensure no regional bias in sampling and the statistical consistency of RANSAC iterations, Algorithm 1 adopts an unbiased random sampling without replacement strategy from the set of valid correspondences filtered through ORB feature matching and NNDR thresholding. Considering that homography estimation (a
By applying this procedure independently to BEV and FV projections, two homography matrices,
In this section, we directly adopt the Multi-Teacher Knowledge Distillation (MTKD) framework for change detection proposed by Liu et al., 46 and conduct model training using our self-collected dataset instead of the original datasets in the reference paper. Specifically, our dataset comprises 47,901 image pairs, which are derived from cropping the Bird’s-Eye View (BEV) images, with a unified input resolution of 512
The proposed framework employs a lightweight convolutional neural network
The network is trained using a weighted binary cross-entropy loss to address class imbalance:
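One common form of a class-weighted binary cross-entropy is sketched below as an illustration. The weights `w_pos` and `w_neg`, the clipping constant `eps`, and the function name are hypothetical; the exact weighting scheme used in training may differ.

```python
import numpy as np

def weighted_bce(prob, target, w_pos, w_neg=1.0, eps=1e-7):
    """Mean class-weighted binary cross-entropy.

    prob   -- predicted change probabilities in [0, 1]
    target -- ground-truth labels (1 = changed, 0 = unchanged)
    w_pos  -- weight on the (typically rare) changed class
    w_neg  -- weight on the unchanged class
    """
    prob = np.asarray(prob, dtype=float)
    target = np.asarray(target, dtype=float)
    # Clip to avoid log(0) for saturated predictions.
    p = np.clip(prob, eps, 1.0 - eps)
    loss = -(w_pos * target * np.log(p)
             + w_neg * (1.0 - target) * np.log(1.0 - p))
    return float(loss.mean())
```

Setting `w_pos` above `w_neg` penalizes missed change pixels more heavily, which is the usual motivation when changed regions cover only a small fraction of each image.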
During inference, the continuous probability map
To quantify the degree of scene change and guide network training, we adopt a change detection model based on Multi-Teacher Knowledge Distillation (MTKD) and define the Change Area Ratio (CAR) as
Based on
Given the binary change map
For each grid cell
Conversely, a 3D point
Using this mapping, the binary change map
Similarly, points from the current map that correspond to changed regions and should be added to the global map are defined as
The updated point cloud map is then obtained via set operations:
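As a minimal sketch of such an update, the code below removes prior-map points that fall in cells marked as disappeared and appends current-map points that fall in cells marked as newly added. It assumes two boolean BEV masks (`removed_mask`, `added_mask`), a current map already expressed in the prior map's frame, and a grid defined by a hypothetical `origin` and `resolution`. Splitting the change map into two masks, and all names used, are assumptions made here for clarity.

```python
import numpy as np

def select_by_mask(points, mask, origin, resolution):
    """Return one boolean per point: True if its BEV cell is set in `mask`."""
    cols = np.floor((points[:, 0] - origin[0]) / resolution).astype(int)
    rows = np.floor((points[:, 1] - origin[1]) / resolution).astype(int)
    # Points outside the image can never be flagged.
    inside = ((rows >= 0) & (rows < mask.shape[0])
              & (cols >= 0) & (cols < mask.shape[1]))
    flagged = np.zeros(len(points), dtype=bool)
    flagged[inside] = mask[rows[inside], cols[inside]]
    return flagged

def update_map(prior, current, removed_mask, added_mask, origin, resolution):
    """(prior minus obsolete points) union (new points from current)."""
    drop = select_by_mask(prior, removed_mask, origin, resolution)
    add = select_by_mask(current, added_mask, origin, resolution)
    # Coordinates are copied unchanged, so the prior frame is preserved.
    return np.vstack([prior[~drop], current[add]])
```

Because points are only deleted or appended and never transformed, coordinates of locations annotated in the prior map are untouched by the update.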
For bookkeeping and downstream processing, each point involved in the update is assigned a label:
A key constraint of the proposed update mechanism is that the global coordinate system remains unchanged throughout the process:
The computational complexity of the mapping update procedure is
To evaluate the effectiveness of the proposed lifelong mapping framework with multi-view projection fusion (LLMF), a comprehensive experimental study was conducted from three complementary perspectives. All experiments were performed on a workstation equipped with an Intel Core i7-6600U CPU and 16 GB RAM.
Two datasets were used for evaluation: the open-source MulRan dataset and a self-collected long-term farm dataset. The farm dataset was acquired using the inspection robot shown in Figure 3. The robot is equipped with a 16-line Robosense LiDAR and a 6-axis inertial measurement unit (IMU). Data collection was carried out over a period of nine months, resulting in 32 operational runs, with an average frequency of approximately one run per week.

The inspection robot used in the experiments, equipped with a 16-line Robosense LiDAR and a 6-axis IMU.
The experimental evaluation focuses on the following three aspects: (1) qualitative assessment of multi-view projection-based coordinate system alignment, to verify the effectiveness of global map unification across different operational periods; (2) quantitative analysis of computational performance, including change detection runtime, memory consumption, and CPU usage; and (3) long-term evaluation of localization performance in real-world deployment scenarios.
To qualitatively verify the effectiveness of the proposed projection-based map alignment strategy, two representative inspection locations were selected from the point cloud maps obtained during different runs of both the MulRan and the farm datasets. For each dataset, the spatial coordinates of two inspection point vectors, denoted as
Specifically, for the MulRan dataset,
The qualitative alignment results are illustrated in Figure 4. Figure 4(A) to (C) correspond to the MulRan dataset, showing the point cloud maps before and after alignment and the comparison of inspection point vectors. Figure 4(D) to (F) present the corresponding results for the farm dataset. Quantitative results, including coordinate deviations and projection alignment time, are reported in Table 2. Together, these results demonstrate that the proposed alignment strategy achieves accurate global coordinate unification while maintaining low computational overhead, which is essential for long-term map maintenance.

Qualitative results of coordinate system alignment. (A,B) and (D,E) show point cloud maps of the MulRan and farm datasets before and after alignment, respectively. (C) and (F) illustrate the spatial correspondence of inspection points after alignment.
Map alignment based on multi-view projection.
This subsection presents a quantitative evaluation of the proposed image-based change detection module, with a particular focus on computational efficiency. Experiments were conducted on both the long-term farm dataset and the open-source MulRan dataset.
For the farm dataset, the prior point cloud map
Table 3 reports a comparative evaluation of three change detection approaches: a point cloud matching-based method, 37 a grid division-based method, 38 and the proposed method. The comparison considers three key performance indicators that are critical for long-term robotic deployment: computation time, system memory usage, and CPU utilization. All methods were evaluated on both the MulRan dataset and the farm dataset under identical hardware conditions. As shown in Table 3, the proposed change detection approach consistently achieves substantially lower computation time compared to the point cloud matching-based and grid division-based methods, while also reducing memory consumption and CPU usage. These results demonstrate that transforming 3D point cloud change detection into 2D image-based processing leads to significant computational efficiency gains, making the proposed approach well suited for long-term map maintenance in large-scale and dynamic environments.
Comparison table of computation time, memory usage, and CPU usage for change detection algorithms.
To evaluate the practical value of the proposed framework in real robotic engineering scenarios, a systematic long-term localization evaluation was conducted using the self-collected farm dataset.
First, the complete map update and maintenance process of the proposed framework is illustrated qualitatively in Figures 5 and 6, providing an intuitive validation of its feasibility and effectiveness in dynamic environments. Figure 5 presents the front-view (FV) projection and alignment results, demonstrating the consistency of vertical structural alignment across different operational periods. Figure 6 visualizes the full Bird’s-Eye View (BEV)-based pipeline, including projection alignment, change detection, and map update.

Front-view (FV) projection and alignment results for point cloud maps acquired at different operational periods.

Bird’s-Eye View (BEV)-based map maintenance pipeline. The figure illustrates projection alignment, change detection, image-to-point cloud mapping, and the resulting updated map, verifying the feasibility of the proposed scheme in dynamic environments.
Specifically, in Figure 6, panels (A) and (B) show the BEV projection images of the point cloud maps acquired at different times. Panel (C) illustrates ORB feature extraction and homography estimation between the two projections. Panel (D) presents the change detection results obtained using the proposed image-based change detection network, where red regions indicate structures that have disappeared in the current environment and green regions correspond to newly added structures. Based on these results, image-to-point cloud mapping is performed in panel (E), where obsolete points are removed and newly observed points are added to the prior map. The resulting updated point cloud map is shown in panel (F), with a magnified view of representative changed regions provided in panel (G). Importantly, all operations are performed within the original global coordinate system, ensuring that the coordinates of previously labeled task points remain unchanged throughout the update process.
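The homography estimation step in panel (C) can be sketched as follows, assuming the feature matches are already available as two arrays of corresponding pixel coordinates. This is a plain NumPy illustration using a Direct Linear Transform and a basic RANSAC loop that samples four correspondences without replacement. The function names, iteration count, and inlier tolerance are hypothetical and are not the settings of Algorithm 1.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct Linear Transform: H such that dst ~ H @ src (needs >= 4 pairs)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def project(h, pts):
    """Apply homography `h` to (N, 2) pixel coordinates."""
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]

def ransac_homography(src, dst, iters=500, tol=2.0, seed=0):
    """Return (H, inlier_mask) from the largest consensus set found."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        # Minimal sample: four distinct correspondences.
        idx = rng.choice(len(src), size=4, replace=False)
        with np.errstate(all="ignore"):  # degenerate samples may divide by 0
            h = fit_homography(src[idx], dst[idx])
            err = np.linalg.norm(project(h, src) - dst, axis=1)
        inliers = err < tol  # NaN errors compare False and are rejected
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("not enough inliers to fit a homography")
    # Refit on all inliers of the best model.
    return fit_homography(src[best], dst[best]), best
```

In practice a library routine with coordinate normalization would usually be preferred; the sketch only shows how sampling, consensus counting, and refitting fit together.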
Second, the localization performance of the proposed framework is quantitatively evaluated by comparison with widely used state-of-the-art LiDAR-based SLAM systems. Specifically, Tightly-coupled LiDAR Inertial Odometry via Smoothing and Mapping (LIO-SAM) 54 and Fast-LIO2, 55 which do not incorporate map update mechanisms, and LT-Mapper, 30 which supports long-term map maintenance, are selected as benchmark methods. Localization accuracy is assessed using the Absolute Trajectory Error (ATE) metric. 19
The evaluation focuses on the robot’s seventh operational run, which was conducted approximately two months after the initial mapping and during which substantial environmental changes had occurred in the farm scenario. The corresponding localization trajectories and quantitative ATE results are reported in Figure 7 and Table 4, respectively. The results demonstrate that the proposed framework maintains accurate and stable localization performance over long-term deployment, even under significant environmental changes, thereby validating its effectiveness for practical robotic applications.

Comparison of results from the Robot’s 7th operation in the farm dataset.
Comparison of ATE in the 7th operation.
Finally, to further evaluate the long-term localization stability of the proposed framework in dynamic environments, the root mean square error (RMSE) of the Absolute Trajectory Error (ATE) is statistically analyzed over 32 operational runs conducted in the real farm environment across a nine-month period. RMSE is a widely adopted performance metric in the SLAM community and provides an objective measure of cumulative localization accuracy over time.
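For clarity, the metric can be sketched as below. The sketch assumes the estimated and reference trajectories are already time-associated and expressed in the same frame, and that only the translational error is counted. The function name is illustrative.

```python
import math

def ate_rmse(estimated, reference):
    """RMSE of the per-pose translational error between two trajectories.

    Both inputs are equal-length sequences of (x, y, z) positions that are
    already associated in time and expressed in a common frame.
    """
    if len(estimated) != len(reference) or len(estimated) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - r) ** 2 for e, r in zip(est_pose, ref_pose))
        for est_pose, ref_pose in zip(estimated, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the errors are squared before averaging, a few large deviations dominate the result, which is why runs with partial localization loss stand out clearly in this metric.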
Figure 8 presents both a heatmap and a line chart summarizing the RMSE results for all evaluated algorithms. In the heatmap, darker colors indicate larger localization errors, while the value “

Heatmap and line chart of RMSE values for the Absolute Trajectory Error (ATE) over 32 operational runs spanning nine months in a farm environment. Darker colors indicate larger errors, while “
The LT-Mapper algorithm, which incorporates a map maintenance mechanism, is also evaluated for comparison. While LT-Mapper is able to adapt to environmental changes to some extent, its performance is consistently weaker than that of the proposed framework, primarily due to the local update limitations discussed in Section 3 and illustrated in Figure 1. This observation is further supported by the boxplots shown in Figure 9, where larger interquartile ranges indicate poorer localization stability over long-term operation.

Boxplot comparison of RMSE values for different localization algorithms over 32 runs in a farm environment. Larger values and wider distributions indicate lower localization accuracy and stability, while “
To further assess the contribution of the map update module within the proposed lifelong mapping framework, an ablation study was conducted. Figure 10 compares localization performance across 32 repeated runs under two configurations: with map update enabled and with map update disabled. When the map update module is disabled, localization error accumulates steadily over time, and localization failure occurs in later operational stages. In contrast, enabling the map update module allows the system to continuously correct deviations induced by dynamic environmental changes, thereby maintaining stable and accurate localization over long-term deployment.

Comparison of localization error over 32 long-term runs in the farm environment with the map update module enabled and disabled. Localization accuracy is evaluated using the RMSE of the Absolute Trajectory Error (ATE).
To quantify the localization advantages of LLMF in long-term dynamic scenarios more comprehensively, Table 5 presents a statistical analysis of the RMSE of the Absolute Trajectory Error (ATE-RMSE) over the 32 farm runs. In terms of sample coverage, LLMF completed all 32 runs without any localization failure. In contrast, LIO-SAM and Fast-LIO2 yielded only 25 and 17 valid samples, respectively, owing to localization failures in the later stages, which confirms the stability of LLMF in long-term deployments. The mean RMSE of LLMF is 0.2567 m, lower than the 1.2701 m of LIO-SAM, the 0.7652 m of Fast-LIO2, and the 0.4021 m of LT-Mapper. Moreover, its standard deviation (0.0584 m) is the smallest and its 95% confidence interval ([0.2357 m, 0.2778 m]) is the narrowest, indicating that its localization accuracy is more consistent than that of the comparative algorithms. Significance testing shows that the differences between LLMF and all comparative methods reach a statistically significant level (
Statistical analysis of ATE-RMSE for 32 farm runs.
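As a consistency check on the reported LLMF statistics, the sketch below computes a 95% confidence interval for the mean from the reported mean, standard deviation, and number of runs. It assumes a two-sided Student-t interval with 31 degrees of freedom (critical value approximately 2.0395). The use of a t-interval is an assumption made here; the source does not state which method was used.

```python
import math

def t_confidence_interval(mean, sample_std, n, t_crit):
    """Two-sided confidence interval for a mean: mean +/- t * s / sqrt(n)."""
    half_width = t_crit * sample_std / math.sqrt(n)
    return mean - half_width, mean + half_width

# Reported LLMF values: mean 0.2567 m, standard deviation 0.0584 m, n = 32.
# 2.0395 is the 97.5th percentile of Student's t with 31 degrees of freedom.
lower, upper = t_confidence_interval(0.2567, 0.0584, 32, 2.0395)
print(f"[{lower:.4f} m, {upper:.4f} m]")
```

Under this assumption the bounds agree with the reported interval to within about 0.0001 m, a difference attributable to rounding of the inputs.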
These results demonstrate that the map update module is a critical component of the proposed framework. Removing this module significantly degrades the system’s ability to adapt to environmental dynamics, leading to reduced localization accuracy and stability, particularly in long-duration and large-scale operational scenarios.
The experimental results demonstrate that the proposed lifelong mapping framework achieves robust localization performance while substantially improving the computational efficiency of change detection and map maintenance. Compared with traditional point cloud matching-based methods and grid division-based approaches, the proposed image-based change detection strategy reduces computation time from several hours to several minutes under comparable experimental conditions. This improvement can be primarily attributed to two factors: first, the transformation of computationally expensive 3D point cloud operations into efficient 2D image processing tasks; and second, the strong capability of deep learning models to capture structural changes in projected representations. From an engineering perspective, these characteristics make the proposed framework particularly suitable for long-term robotic deployment in large-scale and dynamic environments. The nine-month experimental validation conducted in a real farm scenario confirms that the framework can reliably cope with gradual and substantial environmental changes over extended operational periods, without requiring frequent manual intervention.
A key strength of the proposed approach lies in its adoption of a global-to-local map maintenance strategy. By performing change detection and updates at the global map level while preserving the original coordinate system, the framework enables incremental map evolution without introducing coordinate drift or inconsistency. As a result, existing semantic annotations, navigation targets, and task-specific reference points defined in the initial map remain valid throughout long-term operation, eliminating the need for costly re-annotation. This design choice distinguishes the proposed framework from many existing long-term mapping approaches that rely on local or incremental updates, which may accumulate alignment errors over time. By maintaining coordinate system consistency, the framework provides a practical pathway toward sustainable, long-term autonomy, supporting continuous robotic operation in environments subject to ongoing structural changes.
Although the proposed approach supports accurate long-term robot localization in dynamic environments, it has certain limitations. The core design idea of this paper is to project two point cloud (PCD) maps acquired at different times into Bird’s-Eye View (BEV) and Frontal View (FV) 2D representations. Feature registration and difference comparison then identify the regions of environmental change, enabling selective updating of the prior map. This update mechanism operates in a post-task mode: it runs offline after the robot finishes a task and cannot update the map synchronously during task execution. The applicability of this design is limited in two types of scenarios. The first is highly dynamic environments, such as warehouses with frequent cargo handling and rapidly moving temporary obstacles, where the offline update mode cannot respond to high-frequency environmental changes in real time. The second is scenes in which extreme weather causes large-scale, sudden changes in spatial geometry, such as heavy snow covering the original terrain or heavy rainfall altering the surface morphology. Such drastic changes significantly reduce the number of feature matches available for projection registration, which in turn degrades the reliability of change detection and map update. It should be noted that map maintenance and localization under such extreme changes remain an open technical challenge in LiDAR-based robot localization. In future engineering applications, efficient map maintenance and updates could be achieved by adding QR codes or magnetic induction markers, or by adopting RTK-aided positioning in outdoor scenarios free from magnetic field interference.
Conclusion
This paper addresses the degradation of robotic localization performance caused by the failure of prior maps in long-term dynamic environments. To this end, a modular lifelong mapping framework (LLMF) is proposed, which performs map maintenance through change detection based on point cloud projection into the image domain. The framework is designed to support long-term robotic operation without altering the global coordinate system of the original map.
The main contributions of this work can be summarized as follows.
First, a complete lifelong mapping framework is developed by integrating global localization, multi-view point cloud projection alignment, image-based change detection, and point cloud map update into a unified system. This modular design enables the systematic perception of environmental changes and the automated maintenance of global maps during long-term operation.
Second, an efficient integrated strategy for point cloud registration and change detection is proposed, consisting of three collaboratively designed algorithms. Algorithm 1 projects 3D point clouds into 2D representations and estimates the relative spatial relationship between maps from different periods through feature-based registration. The change detection algorithm leverages a deep learning-driven change detection model to efficiently identify structurally altered regions in the environment. Algorithm 2 fuses the outputs of the previous two, back-projects image-level differences into the point cloud space, and adds and removes the changed points while preserving the original global coordinate system, thereby achieving adaptive map maintenance. This strategy avoids the high computational cost associated with direct 3D point cloud comparison and significantly improves the computational efficiency and practicality of map maintenance in long-term dynamic environments.
Third, extensive experimental evaluation is conducted using both an open-source dataset and a long-term real-world farm dataset spanning nine months. The results demonstrate that the proposed framework preserves localization accuracy before and after map updates while reducing the computation time required for change detection from several hours to several minutes, highlighting its suitability for practical engineering deployment.
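The back-projection and update step attributed to Algorithm 2 can be sketched as follows. This is a simplified illustration rather than the published algorithm: it uses only a BEV grid instead of the fused BEV and FV views, assumes the new points have already been transformed into the prior map's coordinate frame by the alignment step, and takes a hand-written binary change mask in place of the model output. All function and parameter names are hypothetical.

```python
def cell_of(point, origin, resolution):
    """BEV grid cell (row, col) containing a 3D point; rows follow y, columns follow x."""
    row = int((point[1] - origin[1]) // resolution)
    col = int((point[0] - origin[0]) // resolution)
    return row, col


def mask_to_cells(mask):
    """Collect the (row, col) indices of non-zero pixels in a binary change mask."""
    return {
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    }


def update_map(prior_points, new_points, changed_cells, origin, resolution):
    """Replace prior points inside changed cells with new points from the same cells.

    Points outside the changed cells are returned untouched, so the prior
    map's coordinate frame is preserved.
    """
    kept = [p for p in prior_points
            if cell_of(p, origin, resolution) not in changed_cells]
    added = [p for p in new_points
             if cell_of(p, origin, resolution) in changed_cells]
    return kept + added


# Hand-written change mask (assumption): one changed cell at row 0, column 2.
change_mask = [[0, 0, 1, 0]]
prior = [(0.2, 0.2, 0.0), (1.2, 0.2, 0.0)]
new = [(0.25, 0.2, 0.0), (1.2, 0.3, 1.0), (1.3, 0.2, 1.5)]

updated = update_map(prior, new, mask_to_cells(change_mask), (0.0, 0.0), 0.5)
print(updated)
```

Because unchanged cells keep their original points, any operational points defined in the prior map remain valid after the update.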
With respect to the two fundamental challenges in lifelong mapping, the extent to which each is addressed is as follows. Fully addressed challenge: accurate alignment of point cloud map coordinate systems across different periods and maintenance of coordinate consistency. LLMF adopts a BEV+FV multi-view projection fusion strategy, combined with ORB feature matching and homography matrix estimation, enabling high-precision registration without re-annotating semantic information. The “global-to-local” update mechanism eliminates coordinate drift and map ghosting. In 32 farm runs, the registration error is only at the centimeter level, with the narrowest 95% confidence interval (
Future work will focus on extending the robustness and applicability of the proposed framework. In particular, we plan to investigate more resilient projection and alignment strategies for handling extreme environmental changes, to explore multi-modal change detection methods that incorporate additional sensory cues, and to extend the framework to multi-robot scenarios to support distributed lifelong map maintenance.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Sichuan Science and Technology Program (2025YFHZ0103, 2025HJRC0022, 2023NSFSC1985), and Jiangsu Distinguished Professor Programme.
Conflicts of interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
