Abstract
A reliable autonomous navigation system for environments with complete darkness, fog, and smoke remains an open problem in mobile robotics, because traditional computer vision approaches rely heavily on illumination and clear sightlines. This paper proposes a ROS-based framework for embedded hardware that provides reliable environmental perception and decision-making through the fusion of thermal and LiDAR sensors, combined with a machine learning-based perception module. The framework is implemented on a Raspberry Pi 4 to demonstrate that such multi-sensor fusion techniques can run on embedded devices. It builds a spatiotemporal environmental model from the geometric information of a 2D LiDAR sensor and thermal intensity information. Real-time obstacle detection and classification are carried out by a YOLOv5-based CNN on the fused data stream, ensuring accurate detection even under poor lighting conditions. Experimental evaluation under complete darkness, smoke, and fog showed 96%–97% accuracy, outperforming thermal-only and LiDAR-only approaches by 12% and 8%, respectively. The fused framework proved feasible for real-time operation on the Raspberry Pi 4, achieving a 100% navigation success rate.
Keywords
Introduction
The robustness of autonomous navigation in degraded visual conditions such as fog, smoke, dust, and poor illumination remains a major concern for robotic systems. Vision-based SLAM systems lose robustness under illumination changes and feature degradation.1–3 The illumination-independent perception of thermal sensors is weakened by temperature-contrast variability and environmental factors.4–6 The precise geometric mapping of LiDAR sensors degrades in dense fog and smoke because of scattering and attenuation effects.7–9 Recent developments in multimodal perception focus on integrating different sensors to improve robustness.3,10,11 However, existing systems are either GPU-based or not implemented on embedded hardware platforms. 12 To overcome these drawbacks, this research proposes a feature-level LiDAR–thermal fusion framework within a fully ROS-based autonomous navigation system.

This research makes several contributions to robust autonomous navigation under poor visibility. First, it proposes a feature-level fusion framework suited to embedded hardware that integrates LiDAR geometric features with thermal intensity features, improving perception reliability under fog, smoke, and low illumination. Second, it integrates the fusion framework into a complete SLAM-based autonomous navigation system. Third, experimental evaluations in smoke, fog, and darkness achieved ∼97% detection accuracy and 100% navigation success, and real-time feasibility was demonstrated on a Raspberry Pi 4 with an average latency of 0.5 s.

The rest of this paper is structured as follows. Section 2 discusses the Related Work. Section 3 explains the Proposed System Design, including the overall architecture and integration framework. Section 4 explains the Implementation of the ROS-Based Autonomous System. Section 5 presents the Mathematical Formulation of the Proposed Thermal–LiDAR Fusion Framework. Section 6 explains the Experimental Analysis. Section 7, Results and Discussion, presents the machine learning-based object detection and analyzes the performance of the system in adverse conditions such as smoke, fog, and low light. Section 8 presents the Limitations and Future Work, and Section 9 concludes the paper.
Related work
Vision-based navigation under adverse conditions
Schichler et al. 7 examined the degradation of camera-based SLAM performance in low-visibility environments, which leads to considerable feature loss and tracking instability. Luong et al. 13 extended this line of work to the robustness challenges of vision-based SLAM in dynamic environments.
Thermal-based perception systems
Thermal imaging has been studied as a means of improving perception in smoke and darkness. Research in 14 showed that thermography improves human detection in obscurant conditions. However, in outdoor settings, detection reliability was reduced by variations in temperature gradient and fog density, as reported in 15.
LiDAR-based approaches
LiDAR sensing has demonstrated robustness in low-light navigation scenarios. 16 However, in their analysis of backscatter effects in dense fog and smoke, Hevisov et al. 17 noted that attenuation considerably degrades the reliability of depth measurements.
LiDAR–thermal fusion strategies
Existing multimodal fusion architectures that combine LiDAR and thermal sensing have shown improved localization and detection performance. For instance, Haghbayan et al. 18 demonstrated improved underground localization by combining LiDAR and thermal sensing, and Fritsche et al. 19 achieved better pedestrian tracking through multimodal fusion. To compare the proposed system with existing LiDAR–thermal fusion architectures, Table 1 lists representative recent studies20–24 with respect to sensing modalities, performance, computational complexity, and system-level integration.
Advances in low-visibility navigation and perception.
As Table 1 shows, previous works have demonstrated significant improvements in perception robustness using multimodal fusion. Nevertheless, some limitations remain. Most solutions rely on computationally complex models, such as GPU-based deep learning implementations, which makes them infeasible for embedded hardware platforms. Another limitation is that most works validate the solution only at the algorithmic level, without integrating it into a fully autonomous system. Validation of a real-time embedded implementation, especially in smoke and fog environments, is also not addressed in most works.
Recent studies have explored advanced artificial intelligence techniques to enhance perception and decision-making in autonomous systems. Reinforcement learning approaches for robot navigation and adaptive detection systems have been proposed alongside emerging multimodal perception frameworks such as transformer architectures, multimodal large language models, and vision–language models for object detection and scene understanding in intelligent transportation and autonomous driving applications.25–31
Research gap in embedded hardware thermal–LiDAR fusion frameworks
The ability to navigate through fog, smoke, rain, dust, and low-light conditions remains a major challenge for autonomous systems in defense, industrial inspection, disaster relief, and transportation applications. While vision-based systems work very well in clear conditions, their performance drops significantly in the presence of obscurants, creating a need for robust multimodal sensing. Prior studies have examined benchmarking, deep fusion models, thermal-domain perception, LiDAR enhancement, and embedded real-time pipelines. Although recent advances include thermal-aware detection, LiDAR de-noising in scattering media, and transformer-based fusion networks, key limitations persist: reduced robustness under severe ambiguities, computational demands too high for embedded platforms, and limited integration of thermal sensing into full navigation stacks. Table 1 summarizes representative contributions and the remaining gaps most relevant to this work. Although LiDAR–thermal fusion has been explored, existing approaches leave several important limitations unaddressed. Many systems emphasize algorithm-level fusion or rely on GPU-class hardware, limiting their suitability for lightweight or field-deployable robots. Several pipelines operate only as standalone perception modules rather than being integrated into complete navigation frameworks, reducing their ability to support real-time planning and control. Experimental validation is also often narrow, with few side-by-side comparisons of fused, thermal-only, and LiDAR-only sensing under controlled smoke, fog, and darkness. In contrast, the proposed system adopts a feature-level fusion pipeline optimized for embedded hardware, integrates the fused perception into a full ROS navigation stack, and provides experimental evidence of improvements in detection accuracy and navigation reliability.
Novelty of the proposed system
Although 2D LiDAR ranging, thermal imaging, YOLO detection, and ROS navigation are individually well explored in robotics, their combination in a lightweight, deployable form targeted at low-visibility conditions distinguishes this work. The system centers on an embedded strategy that correlates thermal intensity with LiDAR-derived geometry in real time on a Raspberry Pi 4. Unlike previous approaches that relied on GPU-class machines or isolated perception modules, this framework integrates the fusion fully within a ROS navigation stack and is experimentally validated in smoke, fog, and darkness. The novelty lies in its system-level integration and its feasibility on embedded hardware platforms.
Bridging the computational efficiency gap
Although transformer-based fusion models show promising accuracy, their computational complexity makes them unsuitable for low-power platforms. In contrast, the proposed work implements a lightweight fusion system on a Raspberry Pi 4.
Transitioning from algorithm to integrated system
The majority of previous work demonstrates the novelty of an algorithm rather than its integration into a complete robotic system. In contrast, the proposed work implements and tests an end-to-end system on the rover and demonstrates its feasibility.
Addressing the perceptual shortfall in obscurants
Research has also identified the absence of thermal perception as a major failure mode in visually degraded environments such as smoke. This system therefore includes a thermal camera as a primary perception sensor, providing the thermal awareness needed to operate safely in dark and obscurant environments, a known limitation of LiDAR-only implementations.
Providing quantitative comparative validation
There is a clear need for empirical comparisons of the fused sensing approach against individual modalities under sustained adverse conditions. This paper presents a controlled side-by-side evaluation and demonstrates a 2.0%–8.0% improvement in detection accuracy and navigation stability compared with thermal-only and LiDAR-only approaches.
Demonstrating real-time performance on accessible hardware
One common bottleneck in deploying such systems is running perception and navigation on low-power embedded platforms. Real-time performance was demonstrated for the proposed system, with an average latency of 0.5 s on a Raspberry Pi 4, confirming its feasibility under embedded platform constraints. Beyond latency, the main contribution of the proposed system is the design of the fusion framework itself, which is integrated into a complete ROS navigation stack, unlike existing architectures that rely on a single dominant sensor.
As can be seen in Figure 1, the conventional approach in existing designs processes information from a primary sensor (for instance, a LiDAR sensor for geometric information or a thermal camera for thermal information) through a dedicated perception module. The output, a list of obstacles or a basic map, is then sent to an independent planning and control stack. Although this approach has proven useful, its inherent flaw is the lack of deeper algorithmic integration: it does not compensate for the weaknesses of the individual sensors, making the system less effective in extreme conditions such as heavy smoke or scenes with low thermal contrast. The system architecture proposed in this paper addresses this flaw by introducing a central fusion core, as explained in Section 3.

High-level architecture of the proposed navigation system.
Proposed system design
The proposed system provides robust navigation in low-visibility conditions by combining thermal and LiDAR information in a modular ROS design. The system architecture consists of Sensing, Processing, and Control layers, as shown in Figure 2.

Data flow pipeline of the implemented rover, showing only deployed modules with Hector SLAM as the mapping component.
Standard architecture of existing LiDAR-thermal navigation systems
Although recent versions of the YOLO series have shown improved detection accuracy, their increased computational complexity results in longer inference times on CPU-based embedded platforms. Because the primary requirement in this work is real-time autonomous navigation with limited resources, YOLOv5 was chosen for its good balance of accuracy and real-time performance. The proposed fusion framework can nevertheless be extended to other YOLO versions when more computational resources are available.
Fusion of LiDAR and thermal imaging
The fusion framework starts with temporal synchronization, where thermal frames and LiDAR scans are synchronized based on their ROS timestamps. As shown in Figure 2, an approximate time synchronizer matches each thermal frame with its nearest LiDAR scan, ensuring that both modalities describe the same time instant and minimizing differences caused by their different frame rates. Once temporal synchronization is complete, spatial synchronization is carried out by transforming the LiDAR point cloud into the coordinate frame of the thermal camera using the extrinsic calibration parameters; this step is also shown in Figure 2. After synchronization, the feature-level fusion is performed. The thermal imagery provides radiometric descriptors, while the LiDAR scan provides geometric descriptors. These descriptors are combined into a unified feature descriptor for each LiDAR point and its corresponding thermal pixel. This representation is then used to generate obstacle detections and cost maps for the ROS navigation system, supporting operation in challenging conditions such as smoke, fog, and darkness.
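As an illustration of this synchronization step, the following minimal sketch (not the deployed code) pairs thermal frames with their nearest LiDAR scans using the ROS message_filters approximate time synchronizer; the topic names and the 0.05 s tolerance are assumptions.

#!/usr/bin/env python
# Illustrative sketch: approximate time synchronization of thermal frames and
# LiDAR scans with ROS message_filters. Topic names and slop value are assumed.
import rospy
import message_filters
from sensor_msgs.msg import Image, LaserScan

def synced_callback(thermal_img, lidar_scan):
    # Both messages fall within the configured time slop, so downstream fusion
    # can treat them as describing the same instant.
    dt = abs((thermal_img.header.stamp - lidar_scan.header.stamp).to_sec())
    rospy.loginfo("Fusing pair with time offset %.3f s", dt)

if __name__ == "__main__":
    rospy.init_node("thermal_lidar_sync")
    thermal_sub = message_filters.Subscriber("/flir/image_raw", Image)
    lidar_sub = message_filters.Subscriber("/scan", LaserScan)
    # queue_size buffers recent messages; slop is the tolerance epsilon in seconds.
    sync = message_filters.ApproximateTimeSynchronizer(
        [thermal_sub, lidar_sub], queue_size=10, slop=0.05)
    sync.registerCallback(synced_callback)
    rospy.spin()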
In addition to multimodal fusion approaches, deep learning–based object detection models are commonly integrated into navigation systems to enhance obstacle recognition. Among these, the YOLOv5 architecture is widely adopted for its lightweight design and real-time inference performance. In this work, a modified version of YOLOv5 is employed for thermal-image–based obstacle detection; its configuration and training are described in the Experimental Analysis section.
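For illustration, a custom-trained YOLOv5 model can be loaded and applied to a pre-processed thermal frame as sketched below; the weights file name, input file, and confidence threshold are assumptions rather than the values used in this work.

# Illustrative inference sketch for a custom-trained YOLOv5 model on a thermal frame.
import torch
import cv2

model = torch.hub.load("ultralytics/yolov5", "custom", path="thermal_best.pt")
model.conf = 0.4                                 # assumed confidence threshold

frame = cv2.imread("thermal_frame.png")          # pre-processed thermal frame
results = model(frame)                           # forward pass
for *xyxy, conf, cls in results.xyxy[0].tolist():
    print(f"class={int(cls)} conf={conf:.2f} box={xyxy}")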
SLAM method used
In this work, Hector SLAM is used as the primary mapping algorithm for the 2D LiDAR. Hector SLAM was selected due to its suitability for lightweight embedded hardware platforms and its ability to operate without wheel-odometry input, which aligns with the constraints of the rover platform. Although ROS supports multiple SLAM packages such as Gmapping, only Hector SLAM was used in the implemented system and in all reported experiments.
Feature-level fusion of LiDAR geometric descriptors and thermal intensity
In the proposed work, feature-level fusion combines LiDAR geometric descriptors with thermal intensity values. The LiDAR features include the raw distance measurements and local contour-shape descriptors computed from adjacent scan points. These contour-shape descriptors capture the geometric pattern formed by the LiDAR beams, such as curvature, angular change, and segment orientation, and indicate whether the scanned surface corresponds to a flat wall, a corner, or a rounded obstacle. These spatial features are aligned with the corresponding thermal regions to form a fused feature vector used for mapping and obstacle detection.
Implementation of the ROS-based autonomous system
The system was realized on a custom mobile rover platform whose mechanical design 32 includes a central chassis forming the main body, supported by a rigid support frame and a suspension system connected to the wheels/end effectors for mobility, as shown in Figure 3. The core sensing equipment consists of a FLIR A65 thermal camera and a YD LiDAR X2 sensor, mounted on the chassis so that their fields of view overlap. A Raspberry Pi served as the onboard processing unit for data fusion and navigation within the ROS framework, and the components were interfaced through a mechanism of gears/joints and actuators for optimal positioning. A navigation node used the fused sensing output to improve path planning and motion control through the ROS Move Base layer. Finally, the system relied on an intensive calibration process to ensure accurate world positioning by precisely aligning the LiDAR and thermal camera frames based on the temporal coherence observed across the data streams.

Physical rover setup with integrated LiDAR and thermal camera.
Extrinsic calibration between LiDAR and thermal camera
Extrinsic calibration between the LiDAR sensor and the thermal camera was conducted to establish accurate spatial alignment between the geometric and radiometric data. A custom calibration target was prepared by heating a checkerboard pattern to create strong thermal contrast detectable by the infrared camera, while reflective markers were attached to the same board to ensure clear geometric detection by the LiDAR. Multiple synchronized observations were recorded from different viewpoints by moving the rover around the calibration target to capture sufficient feature diversity. Cross-modal feature correspondences were then identified between LiDAR point clusters and thermal image features using a feature-matching procedure. The rigid-body transformation between the two sensor coordinate systems was determined by solving a least-squares optimization problem to obtain the rotation matrix and translation vector relating the two frames.
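A minimal sketch of this least-squares step, assuming matched 3D correspondences have already been extracted from the LiDAR clusters and the thermal-target features, is the classical SVD (Kabsch) solution:

# Illustrative sketch of the least-squares rigid-body estimation described above.
import numpy as np

def estimate_rigid_transform(lidar_pts, thermal_pts):
    """Return R (3x3) and t (3,) minimizing ||R @ lidar + t - thermal||^2."""
    mu_l, mu_t = lidar_pts.mean(axis=0), thermal_pts.mean(axis=0)
    H = (lidar_pts - mu_l).T @ (thermal_pts - mu_t)   # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # correct an improper (reflected) rotation
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = mu_t - R @ mu_l
    return R, t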
Software architecture and data processing pipeline
The software framework, as depicted in Figure 4, was developed with the ROS Noetic distribution to coordinate data acquisition, synchronization, processing, and decision-making. Within the system architecture, several ROS nodes 33 were developed to work collaboratively toward efficient data handling and real-time environmental perception. The thermal image processing module received the raw thermal images and carried out the necessary pre-processing steps, such as noise removal and contrast normalization. The LiDAR point cloud processing module processed the incoming 2D LiDAR scans, filtering out noise and outlier points before extracting key geometric information. The processed outputs were combined in the feature extraction and fusion module, where thermal and LiDAR information streams were fused in the spatiotemporal domain to obtain a unified model of the environment, from which relevant thermal and spatial information was extracted for higher-level analysis. For obstacle detection and classification, a CNN-based YOLOv5 algorithm was trained with a custom dataset of obstacles and low-visibility situations.

Block diagram of the human detection pipeline using YOLOv5.
This model was used to identify and classify obstacles in real time, allowing the rover to perceive its surroundings and navigate effectively.
Experimental environment, obstacle diversity, and accuracy results
The experiments, carried out in a controlled indoor environment with static, dynamic, low-contrast, and high-contrast obstacles under smoke, fog, and darkness conditions, showed 96.8%, 97.2%, and 97.0% accuracy, respectively. These results demonstrate that the framework is stable and performs well under different degraded visibility conditions.
Robustness and navigation performance
The fusion framework retained ∼96%–97% accuracy across different thermal contrasts and obstacle materials, proving robust to environmental changes. In terms of navigation, the approach achieved a 100% success rate on a controlled 15 m course, with plans to extend to outdoor use cases in the future.
Feature extraction from LiDAR and thermal imaging
The proposed system extracts significant geometric and thermal descriptors from each sensor before fusion. This subsection explains the preprocessing and feature computation steps for each sensor. Figure 5 shows the overall data flow of the proposed fusion system. First, LiDAR scans and thermal images are synchronized. Geometric descriptors are then extracted from the LiDAR scans and thermal descriptors from the thermal images. These descriptors are fused and processed by a modified YOLOv5 algorithm for obstacle classification, and obstacle-avoidance decisions are made within the ROS framework.

Flowchart of the feature-level fusion algorithm.
The proposed fusion approach works on the principle of mapping the geometric descriptors obtained from the LiDAR onto the thermal image domain. For each LiDAR measurement of the form (r_i, θ_i), local discontinuity and curvature are calculated as structural features to capture spatial changes. Using the camera–LiDAR calibration, each LiDAR point is projected onto the thermal image plane, and the thermal intensity at that location is extracted as a feature. The aggregated feature set (F_t) is then used within the SLAM-based navigation framework for enhanced robustness in poor visibility. The fusion is formalized mathematically in Section 5.
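The projection step can be sketched as follows, assuming a pinhole model for the thermal camera with intrinsic matrix K and the extrinsics (R, t) from calibration; this is an illustrative sketch, not the deployed implementation.

# Illustrative sketch: project one LiDAR return (r, theta) into the thermal
# image and sample its intensity. K, R, t are assumed calibration results.
import numpy as np

def project_beam(r, theta, R, t, K, thermal_img):
    p_lidar = np.array([r * np.cos(theta), r * np.sin(theta), 0.0])  # planar scan point
    p_cam = R @ p_lidar + t                    # LiDAR frame -> camera frame
    if p_cam[2] <= 0:                          # behind the image plane
        return None
    u, v, w = K @ p_cam
    u, v = int(u / w), int(v / w)              # pixel coordinates
    h, w_img = thermal_img.shape[:2]
    if 0 <= v < h and 0 <= u < w_img:
        return thermal_img[v, u]               # sampled thermal intensity
    return None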
LiDAR feature extraction
The raw 2D LiDAR scan data was first de-noised using a statistical outlier removal filter, which removed false echoes caused by fog, smoke, and shiny surfaces, following point cloud filtering best practices. 34 Each filtered scan was then mapped to a structured geometric feature description in terms of range-bearing pairs, local curvature, and boundary discontinuities, which are widely used to identify obstacle contours. 35 To increase robustness in dynamic low-visibility scenarios, temporal stability features were derived by comparing point-to-point displacements and contour persistence, as in classical scan matchers. 36 The resulting geometric feature set provides a compact representation amenable to embedded hardware.
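A simplified sketch of these geometric descriptors (per-beam range and bearing, boundary discontinuity, and a three-point curvature proxy) computed from a de-noised scan is shown below; the exact descriptor definitions used in the deployed filter may differ.

# Illustrative sketch of the LiDAR geometric descriptors named above.
import numpy as np

def lidar_descriptors(ranges, angle_increment):
    ranges = np.asarray(ranges, dtype=float)
    # Boundary discontinuity: jump in range between adjacent beams.
    discontinuity = np.abs(np.diff(ranges, prepend=ranges[0]))
    # Local curvature proxy: deviation of each range from its neighbours' mean.
    curvature = np.abs(ranges[1:-1] - 0.5 * (ranges[:-2] + ranges[2:]))
    curvature = np.pad(curvature, 1, mode="edge")
    bearings = np.arange(len(ranges)) * angle_increment
    # One descriptor row per beam: [range, bearing, discontinuity, curvature].
    return np.stack([ranges, bearings, discontinuity, curvature], axis=1)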
Thermal image feature extraction
The thermal frames captured by the FLIR A65 thermal camera were pre-processed with temperature normalization, histogram equalization, and median filtering, in line with thermal imaging guidelines. 37 From each processed frame, regions corresponding to heat signatures were detected by adaptive thresholding, a technique widely used in heat-signature detection systems for human target identification. 38 For each detected region, the system calculated the mean temperature, thermal gradients, blob area, and shape information, features chosen for their ability to distinguish heat signatures from background objects. 39
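The pre-processing chain can be sketched with standard OpenCV operations as below; the kernel and block sizes are illustrative assumptions rather than the tuned values.

# Illustrative thermal pre-processing sketch: normalization, histogram
# equalization, median filtering, and adaptive thresholding of warm regions.
import cv2
import numpy as np

def preprocess_thermal(frame_raw):
    # Normalize raw radiometric counts to an 8-bit intensity image.
    frame = cv2.normalize(frame_raw, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    frame = cv2.equalizeHist(frame)            # stretch thermal contrast
    frame = cv2.medianBlur(frame, 5)           # suppress speckle noise
    # Adaptive threshold isolates warm regions (candidate heat signatures).
    mask = cv2.adaptiveThreshold(frame, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 31, -5)
    return frame, mask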
Cross-sensor alignment and feature fusion
To combine the geometric and thermal data, the LiDAR scan angles were projected into the thermal image coordinates using the extrinsic parameters obtained from the camera–laser calibration. 40 This allowed a combined feature vector to be generated that captures both the geometry and the thermal intensity of the scene, following the multimodal fusion approaches of previous LiDAR–thermal work.3,41 This combined feature vector was finally fed into the YOLOv5 detector.
System integration
The final step was the full integration and synchronization of the hardware and software. The FLIR A65 and YD LiDAR X2 sensors were interfaced to the Raspberry Pi 4 through USB and dedicated communication ports to ensure reliable data transmission, and custom ROS drivers were implemented for continuous, reliable data streaming from both sensors to the central processing unit. All designed ROS nodes were configured on the Raspberry Pi, with ROS topics and services used to control inter-node communication. This architecture made it possible for the sensing, processing, and decision-making modules to coordinate their activities and share data in real time. A thorough calibration procedure was used to achieve precise thermal–LiDAR sensor alignment: intrinsic and extrinsic parameters were refined to establish pixel-level and spatial links among the data sources. The integrated system was then tested in a controlled indoor environment to validate obstacle detection accuracy and navigation performance before final field deployment.
Sensor calibration
To ensure that the LiDAR measurements and thermal images were spatially consistent before fusion, a simple calibration procedure was carried out in the controlled indoor environment shown in Figure 5. The thermal camera's intrinsic parameters, including focal length and principal point, were taken from the manufacturer's model because low-resolution thermal sensors do not reliably detect standard calibration patterns. The relative pose between the LiDAR and thermal camera was estimated using a planar target placed at different positions in the test area, allowing LiDAR scan endpoints to be associated with corresponding regions in the thermal image. A rigid transformation was then computed to align the two sensing frames. Calibration was verified qualitatively by overlaying LiDAR contours onto the thermal image frames and checking that the projected geometry remained consistent across multiple viewpoints. This process ensured that the alignment was adequate for fused mapping and obstacle detection in reduced-visibility environments.
Projection-based fusion of 2D LiDAR scans with thermal pixels
To combine the geometric data from the 2D LiDAR sensor with the thermal intensity data, each LiDAR scan point is projected into the thermal camera coordinates using the estimated extrinsic parameters. The projected point is then mapped onto the thermal image plane, and the corresponding intensity value is sampled from the thermal image. This yields an aligned pair comprising the geometric descriptor extracted from the LiDAR data and the thermal intensity. Repeating this process for all valid LiDAR beams produces a feature vector that associates the local contour shape with the thermal data describing the corresponding region of the scene. This bypasses the need for tile heuristics and allows both modalities to contribute to a unified spatial representation.
Fused-perception contribution versus planner novelty
In this research, the navigation layer relies on the standard ROS Move Base framework, deliberately retained in its original form without modification. This ensures that any improvement in navigation reliability is attributable solely to the LiDAR–thermal perception system and not to changes in the planners. By using the well-established Move Base stack as the baseline, the results show that improved perception alone is sufficient to increase ROS navigation performance, without depending on more complex planning algorithms.
Mathematical formulation of the proposed thermal–LiDAR fusion framework
This section presents the mathematical formulation of the proposed thermal–LiDAR fusion framework used for robust perception and autonomous navigation in degraded visibility environments.
Sensor data representation
At time instant t, the two-dimensional LiDAR sensor provides a scan represented as equation (1).
where r_i and θ_i denote the range and bearing of the ith beam, respectively. This polar representation is commonly adopted in mobile robotic perception and mapping applications.42,43
Simultaneously, the thermal camera captures a long-wave infrared image expressed as equation (2).
where H and W denote the image height and width, and pixel intensities correspond to temperature-dependent radiometric values. Such representations are widely used in thermal perception systems for robotics and autonomous navigation.
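For reference, a plausible rendering of equations (1) and (2), consistent with the symbol definitions above (the exact notation may differ from the original), is:

S_t = \{ (r_i, \theta_i) \}_{i=1}^{N_t}    (1)

I_t \in \mathbb{R}^{H \times W}    (2)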
Temporal and spatial alignment
To ensure consistency between heterogeneous sensor streams, temporal synchronization is enforced as equation (3).
where t_L and t_T are the acquisition times of the LiDAR and thermal data, respectively, and ε denotes an acceptable synchronization tolerance. Similar temporal alignment strategies have been adopted in recent multi-sensor fusion frameworks. 44
Spatial alignment between LiDAR and thermal sensors is achieved using a rigid-body transformation defined as equation (4).
where R ∈ ℝ^(3×3) is the rotation matrix and t ∈ ℝ^3 is the translation vector obtained through extrinsic calibration. This formulation is standard in recent LiDAR–thermal and LiDAR–vision calibration approaches. 45
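A plausible form of equations (3) and (4), based on the definitions above, is:

| t_L - t_T | \le \epsilon    (3)

\mathbf{p}_T = R\,\mathbf{p}_L + \mathbf{t}    (4)

where p_L denotes a LiDAR point and p_T its coordinates in the thermal camera frame.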
Feature extraction
From the LiDAR scan, geometric features are extracted as equation (5).
where d_mean and d_var represent the mean and variance of obstacle distances, and n_points denotes the number of valid returns within a region of interest. Statistical geometric descriptors of this form are commonly used in LiDAR-based perception pipelines.
From the thermal image, radiometric features are computed as equation (6).
where T_mean, T_max, and T_std denote the mean, maximum, and standard deviation of thermal intensities. Such thermal descriptors are widely used for robust human and obstacle detection in low-visibility environments.
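Following the definitions above, the feature vectors in equations (5) and (6) can plausibly be written as:

F_L = [\, d_{\mathrm{mean}},\ d_{\mathrm{var}},\ n_{\mathrm{points}} \,]    (5)

F_T = [\, T_{\mathrm{mean}},\ T_{\mathrm{max}},\ T_{\mathrm{std}} \,]    (6)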
Feature-level fusion
The extracted features are combined at the feature level to exploit complementary geometric and thermal information. The fused feature vector is defined as equation (7).
where || denotes feature concatenation. Feature-level fusion by concatenation has been shown to provide an effective balance between robustness and computational efficiency in embedded systems.
Alternatively, a weighted fusion strategy is adopted as equation (8).
where α and β control the relative contributions of LiDAR and thermal features. Similar weighted fusion formulations have been employed in recent adaptive multi-sensor perception models.
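Consistent with the descriptions above, equations (7) and (8) plausibly take the forms:

F_{\mathrm{fused}} = F_L \,\|\, F_T    (7)

F_{\mathrm{fused}} = \alpha F_L \,\|\, \beta F_T    (8)

where the weighted variant scales each modality's feature block before concatenation; an additive weighted combination is an equally plausible reading of equation (8).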
Learning-based perception and navigation integration
The fused feature vector is provided as input to a learning-based perception module defined as equation (9).
where f (·) represents the trained detection network and ŷ denotes detected object classes and confidence scores. Learning-based fusion models have demonstrated improved robustness in adverse and low-visibility environments.
Detected obstacles are projected onto a two-dimensional occupancy grid map used by the navigation stack, expressed as equation (10).
This representation is subsequently used for path planning and motion control within the ROS navigation framework, consistent with recent embedded autonomous navigation systems.
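One plausible reading of equations (9) and (10), given the definitions above, is:

\hat{y} = f(F_{\mathrm{fused}})    (9)

M_t(x, y) \in \{0, 1\}    (10)

where M_t is the occupancy grid, with cells set to 1 where detected obstacles project and 0 otherwise (the actual grid may also encode unknown space, as in standard ROS occupancy maps).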
Experimental analysis
To ensure that the performance analysis of the developed LiDAR–thermal navigation system is transparent and repeatable, the testing environment was strictly controlled in every test. As noted earlier, obscurants such as smoke and fog can vary considerably in density, so additional measures were taken to stabilize obscurant density and maintain a repeatable test environment. Details are given in the next subsection.
YOLOv5 configuration and training
In this work, the YOLOv5 network architecture is retained, while task-specific training and deployment optimizations are applied to support real-time execution on embedded hardware. The obstacle detection component was built on the YOLOv5 model with minor architectural adaptations to support thermal image input and improve robustness under low-visibility conditions. These modifications maintain computational efficiency suitable for the onboard processor while preserving the core backbone and detection head of the original YOLOv5 architecture. Training followed the standard YOLOv5 procedure, with hyperparameters initialized to their default settings. The training set consisted of images from the indoor testbed (Figure 6), along with additional obstacle examples sourced from publicly available datasets. Standard data augmentation techniques were applied to enhance robustness and generalization. This configuration ensures reproducibility of the training process while enabling real-time deployment on the target hardware platform.

Indoor experimental setup for low-visibility navigation trials.
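As a reproducibility aid, the standard YOLOv5 training entry point can be invoked as sketched below; the dataset YAML, image size, epoch count, and batch size are placeholders, not the exact values used in this work.

# Illustrative invocation of the standard ultralytics/yolov5 training script
# with default hyperparameters, as described above. Paths and counts are assumed.
import subprocess

subprocess.run([
    "python", "train.py",                  # YOLOv5 training entry point
    "--data", "thermal_obstacles.yaml",    # hypothetical dataset definition
    "--weights", "yolov5s.pt",             # pretrained small backbone
    "--imgsz", "640",
    "--epochs", "100",
    "--batch-size", "16",
], check=True)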
Experimental setup, model training and performance analysis
To verify the proposed system experimentally, experiments were conducted indoors with realistic simulated low-visibility conditions. The experiment site contained a 15 m obstacle course that included static and dynamic obstacles of varied sizes and materials. Low visibility was simulated using complete darkness without any ambient illumination, smoke produced by a smoke machine, and fog produced by a high-output fog machine. The rover had to navigate the course autonomously from a fixed starting point to an end point without hitting any obstacles. In every trial, data including obstacle detection events, navigation paths, and traversal time was continuously recorded for post-processing and performance evaluation. The object detection model, central to the rover's obstacle avoidance, was evaluated with standard training and validation metrics and showed successful convergence with high proficiency in locating and classifying objects. The thermal-only and LiDAR-only baselines were included as controlled ablation studies to quantify the individual contribution of each sensing modality within the same hardware, environment, and navigation stack. These baselines are not intended to represent state-of-the-art perception systems; rather, they provide a consistent reference for evaluating the benefit of the fused LiDAR–thermal pipeline. Prior sensor-fusion studies have similarly reported that combining geometric and radiometric cues improves robustness in fog, smoke, and low-light conditions, and our results follow the same trend. While the proposed system does not aim to outperform advanced fusion architectures, it demonstrates that a lightweight, embedded implementation can achieve reliable navigation in degraded visibility. This positions the evaluation as a practical assessment of fused sensing on constrained hardware rather than a comparison against high-complexity fusion models.
Environmental setup and control
To achieve the same low-visibility conditions in all tests, the experiment was conducted in a controlled indoor environment with controlled airflow. Smoke and fog were generated using commercially available smoke and fog machines with adjustable output. Obscurant density was kept consistent across scenarios by running the generator for 3 min for smoke and 4 min for fog, until the target visibility range of about 1.5 to 2.0 m was reached, as measured using a calibrated visibility chart placed at known distances. The complete test layout, with the smoke generator, visibility chart, static and dynamic barriers, and the rover's initial position, is shown in Figure 6. Each experiment was performed five times under the same environmental conditions. Between iterations, the test area was flushed with airflow to remove residual obscurants, and another 3 min of smoke generation was needed to restore the required density before the next run. The rover's initial position and the navigation goal remained the same for all experiments. Environmental variability was further reduced by turning off fans, air conditioning, and natural illumination sources to limit turbulence and thermal drift, and the ambient temperature and humidity were recorded and controlled to maintain thermal contrast. Any run showing uneven dissipation of smoke or fog was discarded and repeated. These measures ensured stable experiments and a fair, reproducible performance evaluation of the proposed system.

Figure 7(a) to (d) shows the complete training and validation behavior of the modified YOLOv5 model; each subfigure captures an important aspect of the learning process. Figure 7(a) shows the bounding box regression loss, which decreases steadily with each training epoch, confirming that the model progressively improves its ability to locate obstacles with higher spatial accuracy.

Live thermography image stream in a smoke filled environment test: (a) convergence of bounding box regression and objectness loss, (b) classification loss and precision curves, (c) validation bounding box error reduction, and (d) validation error stability, and mAP values with improved detection and localization accuracy.
Figure 7(b) shows the objectness loss, which also decreases steadily over the training epochs. Generalization performance is shown in Figure 7(c): the validation bounding box loss (val/box_loss) decreases significantly, indicating that the model generalizes effectively to unseen data. This subfigure also reports recall, showing that the model detects most of the obstacles in the environment. Figure 7(d) shows the precision metric, which improves significantly during training and reaches near-perfect levels, indicating very few false positives; the same subfigure reports the mean average precision, with high mAP_0.5 and mAP_0.5:0.95 scores confirming accurate bounding box detection. Overall, Figure 7 demonstrates stable and effective learning behavior, showing that the final model is robust and accurate and can reliably detect obstacles in the environment.
Machine learning training and evaluation setup
The YOLOv5-based detector module was trained on a dataset of 12,500 thermal images, including 7500 images obtained through controlled experiments in a smoke chamber and 5000 images from publicly available datasets. The dataset was divided into 80% training and 20% validation sets, with balanced data from smoke, fog, and darkness conditions. Data augmentation methods like mosaic augmentation, flipping, scaling, and thermal jittering were used. Performance was evaluated on the validation set using precision, recall, F1-score, and confusion matrices, ensuring robust assessment across diverse low-visibility conditions.
Obstacle detection accuracy
Obstacle detection was evaluated under three visibility scenarios: darkness, smoke, and fog. The sensing technologies showed different advantages in different scenarios. The FLIR A65 thermal camera achieved 92.0% detection in darkness thanks to high radiometric contrast, although it provides no geometric information. Its performance dropped to 89.0% in smoke and 85.0% in fog because of light scattering by airborne particles. The YD LiDAR X2 achieved a consistent 95.0% in all scenarios because its performance does not depend on lighting conditions. When combined, the LiDAR–thermal fusion system achieved 97.0% detection by exploiting the LiDAR's detailed geometry together with thermal imaging's ability to distinguish obstacles with little geometric contrast.
Experimental constraints and realistic performance boundaries
The reported detection accuracy of 97.0% should be interpreted in the context of the controlled indoor environment with a 15.0 m range. Because of the limited range and finite environment, neither the 2D LiDAR scanner nor the thermal camera was operating in a regime where attenuation is a significant concern. The reported accuracy therefore measures how well the combined system localizes a set of predefined obstacles in a predefined environment and should not be extrapolated to outdoor scenarios. Within this predefined environment, however, the system performed reliably.
Statistical robustness of detection accuracy
To assess the reliability of the detection performance, every visibility scenario was evaluated in five independent trials, and the detection accuracy was calculated for each trial. Mean values with standard deviations and 95.0% confidence intervals are reported to provide a statistical interpretation of the variability. As indicated in Table 2, the variance is smallest for the LiDAR–thermal fusion setup, which confirms that the 97.0% accuracy is not an isolated result but an average across trials. Variation in the thermal camera's detection performance was higher in the smoke and fog scenarios because of radiative diffusion, whereas LiDAR detection performance was stable across all five trials.
Obstacle detection accuracy across five independent trials (mean ± SD, 95% CI).
Navigation performance evaluation
To validate the operational capability of the rover, baseline traversal times were measured and compared for the thermal-only, LiDAR-only, and fusion approaches under the same conditions, with five trials per approach. As shown in Table 3, the thermal-only approach took the longest to traverse the course, while the LiDAR-only approach was moderately better but slower in smoke and fog. The fusion approach produced the fastest and most stable navigation times, completing the 15.0 m obstacle course in an average of 120.0 s. This is due to the complementary nature of thermal imaging and LiDAR, which together enable better detection and path planning in degraded conditions.
Baseline navigation time comparison across five runs (mean ± SD).
Results and discussion
The experimental results show that the fused thermal–LiDAR perception framework consistently outperforms single-modality configurations, especially in the most degraded visibility conditions. Across all three test cases, the combined geometric and radiometric information plays an essential role in keeping obstacle detection stable. LiDAR performance deteriorates as beam scattering increases in dense obscurants, and the thermal sensor degrades when contrast is low or obstacles lack heat signatures that stand out against the background. The fused framework mitigates both weaknesses because at least one sensor remains effective under any given degradation.

The performance of the modified YOLOv5 model underlines the value of architecture-level adjustments for handling thermal images. The improved multi-scale feature fusion and attention mechanisms of the modified architecture help separate valid thermal features from background clutter, and this advantage becomes most apparent when obstacles show weak or partially occluded heat patterns. The training process converged stably, and, more importantly, the model generalized across the low-visibility test cases: it continued to detect obstacles despite changing thermal contrast and sparse LiDAR observations, suggesting that the fused feature set provides a more stable input distribution than either sensor alone.

Navigation results show the same benefit of perception fusion. Even when one sensor is temporarily degraded, the rover maintains stable path planning, and the ROS navigation stack clearly benefits from the fused cost map representation. The system's ability to maintain stable obstacle avoidance across repeated trials indicates a robust world model supplied by the fused perception layer, which is especially valuable on embedded hardware that cannot afford computationally heavy fusion frameworks.

The experiments also highlight practical considerations. The fusion framework handles moderate obscurant densities well under controlled smoke and fog, although very high or rapidly changing densities remain challenging. Moreover, while the Raspberry Pi 4 handles real-time inference adequately, its computing margin is limited, and responsiveness is expected to improve on future embedded platforms with greater computing capability. The proposed design thus represents a balanced trade-off between computational cost and perception robustness. Compared with the thermal-only and LiDAR-only approaches, the fused system is more consistently stable, not only in numerical scores but also under varying environmental uncertainty. This is especially important for autonomous systems that must navigate visually degraded settings robustly, making the proposed framework a useful tool for embedded deployment.
Human and obstacle detection performance
In simulated low-visibility scenarios, the system’s fundamental perceptual capacity was evaluated using both human and generic obstacle detection.
Thermal camera performance
The FLIR A65 thermal camera was tested independently. As shown in Table 4, the camera performed particularly well in darkness (92.0%) and smoke (89.0%), exploiting the targets' heat signatures. It performed less well in fog (85.0%) because scattered infrared radiation lowers thermal contrast, producing blurred images with less well-defined targets.
Detection accuracy of thermal camera.
LiDAR sensor performance
The YD LiDAR X4 sensor delivered stable geometric measurements with a 95.0% detection rate in all scenarios (Table 5), confirming its robustness across lighting conditions and visual obscurants. Real-time map creation in these experiments used Hector SLAM with the LiDAR output. Figure 8 shows the LiDAR and thermal map generation output in an RViz window.
Detection accuracy under baseline indoor conditions.

Fused LiDAR–thermal mapping result in RViz.
The active ROS display elements are shown in the left panel: Grid, Map, Laser-Scan, TF, Path, Pose-Array, Marker-Array, Image, and PointCloud2. The main display window shows the 2D occupancy grid map produced by fusing the LiDAR structure data with the thermal data. In the map, free space is marked in white, obstacles are painted black, and unknown space is shown in gray. The red marker and red line display the robot's current position and its path to the target. The fused map provides more distinct obstacle boundaries and is less noisy than LiDAR-only maps, especially when mapping under low visibility, because the thermal data compensates for the degraded LiDAR returns. Figure 8 thus illustrates the benefit of fusing the two sensors' data for effective navigation when visibility is degraded.
Fused system performance
The detection rate obtained with the sensor fusion system was uniformly higher across all test cases, at 97.0%, as recorded in Table 6. This is a significant improvement over the stand-alone sensors. By overcoming the major limitations of the single sensors (the thermal camera's lack of accurate geometry and the LiDAR's inability to detect objects by their thermal characteristics), fusion allowed the thermally detected human to be positioned precisely in the LiDAR-built map, as shown in Figure 8.
Detection accuracy of fused system.
Figure 9(a) to (h) illustrates the full human detection and localization pipeline from the fused LiDAR–thermal stream. Figure 9(a) shows a thermal frame in which the YOLOv5 detector marks the human target with a bounding box and an associated confidence score, indicating reliable detection under poor visibility. Figure 9(b) shows the corresponding LiDAR scan, whose scene geometry enables cross-modal alignment of the thermal detection. Figure 9(c) shows the fused occupancy grid generated from the LiDAR geometry and thermal cues, onto which the detected human position is projected. Figure 9(d) presents the estimated robot pose and trajectory in the fused map, showing how the system maintains localization while detecting the human. Figure 9(e) shows the occupancy grid updated with this detection so that the human position is represented consistently in the modeled environment. Figure 9(f) refines the localization to the exact grid cell of the detected human, highlighting the precision of the projection. Figure 9(g) shows the response of the navigation module, which plans a safe path that accounts for the detected human, demonstrating human-aware navigation. Finally, Figure 9(h) combines all components (detection, mapping, localization, and path planning) into a single fused visualization, showing that the system detects, localizes, and reacts to human presence in real time. Together, Figure 9(a) to (h) gives a complete, self-contained view of the proposed fused perception pipeline.

Human detection with YOLOv5 on fused thermal data: (a) thermal frame with YOLOv5 human detection, (b) LiDAR scan showing scene geometry, (c) fused occupancy grid with projected human position, (d) robot pose and trajectory in the fused map, (e) updated occupancy grid with human detection, (f) refined localization to the exact grid cell, (g) navigation module planning a human-aware path, (h) integrated visualization of detection, mapping, localization, and navigation.
Navigation efficiency
In particular, the performance of the entire navigation stack was measured with respect to path planning and dynamic obstacle avoidance.
Path planning performance
The system successfully negotiated the 15 m course in an average of 120.0 s (Table 7). The planned trajectories were efficient, demonstrating that the planner effectively utilizes the accurate, fused environmental model to minimize travel time and distance.
Path planning performance.
Real-time adaptation
The system's reactivity to obstacles introduced in real time was also evaluated. It showed very good real-time performance, with an average path adjustment and re-planning latency of only 0.5 s. This low latency shows that the system can navigate safely through a dynamically changing, uncertain environment. In addition to the 0.5 s average response latency, real-time performance was quantified in frames per second (FPS). The fused perception pipeline achieved ∼2 FPS on the Raspberry Pi 4, with YOLOv5 detection alone reaching ∼3.5 FPS, and the entire navigation pipeline including Hector SLAM and Move Base ran at ∼2 FPS end to end. These results confirm that the system runs in real time within the constraints of the embedded hardware.
System robustness
The integrated system was subjected to long-term operation under adverse conditions. Detection accuracy remained high, staying above 85.0% even in the worst case, and the system completed all navigation tasks without failure. This confirms both the robustness of the fusion algorithm and the stability of the hardware integration.
Statistical evaluation of navigation trials
To test the repeatability of the system and assess its consistency, several runs were carried out over the 15.0 m indoor navigation course under the same fog density conditions. Traversal time, path-tracking deviation, and obstacle avoidance performance were recorded in each run. Occasional degradations occurred when dense fog caused brief LiDAR dropouts, leading to momentary hesitation along the path. These observations give insight into the system's repeatability and indicate the conditions under which its performance may degrade.
Comparative analysis
A comparative analysis was done to put the performance of the proposed system into perspective.
Comparison with single sensors
As shown in Table 8, the fused system surpassed the thermal-only and LiDAR-only systems in both detection accuracy and navigation performance. The 8.0% detection gain over the thermal-only system in fog and the 2.0% gain over the already robust LiDAR-only system highlight the benefits of fusion. The improved navigation stems from the more detailed and accurate world model available to the planner.
Detection accuracy under mixed low-visibility conditions (fog + smoke).
Comparison with current systems
Compared with systems documented in the literature, the proposed system exhibits highly competitive, if not better, performance. It reports higher detection accuracy in low-visibility environments than the all-terrain rover presented in Table 1 and offers a more comprehensive solution than the naval robot, which primarily targeted obstacle recognition. The main discriminator is the demonstrated real-time fusion on a low-cost embedded hardware platform, providing a strong trade-off among performance, reliability, and affordability. To show the competitiveness of the proposed approach, a detailed quantitative comparison with existing navigation systems for low-visibility conditions was carried out using key parameters reported in the literature, such as Absolute Trajectory Error (ATE), navigation success rate, detection accuracy, and system complexity. As shown in Table 9, the proposed thermal–LiDAR fusion approach delivers high detection accuracy (97%), complete navigation success (100%), and real-time processing (0.5 s latency) with low complexity compared with other approaches.
Quantitative comparison of the proposed thermal–LiDAR fusion framework with state-of-the-art navigation systems under low-visibility conditions.
As observed in Table 9, existing thermal SLAM systems achieve strong localization accuracy but often require stereo thermal setups and GPU acceleration, and UAV-based thermal navigation shows a lower success rate in dense smoke. In contrast, the proposed framework achieves 97% detection accuracy and a 100% navigation success rate with reduced computational complexity and 0.5 s real-time latency.
Confusion matrix analysis of sensor systems through machine learning (ML) classification models
In this study, the RF and SVM models served only as a baseline to test how well the fused features distinguish obstacle from non-obstacle classes. These traditional machine learning models were used exclusively in the offline analysis stage and are not part of the real-time ROS-based framework. They were evaluated under smoke, fog, and darkness to identify their limitations relative to the deep learning-based detector. Their reduced accuracy under degraded visibility confirmed these limitations and motivated the choice of the modified YOLOv5 network as the primary perception model in the final embedded hardware system. This clarifies the role of the RF and SVM models and aligns the machine learning analysis with the overall methodology.
Random forest (RF) classifier
The RF algorithm 50 is an ensemble learning technique in which numerous decision trees are built during training; the predicted class is the one most commonly voted by the trees, which reduces overfitting and improves generalization. The overall RF classification process used in this research is shown in Figure 10. The input is the fused feature vector, comprising spatial features such as distance and contour shape and thermal features such as mean temperature and thermal gradient. The training data is then bootstrapped, that is, sampled with replacement. In the parallel tree construction step, a decision tree is grown on each bootstrap sample, and at each node a random subset of features is considered when selecting the best split, ensuring diversity among the trees. In the ensemble voting step, a new fused feature vector is passed through every tree in the forest, each tree votes on the class, for example "Obstacle" or "Non-Obstacle," and the final classification is obtained by majority voting.

Random forest classification.
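As a minimal illustration of this workflow, the sketch below trains a random forest on a fused feature vector of the kind described above; the feature layout, the synthetic data, and the hyperparameters are assumptions for illustration only, not the configuration used in this study.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic fused feature vectors (assumed layout):
# [distance_m, contour_aspect_ratio, mean_temperature, thermal_gradient]
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)   # 1 = "Obstacle", 0 = "Non-Obstacle"

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Bootstrapping and per-node random feature selection are handled internally.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, rf.predict(X_test)))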
Support vector machine (SVM) classifier
The SVM 51 is a powerful discriminative classifier whose goal is to find the optimal hyperplane separating the classes in a high-dimensional feature space, and it remains applicable when the data are not linearly separable. The workflow used in this research is depicted in Figure 11. The SVM finds the hyperplane that maximizes the margin between the two classes, "Obstacle" and "Non-Obstacle"; the data points closest to the hyperplane, called support vectors, determine this boundary. A new fused feature vector is projected into the same high-dimensional space, and its position relative to the optimized hyperplane yields the final classification. A key strength of the SVM is its ability to form complex nonlinear decision boundaries, which helps in distinguishing subtle patterns within the multi-modal sensor data. Although the final system is based on the YOLOv5 model, which operates directly on image-like data and also provides bounding boxes, the SVM proved useful during development by showing that high-quality classification is achievable at reduced computational cost, a desirable trade-off. To assess the classification performance of each detection system under the different environmental conditions, confusion matrices were constructed as displayed in Figure 12, based on TP (True Positive), TN (True Negative), FP (False Positive), and FN (False Negative) counts, with 1000 test frames per class for consistency of comparison.

Support vector machine (SVM) classification workflow.
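A minimal sketch of such an SVM baseline is given below; the RBF kernel, the synthetic data, and the parameter values are illustrative assumptions rather than the settings used in this work.

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Synthetic fused feature vectors (same illustrative layout as the RF sketch).
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 4))
y = (X[:, 0] ** 2 + X[:, 2] > 1.0).astype(int)   # nonlinearly separable labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

# Feature scaling plus an RBF-kernel SVM, which can form nonlinear boundaries.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
svm.fit(X_train, y_train)

print(classification_report(y_test, svm.predict(X_test),
                            target_names=["Non-Obstacle", "Obstacle"]))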
In complete darkness (Figure 12(a), Table 10), the fused sensor system performed best, with 97.0% accuracy and a 96.9% F1-score, exceeding the thermal camera's 92.0% and the LiDAR sensor's 95.0% accuracy. The fusion strategy yields a strong balance between precision (97.9%) and recall (96.0%) by markedly reducing both FP and FN. Sensor fusion thus improves reliability by 2.0%–5.0% across all key metrics, and the high recall confirms its benefit in reducing missed detections in low light.

Confusion matrices for obstacle detection under three degraded visual conditions: (a) complete darkness, (b) smoke-filled environment, and (c) foggy environment.
Complete darkness.
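To make the relationship between the confusion matrix counts and these metrics explicit, the sketch below computes accuracy, precision, recall, and F1-score from illustrative counts chosen to be consistent with the reported fused-system rates in darkness (1000 frames per class); the specific counts are an assumption for illustration, not values taken from the experiments.

# Illustrative counts (assumed) consistent with ~97.0% accuracy, ~97.9% precision,
# ~96.0% recall for the fused system (1000 obstacle + 1000 non-obstacle frames).
tp, fn = 960, 40      # obstacle frames: detected vs. missed
fp, tn = 21, 979      # non-obstacle frames: false alarms vs. correct rejections

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f}, precision={precision:.3f}, "
      f"recall={recall:.3f}, F1={f1:.3f}")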
In the smoky environment depicted in Figure 12(b) and summarized in Table 11, the fused system achieved 97.0% accuracy and a 96.9% F1-score, avoiding the degradation seen in the standalone thermal camera, whose accuracy dropped to 89.0% as false negatives increased due to signal scattering. The LiDAR sensor remained robust, with 95.0% accuracy. By combining the sensors, the fusion system compensated for the thermal sensor's errors and maintained a well-balanced precision of 97.9% and recall of 96.0%. This demonstrates the effectiveness of fusion in sustaining strong detection performance even in particulate-laden conditions.
Smoke-filled environment.
As depicted in Figure 12(c) and Table 12, the fused system again performed best under foggy conditions, compensating for the shortcomings of the individual sensors with 97.0% accuracy and a 96.9% F1-score. The thermal camera degraded most severely because its signal is readily absorbed by water vapor, reporting only 85.0% accuracy with 78.0% recall, whereas the LiDAR sensor achieved 95.0% accuracy. The fusion model compensated for the thermal camera's high false-negative rate, raising precision to 97.9%. This demonstrates environmental robustness, with data fusion reducing both false positives and false negatives under all of the adverse conditions.
Foggy environment.
Overall, the fused sensor system achieved approximately 98.0% precision, 96.0% recall, and a 97.0% F1-score, and its performance was consistent across all tested low-visibility scenarios when contrasted with the single-sensor configurations. Although LiDAR-only detection was extremely stable, it showed limitations in fine-grained identification of target objects, and the thermal-only sensor was highly vulnerable to obscurants such as fog and smoke. The fusion strategy presented in this work overcomes the weaknesses of the standalone sensors by combining the accurate spatial depth measurements of the LiDAR with thermal intensity, creating an effective environmental perception system that reduces both false negatives (FN) and false positives (FP). This confirms that sensor fusion is critical for reliable and robust autonomous navigation under unfavorable conditions. Figure 13 compares the thermal camera, LiDAR sensor, and fused system under complete darkness, smoke-filled, and foggy conditions: the fused system consistently achieves the highest detection accuracy of 97.0%, LiDAR maintains a stable 95.0%, and the thermal camera shows reduced accuracy in low-visibility conditions. This confirms that sensor fusion significantly improves detection reliability in challenging environments.

Comparative bar chart showing the detection accuracy of the thermal camera, LiDAR sensor, and fused system under different environmental conditions.
Performance comparison analysis
Table 13 compares the performance of the proposed Thermal-LiDAR fusion-based navigation system with the most advanced techniques available in 2025 for low-visibility conditions.
Comparative summary of recent real-time embedded hardware fusion systems evaluated across thermal, LiDAR, depth, and multi-modal sensing pipelines.
Compared with the recent embedded hardware fusion systems summarized in Table 13, the proposed thermal–LiDAR fusion framework demonstrates a uniquely strong balance of accuracy, robustness, and real-time autonomy on extremely low-power hardware. Dataset-oriented works such as NIGEL provide rich multi-sensor recordings but do not report real-time detection, obstacle response, or closed-loop navigation. Indoor disaster-response systems maintain mapping performance under fog but do not achieve autonomous navigation or dynamic-obstacle handling. Automotive deep-learning fusion frameworks achieve high perception accuracy but rely on GPU-accelerated hardware and do not operate on lightweight embedded platforms. Legged-robot fusion frameworks focus on proprioceptive control rather than exteroceptive sensing in obscurants. In contrast, the proposed system delivers 97.0% obstacle-detection accuracy, 100.0% navigation success, and a 0.5 s reaction latency in smoke, fog, and darkness, while running entirely on a Raspberry Pi 4 without external accelerators. This combination of real-time perception, robust performance in obscurants, and fully integrated autonomous navigation on a low-cost embedded hardware platform is not matched by any of the existing systems, positioning this work as a practical and efficient solution for visibility-degraded environments. While the experimental evaluation confirmed high detection accuracy and reliable navigation across smoke, fog, and darkness, it is equally important to recognize the boundaries of the current system. The next section discusses these limitations, distinguishing the challenges that have already been addressed from those that remain open and present opportunities for further development.
Limitations and future work
Although the proposed system has proven effective in tackling several critical issues in low-visibility navigation, some limitations remain:
Future work
Based on the current findings, the following directions may be considered to extend the framework:
In summary, the identified limitations delineate the current scope of the framework and the opportunities for improvement, while the proposed directions for future work aim to take the robustness, scalability, and flexibility of thermal and LiDAR fusion beyond the laboratory toward more reliable deployments of autonomous navigation.
Conclusion
This work proposed a ROS-based intelligent navigation framework that fuses thermal imaging and 2D LiDAR sensing for autonomous robotic navigation in low-visibility conditions such as darkness, smoke, and fog, and demonstrated its efficient and robust real-time performance on a low-power embedded platform (Raspberry Pi 4). The contributions of this study include the design of an embedded hardware algorithm that improves perception in low-visibility conditions, the development of an optimized ROS framework that handles data-intensive and time-critical operations, and the implementation of a lightweight embedded hardware architecture that improves the flexibility, robustness, and adaptability of the framework across different robotic platforms. Despite these advantages, challenges remain: thermal sensors are prone to long-term thermal drift affecting pixel-level stability; the computational limits of the Raspberry Pi 4 restrict the use of higher frame rates and deeper models; and, although the system has been validated indoors, it still requires large-scale outdoor validation with uncontrolled obscurants and terrain.
Footnotes
Ethical considerations
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Author contributions
Poundoss Chellamuthu, R. Krishnamoorthy: Conceptualization, Methodology, Software, Visualization, Investigation, Writing - Original draft preparation. T. Yuvaraj: Data curation, Validation, Supervision, Resources, Writing - Review & Editing. Mohit Bajaj, Ievgen Zaitsev: Project administration, Supervision, Resources, Writing - Review & Editing.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
All relevant data are within the manuscript. The collection and analysis methods complied with the terms and conditions for the source of the data. The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
