Visual–inertial fusion-based registration between real and synthetic images in airborne combined vision system

Abstract

A combined vision system is a perspective display concept that enhances the situation awareness of pilots during aircraft landing by integrating a real 2-D image captured by a forward-looking infrared camera with a synthetic 3-D image derived from the aircraft pose and a terrain database. However, inertial measurement errors significantly degrade the conformal display of combined vision. This article proposes a novel method for registering real and synthetic images based on visual–inertial fusion. It includes the following key steps: (1) detect and extract the real runway features from the forward-looking infrared image; (2) generate the synthetic runway features simultaneously; (3) set up a vision measurement model with the real and synthetic runway features; (4) integrate inertial data and visual observations in a square-root unscented Kalman filter; (5) create a synthetic 3-D scene from the filtered pose data and integrate it with the real 2-D image. The experimental results demonstrate that our method can guarantee the conformal display of the combined vision system in GPS-denied and low visibility conditions.

Keywords

Combined vision system, conformal display, runway detection, visual–inertial fusion, GPS-denied, low visibility

Introduction

Landing is the most accident-prone flight stage for fixed-wing aircraft since it requires the aircraft to descend and brake rapidly in a narrow airspace. Flight accidents in the landing phase account for more than 70% of the total flight period accidents worldwide, of which 30% of the fatal accidents are categorized as controlled flight into terrain (CFIT) due to the lack of outside visual reference and situation awareness.¹ With the recent rapid development of image processing and infrared sensing, these technologies have been applied to airborne cockpit electronic systems to improve flight safety, especially in GPS-denied and low visibility conditions. As a novel airborne landing aid, the combined vision system (CVS) can provide the crew with an equivalent visual operation ability through a perspective flight scene² during landing. It integrates the real 2-D image captured by a forward-looking infrared (FLIR) camera and the synthetic 3-D image derived from the aircraft pose and the terrain database,³ and the superimposed image is then displayed on a primary flight display to enhance the flight visibility of the crew. Notably, the real 2-D image must be conformal to the synthetic 3-D terrain in the combined vision in real time; otherwise, it can mislead the crew and cause a flight accident. However, it is difficult to directly align the FLIR image with the synthetic terrain because of some inevitable errors.

The real 2-D image and the synthetic 3-D image can be understood as two independent perspective images of the same physical scene, photographed at the same time by the same camera under the real pose and the measured pose, respectively. When the measured pose differs from the real pose, the deviation between them leads to obvious misalignment between the real image and the synthetic image. The main sources of deviation include sensor installation errors, image processing delay, terrain data errors, and inertial navigation system (INS) measurement errors.⁴ The deviations can be partially eliminated by relative pose calibration,^5–7 time synchronization,^8–10 and a high-precision terrain database. However, the measured pose errors caused by the random drifts of accelerometers and gyroscopes remain the critical issue to be solved.

In recent years, researchers at Honeywell adopted the integrated navigation of a laser INS and a local area augmentation system to support the alignment of real and synthetic images.¹¹ Horn at the German Aerospace Center (DLR) used a fuzzy logic-based method.¹² Researchers at Russia's State Research Institute of Aviation Systems (GosNIIAS) proposed a morphological registration method,^13,14 while researchers at India's National Aerospace Laboratory (CSIR-NAL) used differential GPS (DGPS) instruments to provide accurate navigation and positioning for the ESVS.¹⁵ However, these traditional methods rely heavily on high-precision INS and ground-based augmentation systems (GBAS); they can neither be applied to low-cost general aviation nor run robustly in GPS-denied conditions. Moreover, the visual cues contained in FLIR images are not fully exploited. Consequently, how to achieve accurate registration of real and synthetic images in CVS with low cost and high robustness has become an important topic.

An alternative way to obtain the aircraft landing pose is vision-aided navigation, which is characterized by low cost, autonomy, and high precision. In recent years, many vision-based motion estimation methods for fixed-wing aircraft landing have been proposed, which can be divided into two categories: the ground-based methods^16–19 and the airborne-based methods.^20–24 The former use cameras located on the ground to detect, track, and position the aircraft, which gives high positioning accuracy. However, these methods cannot estimate the attitude of the aircraft. The latter fully utilize the airborne sensors and on-board visual navigation algorithms to achieve autonomous navigation. Gui et al.²⁰ proposed an accurate vision navigation method for UAV landing based on artificial markers. Nevertheless, this method requires four infrared lamps to be placed on the runway in advance. Cai et al.²¹ adopted the square-root unscented Kalman filter (SR-UKF) to fuse the homography matrix and inertial measurements for UAV auto landing navigation. However, the precondition of this method, that the world coordinates of the point array are known, is very difficult to satisfy. Fan et al.²² employed a spectral residual saliency map to detect the region of interest (ROI), then used sparse coding and spatial pyramid matching to recognize the runway and orthogonal iteration to estimate position and attitude. Due to the low accuracy of runway detection, this method cannot achieve precise motion estimation of the aircraft. Ruchanurucks et al.²³ used the Efficient Perspective-n-Point solution to estimate the relative pose in an automatic landing-aided system for landing a fixed-wing UAV on the runway. Nevertheless, the accuracy of this method is susceptible to image detection errors. Patruno et al.²⁴ proposed a landing approach based on a human-made landmark for multi-rotor unmanned aerial vehicles, which is not suitable for approach and landing navigation of fixed-wing aircraft. Gibert et al. in Airbus²⁵ designed two nonlinear observers based on a high gain approach and sliding mode theory and applied them to a vision-based landing solution for civil aircraft on an unknown runway. However, the update rate of the aircraft states is too low to satisfy system requirements.

As the basis of visual landing navigation, the results of runway detection directly affect the accuracy of pose estimation. Wu et al.²⁶ adopted a fast line segment detector (LSD)²⁷ to detect line segments and used regional self-similarity and contextual information to recognize the runway. However, it is easily distracted by the edges of rivers, roads, and taxiways. Liu et al.^28,29 designed and improved a method based on multi-sensor fusion to realize real-time runway detection. Although it runs very fast, it cannot accurately extract runway features. Moore et al.³⁰ utilized an image matching pipeline to identify the runway by comparison with many stored runway images. Although the above methods have achieved remarkable progress in vision landing navigation, they cannot provide accurate aircraft pose parameters at a high update rate to support the registration of 2-D and 3-D images in low visibility conditions.

To achieve accurate registration between the real 2-D image and the synthetic 3-D image in CVS, a novel method based on a visual–inertial fusion framework is proposed in this article. Firstly, an existing runway detection method³¹ is improved to accurately detect and extract the three vertexes of the runway triangle contour (the front left corner, the front right corner, and the vanishing point of the runway) from FLIR images instead of the usual four corners or four edges. Simultaneously, synthetic runway features are derived from the runway geo-information and the aircraft's pose parameters. Secondly, we use the real and synthetic runway features to create vision cues and integrate them with inertial data in the SR-UKF³² to estimate motion errors. The measured motion states are then corrected with the estimated state errors. Thirdly, the synthetic 3-D scene is generated from the corrected pose data and integrated with the real 2-D FLIR image. In addition, the airworthiness requirement on CVS image conformality² is converted into the pixel deviation between real and synthetic images in the row and column directions so that it can be verified conveniently. Finally, we design a flight data acquisition platform mounted on a general aircraft, and real flight data show that the proposed method can guarantee the conformal display.

The article is organized as follows: the second section analyzes the landing phase by CVS, defines reference frames, and proposes the algorithm framework. The third section presents the visual–inertial fusion method in detail. The experimental results are shown in the fourth section. Finally, the conclusion and future work are drawn in the fifth section.

System overview

Landing by CVS

During landing, pilots follow the instrument flight rules until the aircraft reaches the decision height (DH = 60.96 m). If visual landing conditions are met at DH, the aircraft continues to land; otherwise it is pulled up at once. The pilot follows the visual flight rules from DH to the ground. With the help of CVS, the pilot can decide to land or pull up before descending to 30.48 m in low visibility conditions, which extends the decision time and increases the landing probability. As shown in Figure 1, while the aircraft descends from 152.4 to 30.48 m, CVS not only provides accurate registration of real and synthetic images but also achieves accurate pose estimation of the aircraft.

Figure 1.

Approach and landing by CVS. CVS: combined vision system.

Reference frames

In the proposed method, an FLIR camera and an INS are installed on the aircraft. As shown in Figure 2, all reference frames in this article obey the right-hand rule.

Figure 2.

Reference frames and runway model. INS: inertial navigation system.

{D} represents the geodetic reference frame; any point i in {D} is ${}^{D}P_{i}$ . {E} is the earth-centered earth-fixed (ECEF) reference frame; a point i in {E} is ${}^{E}P_{i} \in ℜ^{3}$ . {G} denotes the geographic reference frame; a point i in {G} is ${}^{G}P_{i} \in ℜ^{3}$ . {N} is the navigation reference frame, which follows the east-north-up (E-N-U) convention in this article; a point i in {N} is ${}^{N}P_{i} \in ℜ^{3}$ . {B} represents the body reference frame. Its origin is at the center of the INS; the X_B-axis points toward the right, the Y_B-axis points toward the front of the body, and the Z_B-axis points upward. A point i in {B} is denoted ${}^{B}P_{i} \in ℜ^{3}$ . {C} is the camera reference frame with the origin O_C at the camera optical center. The Z_c-axis points toward the front, the X_c-axis points in the column scan direction, and the Y_c-axis points in the row scan direction. A point i in {C} is denoted ${}^{C}P_{i} \in ℜ^{3}$ . {P} denotes the pixel reference frame with its origin O_p located at the upper left of the image plane. The u-axis points in the row scan direction, and the v-axis points in the column scan direction. A point i in {P} is denoted ${}^{P}P_{i} \in ℜ^{2}$ .

Algorithm framework

The foundation of the proposed image matching approach is the SR-UKF, which integrates inertial data with accurate vision observations and runway geographic information. All input images are captured and synchronized with the inertial data by a unified time stamp. In the SR-UKF, the process model is the set of INS error propagation equations, while the vision measurement model is built from the real and synthetic runway features. The 3-D synthetic scene is then derived from the filtered pose parameters and the airport terrain database, so that it can be strictly aligned with the real image. As shown in Figure 3, the part in the red box is the core of this algorithm framework.

Figure 3.

Algorithm framework for conformal display in CVS. CVS: combined vision system; INS: inertial navigation system; SR-UKF: square-root unscented Kalman filter.

Real and synthetic images registration

Process model

Firstly, the system state is defined as follows

X^{T} = [\begin{matrix} ψ^{T} & δ v^{T} & δ p^{T} & ε^{T} & \nabla^{T} \end{matrix}]

where $ψ^{n} \in ℜ^{3}$ , $δ v^{n} \in ℜ^{3}$ , and $δ p^{n} \in ℜ^{3}$ are the attitude, velocity, and position error of INS, respectively. $ε^{n} \in ℜ^{3}$ denotes the gyroscope drift, and $\nabla^{n} \in ℜ^{3}$ represents the accelerometer bias. Then, the continuous-time system process model is given

\dot{x} (t) = A (t) \cdot x (t) + w (t)

where $A = [\begin{matrix} 0_{3 \times 3} & I_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} \\ [f_{n} \times] & 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} & C_{b}^{n} \\ 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} & - C_{b}^{n} & 0_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} \end{matrix}]$ , $w = {[\begin{matrix} ε^{n} & \nabla^{n} & 0_{1 \times 3} & w_{g} & w_{a} \end{matrix}]}^{T}$

Considering the discrete-time, the model can be written as follows

x_{k} = Φ_{k / k - 1} x_{k - 1} + w_{k - 1}

where

Φ_{k / k - 1} = e^{\int_{t_{k - 1}}^{t_{k}} A (τ) d τ} \approx e^{A (t_{k - 1}) Δ t} \approx I + A (t_{k - 1}) Δ t
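The first-order discretization above can be illustrated with a minimal sketch that assembles the block matrix A as laid out in the text and forms I + A Δt. The function names, the identity attitude matrix, the specific-force vector, and the 100 Hz step are illustrative assumptions, not values taken from the flight experiments.

```python
def skew(f):
    """Return the 3x3 cross-product matrix [f x] of a 3-vector f."""
    fx, fy, fz = f
    return [[0.0, -fz, fy],
            [fz, 0.0, -fx],
            [-fy, fx, 0.0]]


def build_A(f_n, C_bn):
    """Assemble the 15x15 matrix A block by block, following the layout in the text."""
    A = [[0.0] * 15 for _ in range(15)]

    def put(block_row, block_col, M, sign=1.0):
        # Copy a 3x3 block M (optionally negated) into A.
        for i in range(3):
            for j in range(3):
                A[3 * block_row + i][3 * block_col + j] = sign * M[i][j]

    identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    put(0, 1, identity)          # block row 1: I in block column 2
    put(1, 0, skew(f_n))         # block row 2: [f_n x] in block column 1
    put(1, 4, C_bn)              # block row 2: C_b^n in block column 5
    put(2, 3, C_bn, sign=-1.0)   # block row 3: -C_b^n in block column 4
    return A


def transition_matrix(A, dt):
    """First-order approximation Phi = I + A * dt."""
    n = len(A)
    return [[(1.0 if i == j else 0.0) + A[i][j] * dt for j in range(n)]
            for i in range(n)]


# Hypothetical inputs: identity attitude matrix, 9.81 m/s^2 specific force
# on the third axis, and a 0.01 s step (100 Hz INS update).
C_bn = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Phi = transition_matrix(build_A([0.0, 0.0, 9.81], C_bn), dt=0.01)
```

The approximation is only reasonable when Δt is small relative to the dynamics of A, which is the usual situation at INS update rates.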

Measurement model

Synthetic runway features

One point ${}^{D}P_{f} \in ℜ^{3}$ on the airport runway can be projected onto the pixel plane as a point ${}^{P}P_{f} \in ℜ^{2}$ . This vision projection process involves five transformations as follows.

1. From {D} to {E}

The geodetic coordinate can be transformed into the ECEF coordinate, as follows

{}^{E}P_{i} = {[\begin{matrix} (R_{N} + h_{i}) \cdot cos L_{i} \cdot cos λ_{i}, & (R_{N} + h_{i}) \cdot cos L_{i} \cdot sin λ_{i}, & ((1 - e^{2}) \cdot R_{N} + h_{i}) \cdot sin L_{i} \end{matrix}]}^{T}

where $L_{i}$ , $λ_{i}$ , and $h_{i}$ represent the latitude, longitude, and altitude of point ${}^{D}P_{i}$ , respectively, R_N is the normal radius of curvature of the earth, and e denotes the eccentricity of the earth.
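As a hedged illustration of this transformation, the sketch below evaluates the formula with $L_{i}$ taken as latitude and $λ_{i}$ as longitude, which is what the structure of the formula implies. The WGS-84 constants and the expression used for R_N are assumptions, since the text does not name the ellipsoid, and the function name is hypothetical.

```python
import math

# WGS-84 constants (an assumption; the article does not state the ellipsoid).
WGS84_A = 6378137.0            # semi-major axis in metres
WGS84_E2 = 6.69437999014e-3    # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert geodetic latitude, longitude (degrees) and altitude (m) to ECEF (m)."""
    L = math.radians(lat_deg)
    lam = math.radians(lon_deg)
    # Normal (prime vertical) radius of curvature at latitude L.
    R_N = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(L) ** 2)
    x = (R_N + h) * math.cos(L) * math.cos(lam)
    y = (R_N + h) * math.cos(L) * math.sin(lam)
    z = ((1.0 - WGS84_E2) * R_N + h) * math.sin(L)
    return x, y, z
```

A quick sanity check is that a point on the equator at zero longitude and zero altitude maps to the semi-major axis on the X axis, while a point at the pole maps to the semi-minor axis on the Z axis.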

2. From {E} to {G}

Any known point ${}^{E}P_{i} \in ℜ^{3}$ in the ECEF can be projected into the geographic coordinate system with the INS center as its origin.

{}^{G}P_{i} = [\begin{matrix} - sin L_{a} \cdot cos λ_{a} & - sin L_{a} \cdot sin λ_{a} & cos L_{a} \\ - sin λ_{a} & cos λ_{a} & 0 \\ - cos L_{a} \cdot cos λ_{a} & - cos L_{a} \cdot sin λ_{a} & - sin L_{a} \end{matrix}] \cdot ({}^{E}P_{i} - {}^{E}P_{a})

where $[\begin{matrix} L_{a} & λ_{a} & h_{a} \end{matrix}]$ is the coordinate of aircraft in {D}. To simplify the calculation, the geographic coordinate system {G} is selected as the navigation coordinate system {N}.

3. From {N} to {B}

The navigation coordinate system {N} has the same origin as the body coordinate system {B}; the former is rotated into the latter through the yaw, pitch, and roll angles about the X_N-Y_N-Z_N axes in sequence, as follows

{}^{B}P_{i} = {(C_{b}^{n})}^{T} \cdot {}^{N}P_{i}

where $C_{b}^{n}$ denotes the attitude matrix.

4. From {B} to {C}

The rigid connection between aircraft body and camera contains a relative rotation $R_{B}^{C}$ and translation $T_{B}^{C}$ that has been accurately calibrated

{}^{C}P_{f} = R_{B}^{C} \cdot {}^{B}P_{f} + T_{B}^{C}

5. From {C} to {P}

According to the pinhole camera model,³³ the projection of any point in the camera coordinate system onto the pixel coordinate system is

{}^{P}P_{i} = \frac{1}{{}^{C}Z_{i}} [\begin{matrix} 1 / d x & s & u_{0} \\ 0 & 1 / d y & v_{0} \\ 0 & 0 & 1 \end{matrix}] \cdot [\begin{matrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{matrix}] \cdot {}^{C}P_{i}

where $(u_{0}, v_{0})$ is the principal point of the charge-coupled device (CCD) plane, s denotes the skew factor, $d x$ and $d y$ represent the pixel size along the u-axis and v-axis, and f is the focal length. The complete function composed of equations (4) to (8) can be written as follows

{}^{P}P_{i} = F ({}^{D}P_{i}, L_{a}, λ_{a}, h_{a}, ψ, θ, ϕ)

where ψ, θ, and ϕ denote the yaw, pitch, and roll of the aircraft, and L_a, $λ_{a}$ , and h_a are the latitude, longitude, and altitude of the aircraft. Therefore, in the synthetic runway, the four corners a, b, c, and d (a being the front left corner and d the front right corner) and the vanishing point (v) can be individually transformed to ${}^{P}{\hat{P}}_{a} = {[\begin{matrix} {\hat{a}}_{r} & {\hat{a}}_{c} \end{matrix}]}^{T}$ , ${}^{P}{\hat{P}}_{b} = {[\begin{matrix} {\hat{b}}_{r} & {\hat{b}}_{c} \end{matrix}]}^{T}$ , ${}^{P}{\hat{P}}_{c} = {[\begin{matrix} {\hat{c}}_{r} & {\hat{c}}_{c} \end{matrix}]}^{T}$ , ${}^{P}{\hat{P}}_{d} = {[\begin{matrix} {\hat{d}}_{r} & {\hat{d}}_{c} \end{matrix}]}^{T}$ , and ${}^{P}{\hat{P}}_{v} = {[\begin{matrix} {\hat{v}}_{r} & {\hat{v}}_{c} \end{matrix}]}^{T}$ , where the pixel coordinates of the vanishing point v are

{\hat{v}}_{r} = \frac{({\hat{c}}_{c} - {\hat{a}}_{c}) ({\hat{b}}_{r} - {\hat{a}}_{r}) ({\hat{d}}_{r} - {\hat{c}}_{r})}{({\hat{b}}_{c} - {\hat{a}}_{c}) ({\hat{d}}_{r} - {\hat{c}}_{r}) - ({\hat{d}}_{c} - {\hat{c}}_{c}) ({\hat{b}}_{r} - {\hat{a}}_{r})} + \frac{{\hat{a}}_{r} ({\hat{b}}_{c} - {\hat{a}}_{c}) ({\hat{d}}_{r} - {\hat{c}}_{r}) - {\hat{c}}_{r} ({\hat{d}}_{c} - {\hat{c}}_{c}) ({\hat{b}}_{r} - {\hat{a}}_{r})}{({\hat{b}}_{c} - {\hat{a}}_{c}) ({\hat{d}}_{r} - {\hat{c}}_{r}) - ({\hat{d}}_{c} - {\hat{c}}_{c}) ({\hat{b}}_{r} - {\hat{a}}_{r})}

{\hat{v}}_{c} = {\hat{a}}_{c} + (\frac{{\hat{b}}_{c} - {\hat{a}}_{c}}{{\hat{b}}_{r} - {\hat{a}}_{r}}) {\hat{v}}_{r} - {\hat{a}}_{r} (\frac{{\hat{b}}_{c} - {\hat{a}}_{c}}{{\hat{b}}_{r} - {\hat{a}}_{r}})
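The vanishing point is the intersection of the line through a and b with the line through c and d in (row, column) pixel coordinates. The following minimal sketch computes that intersection directly; the function name and the test coordinates are hypothetical, and the row expression is derived here by equating the two line equations rather than copied from the article.

```python
def vanishing_point(a, b, c, d):
    """Intersect line a-b with line c-d; each point is a (row, col) tuple."""
    a_r, a_c = a
    b_r, b_c = b
    c_r, c_c = c
    d_r, d_c = d
    # Common denominator; zero means the two edges are parallel in the image.
    den = (b_c - a_c) * (d_r - c_r) - (d_c - c_c) * (b_r - a_r)
    if den == 0:
        raise ValueError("runway edges are parallel in the image")
    if b_r == a_r:
        raise ValueError("edge a-b lies along a single row; column formula undefined")
    num = ((c_c - a_c) * (b_r - a_r) * (d_r - c_r)
           + a_r * (b_c - a_c) * (d_r - c_r)
           - c_r * (d_c - c_c) * (b_r - a_r))
    v_r = num / den
    # Column follows from the equation of line a-b evaluated at row v_r.
    v_c = a_c + (b_c - a_c) / (b_r - a_r) * (v_r - a_r)
    return v_r, v_c


# Hypothetical runway corners whose side edges meet at row 100, column 320.
v_r, v_c = vanishing_point((400, 120), (250, 220), (250, 420), (400, 520))
```

Every term in the numerator has the same units (row × row × column), which is a convenient check when transcribing the expression.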

To further analyze the error distribution of the projected pixel, the error transfer equations of the row pixel and the column pixel can be obtained from equation (9)

Δ r = \frac{\partial F_{r}}{\partial L_{a}} Δ L_{a} + \frac{\partial F_{r}}{\partial λ_{a}} Δ λ_{a} + \frac{\partial F_{r}}{\partial h_{a}} Δ h_{a} + \frac{\partial F_{r}}{\partial ψ} Δ ψ + \frac{\partial F_{r}}{\partial θ} Δ θ + \frac{\partial F_{r}}{\partial ϕ} Δ ϕ

Δ c = \frac{\partial F_{c}}{\partial L_{a}} Δ L_{a} + \frac{\partial F_{c}}{\partial λ_{a}} Δ λ_{a} + \frac{\partial F_{c}}{\partial h_{a}} Δ h_{a} + \frac{\partial F_{c}}{\partial ψ} Δ ψ + \frac{\partial F_{c}}{\partial θ} Δ θ + \frac{\partial F_{c}}{\partial ϕ} Δ ϕ

where $Δ L_{a}$ , $Δ λ_{a}$ , and $Δ h_{a}$ are position errors, and $Δ ϕ$ , $Δ θ$ , and $Δ ψ$ are attitude errors. $Δ r$ is the error of the pixel row, and $Δ c$ is the error of the pixel column. The statistical characteristics show that $Δ L_{a}$ , $Δ λ_{a}$ , $Δ h_{a}$ , $Δ ψ$ , $Δ θ$ , and $Δ ϕ$ each follow a Gaussian distribution, and the correlation coefficients among them are relatively small. Therefore, $Δ r$ and $Δ c$ are also approximately Gaussian, with zero mean and the standard deviations given below.

Δ r \sim N (0, \sqrt{\sum_{k = 1}^{6} {(α_{k} δ_{k})}^{2}}), Δ c \sim N (0, \sqrt{\sum_{k = 1}^{6} {(β_{k} δ_{k})}^{2}})

where $α_{k}$ and $β_{k}$ are the partial derivative terms of equations (12) and (13), respectively, and $δ_{k}$ is the pose measurement accuracy. Obviously, due to the uncertainty of the INS pose data, the pixel coordinate of any runway corner falls into a certain region ( $\bar{r} - Δ r \leq \hat{r} \leq \bar{r} + Δ r$ , $\bar{c} - Δ c \leq \hat{c} \leq \bar{c} + Δ c$ ) instead of a fixed point. For further explanation, Monte Carlo simulation is used to analyze the probability distribution of the runway vertex pixels. As shown in Figure 4, the statistical histograms of $Δ r$ and $Δ c$ represent their probability distributions. The following statistics are also obtained: $δ r_{a} = 6.1417$ , ${\bar{r}}_{a} = 276.5491$ , $P ({\bar{r}}_{a} - 3 δ r_{a} \leq r_{a} \leq {\bar{r}}_{a} + 3 δ r_{a}) = 0.9973$ , $δ c_{a} = 12.1404$ , ${\bar{c}}_{a} = 208.9976$ , $P ({\bar{c}}_{a} - 3 δ c_{a} \leq c_{a} \leq {\bar{c}}_{a} + 3 δ c_{a}) = 0.9973$ .

Figure 4.

Probability distribution of point A.
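The error propagation and the three-sigma region can be sketched as follows. This is an illustration only: the function names are hypothetical, the sensitivities passed to `pixel_sigma` would have to come from the partial derivatives of the projection function, and only the mean and standard deviation of the row coordinate of point A are taken from the statistics reported above.

```python
import math


def pixel_sigma(partials, pose_sigmas):
    """Root-sum-square of the per-parameter contributions alpha_k * delta_k."""
    return math.sqrt(sum((a * s) ** 2 for a, s in zip(partials, pose_sigmas)))


def prob_within(k_sigma):
    """Probability that a Gaussian variable lies within +/- k_sigma standard deviations."""
    return math.erf(k_sigma / math.sqrt(2.0))


def three_sigma_window(mean, sigma):
    """Interval [mean - 3*sigma, mean + 3*sigma] used as the search region."""
    return mean - 3.0 * sigma, mean + 3.0 * sigma


# Reported row statistics for point A: mean 276.5491, standard deviation 6.1417.
row_low, row_high = three_sigma_window(276.5491, 6.1417)
p3 = prob_within(3.0)
```

For a Gaussian variable the three-sigma probability is about 0.9973, which matches the value quoted for both the row and the column coordinate.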

Real runway features

This article improves an existing coarse-to-fine hierarchical method³¹ to accurately detect and extract runway features from the FLIR image. In the coarse layer, the You Only Look Once (YOLO) algorithm^34,35 is trained to detect the ROI of the runway. Owing to the special geometry of the runway, line segments give a high-level description of it. In the fine layer, the line segments are extracted from the ROI by the EDLines detector³⁶ and are classified into the corresponding line sets according to neighborhood and slope. Random points are then selected from these line segments according to their weights, and the random points in each line set are fitted into a complete runway line by RANSAC. Finally, the three vertexes of the runway triangle contour are calculated from the right edge, the left edge, and the front edge of the runway.

1. ROI detection

In this article, the YOLO algorithm,^34,35 a target detection system based on a single neural network, is used to detect the ROI of the airport runway. The detection process is to (1) divide an input image into a 7 × 7 grid, (2) predict two bounding boxes for each grid cell, and (3) remove the less probable windows among the 7 × 7 × 2 target windows predicted in the previous step and use non-maximum suppression to remove the redundant windows.
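The final filtering step can be illustrated with a minimal sketch of score thresholding followed by greedy non-maximum suppression. It is not the YOLO implementation itself; the box format (x1, y1, x2, y2), the function names, and both threshold values are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_windows(boxes, scores, score_thresh=0.2, iou_thresh=0.5):
    """Drop low-confidence windows, then apply greedy non-maximum suppression.

    Returns the indices of the windows that are kept, best score first.
    """
    order = sorted((i for i, s in enumerate(scores) if s >= score_thresh),
                   key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a window only if it does not overlap a better one too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in kept):
            kept.append(i)
    return kept


# Hypothetical candidate windows and confidences.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = filter_windows(boxes, scores)
```

With a 7 × 7 grid and two boxes per cell, the input to this step would contain 98 candidate windows per image.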

2. Line segments extraction

Line segments based on its special geometry can give a high-level description of airport runway. An ideal line segment detection algorithm can process any images regardless of its origin, orientation, or size and extract accurate line segments in real time without parameters tuning. Among existing algorithms, EDLines detector³⁶ and LSD²⁷ can satisfy the above-mentioned requirements. However, EDLines runs up to 11 times faster than LSD,²⁷ which makes it more suitable for real-time runway detection.

3. Runway line fitting

Due to low illumination, weak contrast, and blur in FLIR image, there are three major problems when the line segments’ detectors (e.g. Hough, LSD, and EDLines) are applied to FLIR images: (1) the detected runway edges are composed of small line segments with different orientations; (2) there are often gap-filling segments which lead to discontinuity of runway edges³⁷; (3) when the aircraft is still far from the runway, the projection region of runway onto the image plane is relatively small. It is very difficult to accurately detect and extract the back edge of the runway from the FLIR image. This article uses only three vertexes of the runway triangle as visual cues and proposes a fast method to classify and fit three sets of line segments into three runway edges, as shown in Figure 5.

Figure 5.

Runway line segments fitting.

Some symbols are defined to depict these line segments quantitatively, as follows

$T_{a b}$ , $T_{c d}$ , or $T_{a d}$ represents the neighborhood width of the ideal runway line $L_{a b}$ , $L_{c d}$ , or $L_{a d}$ . $T_{a b} = \max \{N_{a}, N_{b}\}$ , $T_{c d} = \max \{N_{c}, N_{d}\}$ , and $T_{a d} = \max \{N_{a}, N_{d}\}$ , where $N_{a} = 2 \sqrt{Δ r_{a}^{2} + Δ c_{a}^{2}}$ , $N_{b} = 2 \sqrt{Δ r_{b}^{2} + Δ c_{b}^{2}}$ , $N_{c} = 2 \sqrt{Δ r_{c}^{2} + Δ c_{c}^{2}}$ , and $N_{d} = 2 \sqrt{Δ r_{d}^{2} + Δ c_{d}^{2}}$ .

$θ (L_{i}, L_{a b})$ , $θ (L_{i}, L_{c d})$ , or $θ (L_{i}, L_{a d})$ is the angle between any line segment L_i and the ideal runway line $L_{a b}$ , $L_{c d}$ , or $L_{a d}$ .

If a line segment L_i falls into the neighborhood of the ideal runway line $L_{a b}$ and $- 2 ° \leq θ (L_{i}, L_{a b}) \leq 2 °$ , then the line segment L_i is kept in the line segment set of line $L_{a b}$ . In the same way, each detected line segment L_i is either kept in its corresponding set or discarded. The eligible line segments therefore fall into three sets (S_L, S_R, and S_F). Each line segment is given a weight equal to its line width multiplied by its line length, which determines the number of random points selected from it. The random points in each set are then fitted into a runway edge by RANSAC.
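The classification rule can be sketched as below. The article does not spell out how membership in the neighborhood is tested, so this sketch assumes that both endpoints of a segment must lie within half the neighborhood width of the ideal line; that assumption, the function names, and the sample coordinates are illustrative only. Points are (row, column) tuples.

```python
import math


def angle_between_deg(seg, line):
    """Unsigned angle in degrees between two undirected lines, in [0, 90]."""
    (r1, c1), (r2, c2) = seg
    (R1, C1), (R2, C2) = line
    a1 = math.atan2(c2 - c1, r2 - r1)
    a2 = math.atan2(C2 - C1, R2 - R1)
    diff = abs(a1 - a2) % math.pi          # direction of travel does not matter
    return math.degrees(min(diff, math.pi - diff))


def point_line_distance(p, line):
    """Perpendicular distance from point p to the infinite line through two points."""
    (R1, C1), (R2, C2) = line
    dr, dc = R2 - R1, C2 - C1
    return abs(dr * (p[1] - C1) - dc * (p[0] - R1)) / math.hypot(dr, dc)


def belongs_to_edge(seg, ideal_line, width, max_angle_deg=2.0):
    """True if seg lies inside the band around ideal_line and is nearly parallel to it."""
    # Assumption: the band extends width / 2 on either side of the ideal line.
    in_band = all(point_line_distance(p, ideal_line) <= width / 2.0 for p in seg)
    return in_band and angle_between_deg(seg, ideal_line) <= max_angle_deg


# Hypothetical ideal left edge and three candidate segments.
ideal = ((400, 120), (100, 320))
on_edge = belongs_to_edge(((340, 160), (310, 180)), ideal, width=10.0)
tilted = belongs_to_edge(((340, 160), (310, 200)), ideal, width=10.0)
far_away = belongs_to_edge(((340, 200), (310, 220)), ideal, width=10.0)
```

Segments that pass this test for one of the three ideal lines would then be sampled in proportion to their weights before the RANSAC fit.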

4. Vertexes calculation

The right edge, the left edge, and the front edge of the runway intersect each other, so three intersection points are the vertices of the runway triangle, which are ${}^{P}P_{a} = {[\begin{matrix} a_{r} & a_{c} \end{matrix}]}^{T}$ , ${}^{P}P_{d} = {[\begin{matrix} d_{r} & d_{c} \end{matrix}]}^{T}$ , and ${}^{P}P_{v} = {[\begin{matrix} v_{r} & v_{c} \end{matrix}]}^{T}$ .

Vision measurement

When $ψ^{n} \to 0$ and $δ p^{n} \to 0$ , ${}^{P}{\hat{P}}_{a} ≅ {}^{P}P_{a}$ , ${}^{P}{\hat{P}}_{d} ≅ {}^{P}P_{d}$ , and ${}^{P}{\hat{P}}_{v} ≅ {}^{P}P_{v}$ . Then, the vision measurement model is designed to be as follows

y_{k} = h (x_{k}) + n_{k}

where $n_{\hat{a} r} \in ℜ$ , $n_{\hat{a} c} \in ℜ$ , $n_{\hat{d} r} \in ℜ$ , $n_{\hat{d} c} \in ℜ$ , $n_{\hat{v} r} \in ℜ$ , and $n_{\hat{v} c} \in ℜ$ are assumed to be zero-mean Gaussian white noise and $y_{k} = {[\begin{matrix} a_{r} & a_{c} & d_{r} & d_{c} & v_{r} & v_{c} \end{matrix}]}^{T}$ , $h (x_{k}) = {[\begin{matrix} {\hat{a}}_{r} & {\hat{a}}_{c} & {\hat{d}}_{r} & {\hat{d}}_{c} & {\hat{v}}_{r} & {\hat{v}}_{c} \end{matrix}]}^{T}$ , and $n_{k} = {[\begin{matrix} n_{\hat{a} r} & n_{\hat{a} c} & n_{\hat{d} r} & n_{\hat{d} c} & n_{\hat{v} r} & n_{\hat{v} c} \end{matrix}]}^{T}$ .

Vision–inertial fusion

The UKF adopts a deterministic sampling technique to estimate the state and covariance of nonlinear models directly. Compared with the EKF, the UKF predicts the state of a nonlinear system more accurately and does not require the Jacobian and Hessian matrices of the process and measurement models. Since the UKF needs to calculate the square root of the state covariance matrix to generate the sigma points at each time update, the state covariance matrix may lose positive definiteness. The SR-UKF, in contrast, propagates the Cholesky factor of the error covariance matrix directly, which requires less numerical computation and gives better accuracy.³² The visual–inertial fusion based on the SR-UKF is presented as follows

1. Initialization

{\hat{x}}_{0} = E (x_{0}) \approx \frac{1}{n} \sum_{i = 1}^{n} x_{i,0}

P_{x,0} \approx \frac{1}{n} \sum_{i = 1}^{n} (x_{i,0} - {\hat{x}}_{0}) {(x_{i,0} - {\hat{x}}_{0})}^{T}

Calculate the matrix square root of the initial state covariance

S_{0} = cholesky (P_{x,0})

2. Time update

Sigma points calculation

χ_{k - 1} = [\begin{matrix} {\hat{x}}_{k - 1} & {\hat{x}}_{k - 1} + γ S_{k - 1} & {\hat{x}}_{k - 1} - γ S_{k - 1} \end{matrix}]

One-step state prediction

Φ_{k / k - 1} = I + A_{k - 1} Δ t

χ_{k / k - 1} = Φ_{k / k - 1} χ_{k - 1}

{\hat{x}}_{k}^{-} = \sum_{i =0}^{2 n} W_{i}^{(m)} χ_{i, k / k - 1}

Square root of one-step state prediction

$S_{k}^{-} = qr {[\begin{matrix} \sqrt{W_{1}^{(c)}} (χ_{1 : 2 n, k / k - 1} - {\hat{x}}_{k}^{-}) & \sqrt{R^{v}} \end{matrix}]}$ // $R^{v}$ is the system noise covariance matrix

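$S_{k}^{-} = cholesky (\begin{matrix} S_{k}^{-}, & χ_{0, k / k - 1} - {\hat{x}}_{k}^{-}, & W_{0}^{(c)} \end{matrix})$ // The three-argument function cholesky () denotes a rank-one Cholesky update (a downdate when the weight is negative) of the factor given as its first argument.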

Vision measurement prediction

$y_{k / k - 1} = h (χ_{k / k - 1})$ // Synthetic image projection

${\hat{y}}_{k}^{-} = \sum_{i = 0}^{2 n} W_{i}^{(m)} y_{i, k / k - 1}$ // Estimation of synthetic runway features

3. Vision measurement update

$S_{{\hat{y}}_{k}} = qr {[\begin{matrix} \sqrt{W_{1}^{(c)}} (y_{1 : 2 n, k / k - 1} - {\hat{y}}_{k}^{-}) & \sqrt{R_{k}^{n}} \end{matrix}]}$ // $R_{k}^{n}$ denotes the measurement noise covariance matrix; the function qr () denotes a QR matrix decomposition and returns an upper triangular matrix.

S_{{\hat{y}}_{k}} = cholesky (\begin{matrix} S_{{\hat{y}}_{k}}, & y_{0, k / k - 1} - {\hat{y}}_{k}^{-}, & W_{0}^{(c)} \end{matrix})

P_{x_{k} y_{k}} = \sum_{i = 0}^{2 n} W_{i}^{(c)} (χ_{i, k / k - 1} - {\hat{x}}_{k}^{-}) {(y_{i, k / k - 1} - {\hat{y}}_{k}^{-})}^{T}

K_{k} = (P_{x_{k} y_{k}} / S_{{\hat{y}}_{k}}^{T}) / S_{{\hat{y}}_{k}}, U = K_{k} S_{{\hat{y}}_{k}}

State and covariance update:

${\hat{x}}_{k} = {\hat{x}}_{k}^{-} + K_{k} (y_{k} - {\hat{y}}_{k}^{-})$ // Estimated state

$S_{k} = cholesky (\begin{matrix} S_{k}^{-}, & U, & - 1 \end{matrix})$ // Square root of the estimated state covariance

In the proposed method, the scalar weights are defined as $γ = \sqrt{n + λ}$ and $W_{i}^{(m)} = W_{i}^{(c)} = 1 / [2 (n + λ)]$ for $i = 1, 2, \dots, 2 n$ . For $i = 0$ , $W_{0}^{(m)} = λ / γ^{2}$ and $W_{0}^{(c)} = λ / γ^{2} + (1 - α^{2} + β)$ , where $λ = α^{2} (n + k) - n$ with the parameters set to $α = 0.1$ , $β = 2$ , and $k = 0$ .
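A minimal sketch of the weight computation follows. It uses the standard scaled unscented transform form of the zeroth covariance weight, λ / (n + λ) + (1 − α² + β), and assumes n = 15 for the five three-dimensional error states; the function name and the use of `kappa` for the parameter written k in the text are illustrative choices.

```python
import math


def ukf_weights(n, alpha=0.1, beta=2.0, kappa=0.0):
    """Return (gamma, mean weights, covariance weights) for 2n + 1 sigma points."""
    lam = alpha ** 2 * (n + kappa) - n
    gamma = math.sqrt(n + lam)
    # Index 0 is the central sigma point; indices 1..2n share one weight.
    w_m = [lam / (n + lam)] + [1.0 / (2.0 * (n + lam))] * (2 * n)
    w_c = list(w_m)
    w_c[0] = lam / (n + lam) + (1.0 - alpha ** 2 + beta)
    return gamma, w_m, w_c


# Assumed state dimension: attitude, velocity, and position errors plus
# gyroscope drift and accelerometer bias, three components each.
gamma, w_m, w_c = ukf_weights(n=15)
```

With a small α the central weights are large and negative, so the covariance factor is adjusted with a rank-one Cholesky downdate for the zeroth sigma point. The mean weights still sum to one.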

Images registration

To ensure flight safety, CVS must provide a conformal display between real and synthetic images; otherwise, the mismatched image can mislead or confuse the pilot. The CVS image should not produce a display error greater than 5 mrad at the center of the display at a range of 609.6 m (30.48 m altitude on a 3° glideslope).² For verification, this ergonomics requirement is converted into the pixel deviation of key elements between real and synthetic images in the horizontal and vertical directions, respectively.

Assume that the distance from the design eye position to the center of the head down display (HDD) screen is d and that the pixel sizes of the HDD screen in the horizontal and vertical directions are s_h and s_v, respectively. Then the horizontal pixel deviation of image alignment should not exceed $\pm 0.0025 \times d / s_{h}$ pixels, and the vertical pixel deviation should not exceed $\pm 0.0025 \times d / s_{v}$ pixels. For CVS in particular, with a reasonable distance $d = 600 mm$ and a typical pixel size $s_{h} = s_{v} = 0.219 mm$ , the horizontal alignment deviation should not exceed ±7 pixels, and the vertical alignment deviation should not exceed ±7 pixels. In this article, the validity of the proposed algorithm is verified by real flight data in the “Images registration” section.
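The conversion from the angular requirement to a pixel tolerance is simple arithmetic, sketched here with the viewing distance and pixel size quoted above. The function name is hypothetical, and the half angle of 0.0025 rad corresponds to splitting the 5 mrad error symmetrically about the display center.

```python
def pixel_tolerance(d_mm, pixel_mm, half_angle_rad=0.0025):
    """Allowed one-sided pixel deviation for a small angular error at distance d."""
    # Small-angle approximation: offset on the screen is angle * distance.
    return half_angle_rad * d_mm / pixel_mm


tol = pixel_tolerance(600.0, 0.219)
```

The result is about 6.85 pixels, which rounds to the ±7 pixel bound used for both directions.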

Experiments

Vehicle and sensor setup

The flight data acquisition platform is a Y-12F general aircraft equipped with a vision sensors suite (including a FLIR camera), an INS (Applanix AV510), a flight data recorder (FDR, AMPEX miniR 700), an air data computer (ADC, CAIC XSC-6E), and a radio altimeter (RALT, Honeywell KRA405b). A DGPS base station (Trimble R5) is used to provide the ground truth by inertial-DGPS integration. The FDR is in charge of recording the real-time measurements of the INS, GPS, ADC, and RALT. The complete structure of the flight data acquisition system is shown in Figure 6.

Figure 6.

The structure of the flight data acquisition system. FLIR: forward-looking infrared.

The vision sensors suite is mounted on the front of the aircraft radome and consists of a short-wave infrared (SWIR) camera (NIP PHK03M100CSW0, NIP Co., Ltd., PAMINA), a visible light camera, a mounting bracket, and a metal shell, as shown in Figure 7. This article only discusses FLIR images. The INS and FDR are installed on the cabin deck of the aircraft. The flight data were collected at a general aviation airport (Pu Cheng, China), and the terrain data of the airport and its surroundings have been surveyed. There are five data sets of takeoffs and landings covering foggy, hazy, cloudy, and sunny weather conditions. Each data set consists of FLIR video (frame rate 24 Hz, resolution 640 × 512), INS data (update rate 100 Hz), GPS data (update rate 20 Hz), ADC data (update rate 16 Hz), and radar altimeter data (update rate 16 Hz), all labeled with time stamps to synchronize the measurements. Moreover, the relative position $T_{B}^{C}$ and orientation $R_{B}^{C}$ between the FLIR camera and the INS have been calibrated with a total station instrument.

Figure 7.

Vision sensors installation and aircraft landing.

Except for the training of the YOLO neural network for runway detection, which is carried out on a workstation, all experiments are implemented on a Jetson TX2 board. As shown in Figure 8, the experiment platform is an NVIDIA Jetson TX2 embedded computer board with 6 ARM CPU cores, 256 Pascal GPU cores, and 8 GB memory. A complete landing process under fog conditions is used to verify the proposed algorithm. The aircraft descended from 152.4 m to 14.32 m, passing through three typical altitudes of 60.96 m, 30.48 m, and 18.29 m, over a flight time of 59.45 s.

Figure 8.

The structure of CVS platform prototype. CVS: combined vision system.

Runway detection

The runway detection algorithm includes two parts: runway recognition and feature extraction. In the first part, tiny YOLO is used to recognize the runway target in the FLIR video. YOLO is trained on a workstation with an Intel Core i7-6700 3.40 GHz CPU, 8 GB RAM, an NVIDIA 1070 graphics card, and Ubuntu 14.04. In this article, 1000 images are selected from the FLIR video, of which 800 are used as the training set and 200 as the test set. The training process runs for 80,200 iterations and takes 36 h. The trained YOLO then runs on the TX2 embedded computer board, reads the FLIR video, and recognizes the runway ROI. In the second part, the EDLines detector is used to extract line segments from the ROI, and these line segments are then fitted into runway edges.

As shown in Figure 9, the three rows of images, from top to bottom, are captured at 60.96 m, 30.48 m, and 18.29 m, respectively. In the left column, the red rectangles denote the estimated ROI of the runway images. In the middle column, the blue trapezoids represent the synthetic runway edges, which are used to estimate the neighborhood of each runway edge. Moreover, some short line segments (red) are extracted from the ROI by the EDLines detector. In the right column, the green lines show the three edges of the runway.

Figure 9.

Runway detection.
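The article does not specify how the short line segments are fitted into a runway edge. As one possible sketch, assuming the segments that belong to a single edge have already been selected, a total least squares line can be fitted through the segment endpoints with an SVD. The function `fit_edge` and the sample segments are hypothetical.

```python
import numpy as np


def fit_edge(segments):
    """Fit a*x + b*y + c = 0 (with a^2 + b^2 = 1) to the endpoints of the segments.

    Each segment is given as (x1, y1, x2, y2) in pixel coordinates.
    """
    pts = np.asarray(segments, dtype=float).reshape(-1, 2)
    centroid = pts.mean(axis=0)
    # The first right singular vector is the dominant direction of the points.
    _, _, vt = np.linalg.svd(pts - centroid)
    direction = vt[0]
    normal = np.array([-direction[1], direction[0]])
    c = -float(normal @ centroid)
    return float(normal[0]), float(normal[1]), c


# Two made-up collinear segments lying on y = 2x + 1.
segments = [(0, 1, 1, 3), (2, 5, 3, 7)]
a, b, c = fit_edge(segments)
print(a * 10 + b * 21 + c)  # residual of another point on the same line
```

Total least squares minimizes perpendicular distances, so it behaves the same for near-vertical and near-horizontal edges, unlike an ordinary regression of y on x.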

The runway detection statistics listed in Table 1 show that the ROI covers less than 25% of the CCD pixels in all three scenarios. Because the detection area is narrowed in this way, the proposed method is more efficient than other methods.³⁸

Table 1.

Experimental result of the runway features under different scenarios.

Scenarios	Flight height (m)	ROI (pixels)	ROI/CCD ratio	Lines
1	60.96	49 × 77	0.0115	16
2	30.48	106 × 214	0.0692	58
3	18.29	164 × 488	0.2442	173

ROI: region of interest.
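The ROI/CCD ratios in Table 1 follow directly from the ROI sizes and the 640 × 512 FLIR resolution given in the sensor setup. A short check, with hypothetical variable names:

```python
CCD_WIDTH, CCD_HEIGHT = 640, 512  # FLIR resolution stated in the sensor setup

# ROI sizes from Table 1, keyed by flight height in metres.
rois = {60.96: (49, 77), 30.48: (106, 214), 18.29: (164, 488)}


def roi_ratio(side_a: int, side_b: int) -> float:
    """Fraction of the CCD pixels covered by an ROI of the given size."""
    return (side_a * side_b) / (CCD_WIDTH * CCD_HEIGHT)


ratios = {height: round(roi_ratio(*size), 4) for height, size in rois.items()}
print(ratios)
```

The ratio grows as the aircraft descends because the runway occupies more of the image, but it stays below 0.25 even at 18.29 m.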

Motion estimation

Among the motion parameters, the pitch and altitude of the aircraft have the greatest influence on the 2-D–3-D image registration of CVS.³⁹ Therefore, eliminating the random errors of these parameters can effectively guarantee conformal display. In our experiments, the result of INS/DGPS integration is selected as the ground truth, and the proposed algorithm is compared with pure INS mode, INS/GPS mode,⁴⁰ and INS/GPS/ADC/RALT mode.⁴¹ As shown in Figure 10, the pose error of pure INS is the largest, while the pose error estimated by our method is the smallest. In particular, the pitch and altitude estimated by the proposed algorithm are the closest to the ground truth.

Figure 10.

Pose errors estimation. (a) Euler angles and (b) position.

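The root mean squared errors (RMSEs) of the motion parameters for each mode are listed in Table 2. The proposed INS/FLIR mode has the smallest RMSE for all six parameters; for example, its pitch RMSE is 0.0196° versus 0.3396° for INS/GPS, and its altitude (Xu) RMSE is 0.0580 m versus 3.2296 m.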

Table 2.

The RMS errors of motion estimation.

Mode/Parameters	Pitch (°)	Roll (°)	Yaw (°)	Xn (m)	Xe (m)	Xu (m)
INS	0.5166	0.0739	0.0852	7.2992	16.7021	7.1074
INS/GPS	0.3396	0.0278	0.0415	3.8342	2.3345	3.2296
INS/GPS/ADC/RALT	0.4142	0.0357	0.0739	6.4901	11.0763	3.386
INS/FLIR	0.0196	0.0128	0.0242	0.9788	0.3116	0.0580

INS: inertial navigation system; ADC: air data computer; RALT: radio altimeter; FLIR: forward-looking infrared; RMS: root mean square.
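For reference, the RMSE of an estimated parameter against the ground truth can be computed as below. The sample values are made up for illustration and are not flight data; only the two pitch RMSEs at the end are taken from Table 2.

```python
import numpy as np


def rmse(estimate, truth) -> float:
    """Root mean squared error between an estimated series and the ground truth."""
    error = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(error ** 2)))


# Made-up pitch samples in degrees.
truth = [1.0, 2.0, 3.0]
estimate = [1.1, 1.9, 3.2]
print(rmse(estimate, truth))

# Ratio of the INS/GPS and INS/FLIR pitch RMSEs listed in Table 2.
pitch_ratio = 0.3396 / 0.0196
print(round(pitch_ratio, 1))
```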

Images registration

To verify the proposed algorithm, the synthetic runway features derived from the filtered pose parameters are compared with the real runway features extracted from the FLIR image, and the registration is then judged against the ±7 pixel deviation tolerance derived earlier. The triangle formed by point A, point D, and point V lies in the center of the combined vision and includes the key elements of the runway. Therefore, the pixel deviations of these three points can fully reflect the registration accuracy of the real and synthetic images. As shown in Figure 11, the row and column pixel errors of the front left corner A, the front right corner D, and the vanishing point V are no greater than 7 pixels, which meets the requirement of real and synthetic image registration.

Figure 11.

Pixel errors of real and synthetic image registration. (a) row/column pixel bias of point A; (b) row/column pixel bias of point D; and (c) row/column pixel bias of point V.
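The acceptance test described above amounts to checking that every row and column deviation of points A, D, and V stays within the ±7 pixel tolerance. A minimal sketch, with a hypothetical function name and made-up pixel coordinates:

```python
import numpy as np

TOLERANCE_PX = 7.0  # tolerance stated in the article


def registration_ok(real_pts, synth_pts, tol=TOLERANCE_PX):
    """Return the (row, col) deviations and whether all of them are within tol."""
    deviation = np.asarray(synth_pts, dtype=float) - np.asarray(real_pts, dtype=float)
    return deviation, bool(np.all(np.abs(deviation) <= tol))


# Made-up (row, col) coordinates for points A, D, and V.
real = np.array([[402.0, 251.0], [401.0, 389.0], [298.0, 320.0]])
synth = np.array([[400.5, 250.2], [400.1, 389.6], [297.8, 320.4]])

deviation, ok = registration_ok(real, synth)
print(deviation, ok)
```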

In addition, the RMSEs of the pixel-level registration of points A, D, and V are listed in Table 3. The row errors of INS/GPS/ADC/RALT mode grow from about 9 pixels to more than 100 pixels as the aircraft descends, and the column errors of INS/GPS mode grow from about 11 pixels to 25–31 pixels. The registration errors of the proposed algorithm remain below 1 pixel in both directions at all heights, and its column errors are even smaller than those of INS/DGPS.

Table 3.

RMS pixel registration errors of points A, D, and V.

Height	Method	$Δ r_{a}$	$Δ c_{a}$	$Δ r_{d}$	$Δ c_{d}$	$Δ r_{v}$	$Δ c_{v}$
152.4–60.96 m	A	0.9470	11.227	0.9555	11.2857	0.9239	11.2867
	B	0.1894	2.2455	0.1911	2.2571	0.1848	2.2573
	C	8.5882	8.0073	8.8217	8.0548	8.9005	8.0432
	D	0.4623	0.3984	0.4616	0.3983	0.4626	0.3991
60.96–30.48 m	A	0.4753	13.903	0.6178	14.333	0.3616	14.3497
	B	0.1951	2.1807	0.1236	2.8668	0.1723	2.8699
	C	22.507	13.159	24.251	13.540	24.329	13.5286
	D	0.2803	0.2418	0.2975	0.2412	0.2781	0.2460
30.48–18.29 m	A	0.1950	19.4435	1.1903	21.533	1.1913	21.697
	B	0.1390	3.1887	0.2381	4.3066	0.2383	4.3395
	C	56.1839	16.995	66.325	18.508	67.432	18.4861
	D	0.3629	0.2096	0.5082	0.4955	0.4170	0.4151
18.29–14.32 m	A	0.2229	25.3636	2.9665	30.9877	3.5960	31.092
	B	0.1446	5.2727	0.5933	6.1975	0.7192	6.2184
	C	100.35	24.827	129.49	29.782	133.44	29.665
	D	0.3062	0.3689	0.7954	0.5173	0.9023	0.4122

A: INS/GPS; B: INS/DGPS; C: INS/GPS/ADC/RALT; D: INS/FLIR; RMS: root mean square; INS: inertial navigation system; ADC: air data computer; RALT: radio altimeter.

As shown in Figure 12, a prototype of CVS based on an embedded computer board (NVIDIA Jetson TX2) has been realized. The synthetic 3-D scene is generated by an open source 3-D engine (Open Scene Graph, OSG) from the terrain database and the corrected aircraft pose parameters. The flight symbols are designed with electronic instruments development software (ANSYS SCADE Display) and are superimposed on the synthetic scene. The accurate registration between the 3-D synthetic scene and the 2-D FLIR image is achieved by the proposed algorithm. When the application runs on the NVIDIA Jetson TX2, the frame rate of CVS is 40–44 Hz. The processing time of each frame is 22.7–25 ms, of which runway detection takes 2 ms on average, data processing takes 14–16 ms, and image rendering takes 6–7 ms.

Figure 12.

The prototype of CVS based on Jetson TX2. CVS: combined vision system.
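The reported frame rate and per-frame processing time are two views of the same budget. The following sketch, using only the figures quoted above and hypothetical names, converts between them and sums the stage times:

```python
def frame_rate_hz(frame_time_ms: float) -> float:
    """Frame rate that corresponds to a given per-frame processing time."""
    return 1000.0 / frame_time_ms


# (min, max) stage times in milliseconds, as quoted in the text.
stages_ms = {
    "runway detection": (2.0, 2.0),
    "data processing": (14.0, 16.0),
    "image rendering": (6.0, 7.0),
}

fastest_ms = sum(low for low, _ in stages_ms.values())
slowest_ms = sum(high for _, high in stages_ms.values())

print(fastest_ms, slowest_ms)
print(frame_rate_hz(25.0), frame_rate_hz(22.7))
```

The stage times add up to 22–25 ms, which agrees with the stated 22.7–25 ms per frame to within rounding, and 25 ms and 22.7 ms correspond to about 40 Hz and 44 Hz, respectively.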

Conclusion and future work

In this article, a novel visual–inertial fusion-based registration between real and synthetic images in CVS is proposed. To improve the robustness of CVS, the vision observations are built from the three vertices of the runway contour extracted from the FLIR images, and the propagation model is formed by the inertial error transfer equation. Through SR-UKF fusion of visual information and inertial data, the errors of the inertial measurements can be estimated precisely and the accuracy of pose estimation can be improved. Finally, the proposed method is verified by real flight data, which show that accurate registration between the 3-D synthetic scene and the 2-D FLIR image can be achieved. In future work, an open-source camera/INS calibration toolbox, ETH Kalibr,⁴²–⁴⁴ could be used instead of an electronic total station for calibration.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by AVIC Technology Innovation Foundation of China (No. 2014D63130R) and Aviation Science Foundation of China (No. 2014ZC31004 and No. 2017ZC31008).

ORCID iD

Lei Zhang

Zhengjun Zhai

References

1. AIRBUS. A statistical analysis of commercial aviation accidents 1958–2017. Report, Blagnac Cedex, France, May 2018.

2. RTCA DO-315B: 2013. Minimum aviation system performance standard (MASPS) for enhanced vision systems, synthetic vision systems, combined vision systems and enhanced flight vision systems.

3. Shelton, Kramer, Ellis. Synthetic and enhanced vision systems for Nextgen (SEVS) simulation and flight test performance evaluation. In: 2012 IEEE/AIAA 31st digital avionics systems conference (DASC), Williamsburg, VA, USA, 14–18 October 2012, pp. 2D5-1–2D5-12. IEEE.

4. Jennings, Craig, Link. Sources of error in a helmet-mounted enhanced and synthetic vision system. In: Proceedings of SPIE 4711 conference on helmet and head-mounted displays, Orlando, FL, 5 August 2012, pp. 316–327. SPIE.

5. Kermen, Aydin, Ercan. A multi-sensor integrated head-mounted display setup for augmented reality application. In: 3DTV-Conference: the true vision – capture, transmission and display of 3D video, Lisbon, Portugal, 8–10 July 2015.

6. Pancholi, Dimitrov, Schmitz. Relative translation and rotation calibration between optical target and inertial measurement unit. In: 2016 international conference on sensor systems and software, Sophia Antipolis, France, 20 July 2016, pp. 175–186. Switzerland: Springer.

7. Teixeira, Maffra, Moos. VI-RPE: Visual-inertial relative pose estimation for aerial vehicles. IEEE Robot Autom Lett 2017; 3(4): 2770–2777.

8. Huck, Westenberger, Fritzsche. Precise timestamping and temporal synchronization in multi-sensor fusion. In: IEEE intelligent vehicles symposium (IV), Baden-Baden, Germany, 5–9 June 2011.

9. Lynen, Achtelik, Weiss. A robust and modular multi-sensor fusion approach applied to MAV navigation. In: IEEE/RSJ international conference on intelligent robots and systems (IROS), Tokyo, Japan, 3–7 November 2013, pp. 3923–3929. IEEE.

10. Rambach, Pagani, Lampe. Fusion of unsynchronized optical tracker and inertial sensor in EKF framework for in-car augmented reality delay reduction. In: IEEE 16th international symposium on mixed and augmented reality (ISMAR), Nantes, France, 9–13 October 2017, pp. 109–114. IEEE.

11. Ververs, Suddreth. Design and flight test of a primary flight display combined vision system. SAE Int J Aerospace 2011; 4: 738–750.

12. Bernd, Sven, Bernd. Combining enhanced and synthetic vision for autonomous all-weather approach and landing. Int J Aviat Psychol 2009; 19(1): 49–75.

13. Vygolov, Zheltov. Enhanced, synthetic and combined vision technologies for civil aviation. In: Favorskaya, Jain (eds) Computer Vision in Control Systems-2. Intelligent Systems Reference Library, 75, 2015, pp. 201–230. Switzerland: Springer, Cham.

14. Lebedev, Stepaniants, Komarov. A real-time photogrammetric algorithm for sensor and synthetic image fusion with application to aviation combined vision. In: ISPRS Technical Commission III Symposium, Zurich, Switzerland, 5–7 September 2014, pp. 171–175. ISPRS.

15. Satish Kumar, Kashyap, Naidu VPS. Integrated enhanced and synthetic vision system for transport aircraft. Defence Sci J 2013; 63(2): 157–163.

16. Tang, Shen. Ground stereo vision-based navigation for autonomous take-off and landing of UAVs: a chan-vese model approach. Int J Adv Robot Syst 2016; 13: 67–80.

17. Shen. Stereo vision guiding for the autonomous landing of fixed-wing UAVs: a saliency-inspired approach. Int J Adv Robot Syst 2016; 13: 43–55.

18. Kong, Zhang. Localization framework for real-time UAV autonomous landing: an on-ground deployed visual approach. Sensors 2017; 17(6): 1437–1453.

19. Yang. A ground-based near infrared camera array system for UAV auto-landing in GPS-denied environment. Sensors 2016; 16(9): 1393–1412.

20. Gui, Guo, Zhang. Airborne vision-based navigation method for UAV accuracy landing using infrared lamps. J Intell Robot Syst 2013; 72(2): 197–218.

21. Cai, Sun. Vision aided INS for UAV auto landing navigation using SR-UKF based on two-view homography. In: Proceedings of 2014 IEEE Chinese Guidance, Navigation and Control Conference (CGNCC), Yantai, China, 8–10 August 2014, pp. 518–522. IEEE.

22. Fan, Ding, Cao. Vision algorithms for fixed-wing unmanned aerial vehicle landing system. Sci China Technol Sci 2017; 60(3): 434–443.

23. Ruchanurucks, Rakprayoon, Kongkaew. Automatic landing assist system using IMU + PnP for robust positioning of fixed-wing UAVs. J Intell Robot Syst 2017; 90(1): 189–199.

24. Patruno, Nitti, Petitti. A vision-based approach for unmanned aerial vehicle landing. J Intell Robot Syst 2018; 1–20.

25. Gibert, Plestan, Burlion. Visual estimation of deviations for the civil aircraft landing. Control Eng Pract 2018; 75: 17–25.

26. Xia, Xiang. Recognition of airport runways in FLIR images based on knowledge. IEEE Geosci Remote S 2014; 11(9): 1534–1538.

27. von Gioi, Jakubowicz, Morel. LSD: a fast line segment detector with a false detection control. IEEE Trans Pattern Anal 2010; 32(4): 722–732.

28. Liu, Zhao, Zhang. Runway extraction in low visibility conditions based on sensor fusion method. IEEE Sens J 2014; 14(6): 1980–1987.

29. Liu, Cheng, Basu. Real-time runway detection for infrared aerial image using synthetic vision and an ROI based level set method. Remote Sens 2018; 10(1544): 1–16.

30. Moore, Schubert, Dolph. Machine vision identification of airport runways with visible and infrared videos. J Aerosp Inf Syst 2016; 13(7): 266–277.

31. Zhang, Cheng, Zhai. Real-time accurate runway detection based on airborne multi-sensors fusion. Defence Sci J 2017; 67(5): 542–550.

32. der Merwe, Wan. The square-root unscented Kalman filter for state and parameter-estimation. In: 2001 IEEE international conference on acoustics, speech, and signal processing (ASSP), Salt Lake City, USA, 7–11 May 2001, pp. 3461–3464. IEEE.

33. Hartley, Zisserman. Multiple view geometry in computer vision. Cambridge, UK: Cambridge University Press, 2003, pp. 153–158.

34. Redmon, Divvala, Girshick. You only look once: unified, real-time object detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016, pp. 779–788. IEEE.

35. Redmon, Farhadi. YOLO9000: better, faster, stronger. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), Honolulu, HI, USA, 22–25 July 2017, pp. 6517–6525. IEEE.

36. Akinlar, Topal. EDLines: A real-time line segment detector with a false detection control. Pattern Recogn Lett 2011; 32(13): 1633–1642.

37. Borges PVK, Vidas. Practical infrared visual odometry. IEEE Trans Intell Transp 2016; 17(8): 2205–2213.

38. Satish Kumar, Kashyap, Shantha Kumar. Detection of runway and obstacles using electro-optical and infrared sensors before landing. Defence Sci J 2014; 64(1): 67–76.

39. Zhang, Zhai, Cheng. Analysis of projection accuracy of synthetic image in combined vision system. In: 2016 IEEE 12th International Conference on Computational Intelligence and Security (CIS), Wuxi, China, 17–18 December 2016, pp. 95–99. IEEE.

40. Khalaf, Chouaib, Wainakh. Novel adaptive UKF for tightly-coupled INS/GPS integration with experimental validation on an UAV. Gyroscopy Navig 2017; 8(4): 259–269.

41. Gay, Maybeck. Integrated GPS/INS/BARO and radar altimeter system for aircraft precision approach landings. Inert Navig Syst 1994; 1(1): 161–168.

42. Furgale, Barfoot, Sibley. Continuous-time batch estimation using temporal basis functions. In: 2012 IEEE international conference on robotics and automation (ICRA), Saint Paul, MN, USA, 14–18 May 2012, pp. 2088–2095. IEEE.

43. Furgale, Rehder, Siegwart. Unified temporal and spatial calibration for multi-sensor systems. In: 2013 IEEE/RSJ international conference on intelligent robots and systems (IROS), Tokyo, Japan, 3–7 November 2013, pp. 1280–1286. IEEE.

44. Rehder, Nikolic, Schneider. Extending kalibr: calibrating the extrinsics of multiple IMUs and of individual axes. In: 2016 IEEE international conference on robotics and automation (ICRA), Stockholm, Sweden, 16–21 May 2016, pp. 4304–4311. IEEE.