Research on solving heading attitude of airdrop cargo platform based on line features

Abstract

This study develops an improved line-feature method to accurately estimate the attitude of an airdrop cargo platform during landing. The geometric characteristics of line features are used to improve traditional line feature extraction, and locally dense line features are removed from the image, which greatly reduces the number of line features. An improved random sample consensus (RANSAC) is then used to remove mismatched line features, which improves the real-time performance of the algorithm and the accuracy of the attitude angle, and compensates for the difficulty of extracting point features, or their low matching accuracy, in the airdrop environment. Finally, a constraint equation is established for the successfully matched line features, and the homography is used to obtain the attitude of the airdrop cargo platform. The experiments show that heading and attitude calculation for the airdrop cargo platform based on line features is feasible and meets the accuracy requirements, and that the method has good application prospects.

Keywords

Airdrop cargo platform, line features, attitude calculation, aircraft landing

Introduction

With the development of large-scale military transport aircraft, the airdrop of heavy equipment and materials has become a basic capability of such aircraft. Airdropped equipment is carried on an airdrop cargo platform. To achieve a smooth landing, it is important to calculate the three-axis attitude angles of the airdrop cargo platform in real time to avoid overturning accidents.^1–4 Attitude estimation measures the position and attitude of the aircraft through one or more sensors such as an inertial measurement unit (IMU), global positioning system (GPS), vision, and laser. Researchers have studied multi-sensor data fusion for aircraft attitude estimation extensively.^5–11

With the development of computer vision, estimating aircraft attitude by vision has attracted the attention of researchers. During the landing of the airdrop cargo platform, haze or clouds may appear at high altitudes, so the clarity of the collected images is reduced and ground point features are masked and difficult to identify. Line features, however, can still be identified from the local tracking direction in the images, and they represent the image information better. Line features are therefore more stable in complex environments, and algorithms based on line features have attracted increasing attention.^12–16 Xu et al.^16–18 independently estimated the landing attitude of an unmanned aerial vehicle (UAV) by using the actual shape of a ground cooperative target and shadow line features and verified that attitude estimation using line features has higher accuracy and reliability than estimation using point features. Li et al.¹⁹ proposed a point-line stereo visual simultaneous localization and mapping (SLAM) system with semantic invariants. The system improved the accuracy of line feature matching by fusing line features with image semantic invariant information. Yu et al.¹⁸ proposed a non-iterative method to solve the perspective-n-line problem. First, the rotation matrix in the nonlinear least-squares objective function was parameterized with the Cayley representation. A third-order equation was derived from the optimality conditions, and the camera attitude was then solved directly with the Gröbner basis technique. Zuo et al.²⁰ used Plücker coordinates to represent line features in the front end of the SLAM system and the minimal orthonormal representation to parameterize line features in the back-end optimization. Pumarola et al.²¹ added line features to ORB-SLAM and proposed a method of system initialization using line features. Wang et al.²² added the line angle to the construction of the line feature re-projection error. He et al.²³ proposed the point and line features visual-inertial odometry (PL-VIO) system by adding line features to the visual inertial navigation system (VINS)-Mono. Gomez-Ojeda et al.²⁴ proposed a line feature matching method that uses geometric constraints and minimizes the L1 norm. Compared with the traditional descriptor-based method, this method improved the tracking efficiency and matching quality of line features. Wang et al.^24,25 fused point and line features in SLAM and established two projection models to estimate the motion state of the camera.

For the attitude acquisition of the airdrop cargo platform, only the obvious features in the scene need to be extracted as the ground reference; there is no need to extract every feature in the scene. Detecting line features with the line segment detector (LSD)²⁶ and describing them with the line band descriptor (LBD)²⁷ is a mainstream approach. LSD directly groups adjacent pixels into line segment regions according to the consistency of pixel gradient directions and can obtain sub-pixel-level detection results in linear time. Although LSD achieves good speed and accuracy, it also has shortcomings. Because it has no screening and merging mechanism when extracting features, it collects a large number of similar line features in locally dense areas of the image and is prone to over-segmentation of line features. Too many short line features greatly increase the time for line feature detection and matching and also increase the probability of mismatching, which inevitably reduces the efficiency of the algorithm.

System overview

To reduce the number of line features, a gradient density filter is used to eliminate locally dense gradient regions and avoid a large number of invalid features, which improves the efficiency of feature extraction and matching and reduces the computational complexity and mismatching rate of the algorithm. The improved LSD restricts the length of line features, which improves the quality of the extracted line features, increases the probability that the same feature appears in adjacent frames, and saves the time spent detecting too many short line features. The improved LSD then roughly screens the remaining line features by length, angle, and horizontal and vertical distance, and places adjacent line features in the same group. In the line segment merging stage, the corrected angle, the line endpoints, and the angle of the merged line are used as merging criteria. LBD is used to match the line features. A line observed in two adjacent frames satisfies geometric relationships (parallelism, similar length, and overlap); these relationships effectively reduce the mismatching rate and shorten the time that random sample consensus (RANSAC) needs to eliminate mismatches. Finally, a constraint equation is established for the successfully matched line features, and the homography matrix is solved to obtain the triaxial attitude angles of the airdrop cargo platform. The system overview is shown in Figure 1.

Figure 1.

System overview.

Eliminate local dense line features

Line features usually appear in areas of dense gradient. If the proportion of high-gradient pixels in a target area is too high, the area is treated as a dense line feature area and is eliminated to reduce feature mismatches.

In the airdrop environment, there may be dense grass or soil texture on the ground, and a large number of line features are detected in such densely textured areas. An excessive density of line features in the same area usually leads to mismatching. To eliminate the line features of a locally dense gradient region, the local gradient filter²⁸ is used to filter out the region. An adaptive pixel gradient density threshold is set as the filtering criterion for each airdrop environment.

The pixel gradient density is defined as the percentage of pixels with high gradients in a unit pixel area and is used to measure whether the area is feature-intensive. Let ${IG}_{i j}$ be the gradient value of pixel $(i, j)$ . When ${IG}_{i j}$ is greater than the intensity threshold IG, the value $o_{i j}$ is marked as 1; otherwise it is 0

IG = \frac{\sum {IG}_{i j}}{(i - 2) \times (j - 2)}

o_{i j} = \begin{cases} 0, & | {IG}_{i j} | \leq IG \\ 1, & | {IG}_{i j} | > IG \end{cases}

The gradient density $ρ_{i j}$ of the local target area is the sum of the values $o_{i j}$ in the $M \times M$ area centered on pixel $(i, j)$ , divided by the area of the window; it represents the percentage of pixels whose gradient is greater than the threshold. Initially $M = 5$ is used to look for small high-gradient dense areas in the image, and M is then gradually expanded until $M_{l} = 30$

ρ_{i j} = \frac{\sum_{i = 1}^{M_{l}} \sum_{j = 1}^{M_{l}} o_{i j}}{M_{l} \times M_{l}}

To reduce pixel areas with excessively high gradient density, a filter is applied to remove the dense feature areas in the image. $ρ_{gd}$ is the average gradient density of the whole image; it serves as an adaptive threshold that is obtained for each scene and measures the line density of the whole image. Areas with gradient density $ρ_{i j}$ greater than $ρ_{gd}$ are regarded as line-dense areas; they are treated as invalid and are not processed. Only the areas where $ρ_{i j}$ does not exceed $ρ_{gd}$ are retained

ρ_{gd} = \frac{\sum_{i = 1}^{i} \sum_{j = 1}^{j} o_{i j}}{i \times j}

{GD}_{i j} = \begin{cases} 1, & ρ_{i j} \leq ρ_{gd} \\ 0, & ρ_{i j} > ρ_{gd} \end{cases}

After the filter is applied, all line features inside a dense area are eliminated and replaced by the contour edges of the area. This avoids a large number of invalid features, reduces the computational cost, and improves the accuracy of the system.
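The filtering steps above can be sketched in a few lines of NumPy. This is only an illustrative reading of the equations, not the implementation used in the experiments: the function name `gradient_density_mask`, the use of central differences for the gradient, the zero padding at the image border, and the single fixed window size are all assumptions made for the example.

```python
import numpy as np

def gradient_density_mask(gray, window=5):
    """Return a boolean mask that is True where line features are kept (GD = 1)."""
    gray = np.asarray(gray, dtype=np.float64)
    # Gradient magnitude from central differences (assumed stand-in for the
    # gradient used by the line detector).
    gy, gx = np.gradient(gray)
    grad = np.hypot(gx, gy)

    # Intensity threshold IG: mean gradient over the interior pixels,
    # one reading of the (i - 2) x (j - 2) normalisation.
    ig = grad[1:-1, 1:-1].mean()
    o = (grad > ig).astype(np.float64)   # o_ij = 1 for high-gradient pixels

    # Local density rho_ij: mean of o over a window x window neighbourhood,
    # computed with an integral image; the border is zero padded.
    pad = window // 2
    padded = np.pad(o, pad, mode="constant")
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    h, w = o.shape
    window_sum = (integral[window:window + h, window:window + w]
                  - integral[:h, window:window + w]
                  - integral[window:window + h, :w]
                  + integral[:h, :w])
    rho = window_sum / (window * window)

    rho_gd = o.mean()                    # adaptive whole-image threshold
    return rho <= rho_gd                 # keep only areas that are not dense
```

Line segments whose pixels fall mostly where the mask is False would then be discarded before grouping. Growing the window from 5 to 30, as described in the text, could be done by calling the function repeatedly and combining the masks.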

Improved line features extraction algorithm

The line features detected by LSD contain a large number of short line features, so the length of line features needs to be constrained. Short line features are filtered out and the line features that occupy more pixels are retained. The remaining line features are grouped, and adjacent effective short line features within a group are merged and connected, as shown in Figure 2. These steps improve the quality of the line features and distribute them evenly in the image, so as to provide stable line features.

Figure 2.

Line features merging.

Line features grouping

Line feature grouping first filters the line features by length and then by angle and distance. The angle between adjacent line features is simple to calculate, so it is used as the first screening condition. The horizontal and vertical distances only involve additions, subtractions, and absolute values, which keeps the screening process efficient. The process is shown in Figure 3. The geometric relationship between the line features in group GL1 and the longest line feature L1 is shown in Figure 4.

Figure 3.

Grouping of line features.

Figure 4.

Adjacent line feature geometric relationship.

Line features length screening

Long line features can be detected across multiple frames and have a longer life span. Moreover, the length of a line feature is almost unchanged between two adjacent frames, so long line features are regarded as more stable. The length threshold is set according to the size of the images collected by the airdrop cargo platform, the depth of the scene, and the number of extracted line features. The image size determines the length of an effective line feature: the higher the image resolution, the more pixels an effective line feature should contain. Short line features are removed according to the number of pixels they contain. Let $L_{min}$ be the lower limit of line feature length in the image and let $h_{i}$ and $w_{i}$ be the height and width of the image; the shorter side of the image is taken as the reference for the effective length.

$η$ is the ratio of the minimum line feature length to the shorter image side and is set to 0.245. The number of line features extracted in each frame is kept between 20 and 300 so that enough features are available for merging in the subsequent steps

L_{i} \geq L_{min} = η \times \min (h_{i}, w_{i}), \quad i \in (1, k)

Line features angle screening

The line features retained after length filtering are sorted in descending order of pixel length to obtain $L = \{L_{1}, L_{2}, \dots, L_{n}\}$ , where $L_{n}$ is the nth line feature. Longer line features come from areas with large continuous gradient differences, which makes them more accurate and stable, so line feature grouping starts from the longest line feature $L_{1}$ .

For convenience, the line features in the image are represented as vectors. The line feature $L_{i}$ points from its starting point $A_{i} (x_{A_{i}}, y_{A_{i}})$ to its endpoint $B_{i} (x_{B_{i}}, y_{B_{i}})$ , and its angle with the horizontal is $θ_{i}$ . The line feature $L_{j}$ points from its starting point $A_{j} (x_{A_{j}}, y_{A_{j}})$ to its endpoint $B_{j} (x_{B_{j}}, y_{B_{j}})$ , and its angle with the horizontal is $θ_{j}$ . After angle screening, the candidate line feature group $L_{α}$ with similar angles is obtained as

L_{α} = \{ \forall L_{i} \in L : | θ_{i} - θ_{j} | < θ_{s} \}

where $θ_{s}$ is the set angle threshold.

Line features plane distance screening

The plane distance is screened using the endpoints of the line features in the group. After horizontal distance screening, the candidate line feature group $L_{hd}$ is obtained as

L_{hd} = \left\{ \forall L_{i}, L_{j} \in L_{α} : \min \begin{bmatrix} | x_{A_{i}} - x_{A_{1}} | \\ | x_{A_{i}} - x_{B_{1}} | \\ | x_{B_{i}} - x_{A_{1}} | \\ | x_{B_{i}} - x_{B_{1}} | \end{bmatrix} < d_{s} \right\}

After vertical distance screening, the candidate line feature group $L_{vd}$ is obtained as

L_{vd} = \left\{ \forall L_{i}, L_{j} \in L_{hd} : \min \begin{bmatrix} | y_{A_{i}} - y_{A_{1}} | \\ | y_{A_{i}} - y_{B_{1}} | \\ | y_{B_{i}} - y_{A_{1}} | \\ | y_{B_{i}} - y_{B_{1}} | \end{bmatrix} < d_{s} \right\}

$d_{s}$ is a screening threshold that measures the closeness of the line features in the horizontal and vertical directions. We set the distance threshold $d_{s}$ to 2 to obtain a line segment group $G_{L_{1}} = \{L_{2}, L_{3}, \dots, L_{n}\}$ that is close to the long line feature $L_{1}$ . The filtered line segments are thus guaranteed to lie near $L_{1}$ , which provides high-quality line features for the subsequent merging.
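The three screening stages (length, angle, plane distance) can be illustrated with the following sketch. It is a minimal reading of the grouping equations, not the authors' code. Segments are assumed to be pairs of `(x, y)` endpoints, the function and parameter names are hypothetical, the default value of `theta_s` is an assumption (the text does not state it at this point), and angles are compared as undirected line angles, which is a choice made for the example.

```python
import math

def length(seg):
    (xa, ya), (xb, yb) = seg
    return math.hypot(xb - xa, yb - ya)

def angle(seg):
    (xa, ya), (xb, yb) = seg
    # Undirected angle with the horizontal, in [0, pi).
    return math.atan2(yb - ya, xb - xa) % math.pi

def group_around_longest(segments, image_size, eta=0.245,
                         theta_s=math.pi / 90, d_s=2.0):
    """Return (longest segment L1, segments grouped with L1)."""
    h, w = image_size
    l_min = eta * min(h, w)                      # length screening threshold
    kept = sorted((s for s in segments if length(s) >= l_min),
                  key=length, reverse=True)
    if not kept:
        return None, []
    longest, rest = kept[0], kept[1:]
    (xa1, ya1), (xb1, yb1) = longest
    group = []
    for seg in rest:
        (xa, ya), (xb, yb) = seg
        # Angle screening against L1 (cheapest test, applied first).
        diff = abs(angle(seg) - angle(longest))
        diff = min(diff, math.pi - diff)
        if diff >= theta_s:
            continue
        # Horizontal and vertical endpoint distance screening.
        hd = min(abs(xa - xa1), abs(xa - xb1), abs(xb - xa1), abs(xb - xb1))
        vd = min(abs(ya - ya1), abs(ya - yb1), abs(yb - ya1), abs(yb - yb1))
        if hd < d_s and vd < d_s:
            group.append(seg)
    return longest, group
```

For a 100 × 100 image the length threshold is 24.5 pixels, so a 5-pixel segment is dropped immediately, while a 39-pixel segment that starts one pixel beyond the end of the longest segment and runs parallel to it is kept in the group.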

Line features merging

After line feature grouping, we obtain a group $G_{L_{1}}$ of line features that match the longest line feature $L_{1}$ in the image. The line features in this group need to be merged to reduce the number of line features in the image and improve their quality. During merging, the line features in the group are screened further. The screening steps are angle correction, endpoint screening, merging, and finally rechecking the angle of the newly merged line feature, as shown in Figure 5. The geometric relationship of merging L1 and Li into Lz is shown in Figure 6.

Figure 5.

Merging of line features.

Figure 6.

Merge line features position relationship.

Angle correction

The angle deviation, distance, and length of the line features in group $G_{L_{1}}$ relative to $L_{1}$ are the important factors for deciding whether they can be merged into a new line feature. These factors are used to modify the angle threshold in $G_{L_{1}}$ : each factor is given a weighting coefficient to balance its influence on the angle correction, which yields the angle correction term

λ_{i} = \frac{1}{u} \frac{| θ_{i} - θ_{1} |}{θ_{1}} + \frac{1}{v} \frac{d_{1 i}}{L_{1}} + \frac{L_{i}}{L_{1}}

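The distance between the two line features is taken as the distance from the endpoint $B_{i}$ to the line through $A_{1}$ and $B_{1}$

d_{1 i} = \frac{| (y_{A_{1}} - y_{B_{1}}) x_{B_{i}} + (x_{B_{1}} - x_{A_{1}}) y_{B_{i}} + (x_{A_{1}} y_{B_{1}} - x_{B_{1}} y_{A_{1}}) |}{\sqrt{{(x_{B_{1}} - x_{A_{1}})}^{2} + {(y_{A_{1}} - y_{B_{1}})}^{2}}}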

where $\frac{1}{u}$ is the weighting coefficient of the angle between the two line features, $0 < u < 1$ , and $u = \frac{θ_{z}}{θ_{1}}$ is the ratio of the angle $θ_{z}$ of the new composite line feature to $θ_{1}$ ; $\frac{1}{v}$ is the weighting coefficient of the distance between the two line features, $0 < v < 1$ , and $v = \frac{d_{s}}{L_{1}}$ is the ratio of the distance $d_{s}$ to the length $L_{1}$ of the longest line feature. From these factors, the angle correction value of the line feature group $G_{L_{1}}$ relative to the long line feature is

δ = 1 - \frac{1}{1 + e^{- 5 λ_{i} + 7}}

$δ$ is the adaptive proportional coefficient function, and equation (12) is fitted to experimental curve data. The formula shows that the smaller the angle between the line feature to be merged and $L_{1}$ , the shorter its length, and the smaller its distance to $L_{1}$ , the larger the adaptive coefficient $δ$ and the more readily the line feature is merged. $θ_{z}$ is the angle screening threshold that measures the similarity of the line feature angles, and its value is set to $π / 90$ through experimental analysis. After angle screening, the candidate line feature group $G_{L_{1 - i}}$ is obtained as

G_{L_{1 - i}} = \{ \forall L_{i} \in G_{L_{1}} : | θ_{1} - θ_{i} | \leq δ \cdot θ_{z} \}
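The angle correction can be made concrete with a short sketch. It is an illustration only: the function names are hypothetical, the weights `u` and `v` are passed in by the caller rather than derived, and the distance uses the standard point-to-line formula (including the constant term of the line equation). Note that the first term of $λ_{i}$ divides by $θ_{1}$ , so the sketch assumes $L_{1}$ is not exactly horizontal.

```python
import math

def point_line_distance(point, seg):
    """Distance from a point to the infinite line through the segment's endpoints."""
    (xa, ya), (xb, yb) = seg
    x, y = point
    num = abs((ya - yb) * x + (xb - xa) * y + (xa * yb - xb * ya))
    return num / math.hypot(xb - xa, ya - yb)

def merge_score(theta_1, theta_i, d_1i, len_1, len_i, u, v):
    """lambda_i: weighted sum of angle deviation, distance and relative length."""
    return (abs(theta_i - theta_1) / theta_1) / u + (d_1i / len_1) / v + len_i / len_1

def adaptive_coefficient(lam):
    """delta = 1 - 1 / (1 + exp(-5 * lambda + 7)); decreases as lambda grows."""
    return 1.0 - 1.0 / (1.0 + math.exp(-5.0 * lam + 7.0))

def passes_angle_correction(theta_1, theta_i, lam, theta_z=math.pi / 90):
    """Keep L_i when |theta_1 - theta_i| <= delta * theta_z."""
    return abs(theta_1 - theta_i) <= adaptive_coefficient(lam) * theta_z
```

With $λ_{i} = 1.4$ the exponent is zero and $δ = 0.5$ , so the effective angle threshold is halved; smaller values of $λ_{i}$ push $δ$ toward 1 and leave the threshold close to $θ_{z}$ .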

Endpoint screening

Restore $L_{1}$ into the screened line feature group $G_{L_{1 - i}}$ to form a new line feature group $G_{L} = \{L_{1}, G_{L_{1 - i}}\}$ , and filter the endpoints of all line features in the group. First, select the line feature $L_{i}$ closest to $L_{1}$ and calculate the average position of the endpoints of the two line features, so as to obtain candidates for the head endpoint $(x_{A_{z}}, y_{A_{z}})$ and the tail endpoint $(x_{B_{z}}, y_{B_{z}})$ . Then calculate and compare the distances between all candidate endpoints and the origin; the candidate with the minimum distance is selected as the head endpoint and the candidate with the maximum distance as the tail endpoint, which gives the endpoints of the synthesized new line feature. According to experience, at least one of the endpoints of the newly synthesized line feature comes from the longest line feature

(x_{A_{z}}, y_{A_{z}}) = \min \left( \sqrt{x_{A_{1}}^{2} + y_{A_{1}}^{2}}, \sqrt{x_{A_{i}}^{2} + y_{A_{i}}^{2}}, \begin{cases} \sqrt{{(x_{A_{1}} + \frac{| x_{A_{1}} - x_{A_{i}} |}{2})}^{2} + {(y_{A_{1}} + \frac{| y_{A_{1}} - y_{A_{i}} |}{2})}^{2}}, & x_{A_{1}} \geq x_{A_{i}}, y_{A_{1}} \geq y_{A_{i}} \\ \sqrt{{(x_{A_{1}} + \frac{| x_{A_{1}} - x_{A_{i}} |}{2})}^{2} + {(y_{A_{1}} - \frac{| y_{A_{1}} - y_{A_{i}} |}{2})}^{2}}, & x_{A_{1}} \geq x_{A_{i}}, y_{A_{1}} < y_{A_{i}} \\ \sqrt{{(x_{A_{1}} - \frac{| x_{A_{1}} - x_{A_{i}} |}{2})}^{2} + {(y_{A_{1}} + \frac{| y_{A_{1}} - y_{A_{i}} |}{2})}^{2}}, & x_{A_{1}} < x_{A_{i}}, y_{A_{1}} \geq y_{A_{i}} \\ \sqrt{{(x_{A_{1}} - \frac{| x_{A_{1}} - x_{A_{i}} |}{2})}^{2} + {(y_{A_{1}} - \frac{| y_{A_{1}} - y_{A_{i}} |}{2})}^{2}}, & x_{A_{1}} < x_{A_{i}}, y_{A_{1}} < y_{A_{i}} \end{cases} \right)

(x_{B_{z}}, y_{B_{z}}) = \max \left( \sqrt{x_{B_{1}}^{2} + y_{B_{1}}^{2}}, \sqrt{x_{B_{i}}^{2} + y_{B_{i}}^{2}}, \begin{cases} \sqrt{{(x_{B_{1}} + \frac{| x_{B_{1}} - x_{B_{i}} |}{2})}^{2} + {(y_{B_{1}} + \frac{| y_{B_{1}} - y_{B_{i}} |}{2})}^{2}}, & x_{B_{1}} \geq x_{B_{i}}, y_{B_{1}} \geq y_{B_{i}} \\ \sqrt{{(x_{B_{1}} + \frac{| x_{B_{1}} - x_{B_{i}} |}{2})}^{2} + {(y_{B_{1}} - \frac{| y_{B_{1}} - y_{B_{i}} |}{2})}^{2}}, & x_{B_{1}} \geq x_{B_{i}}, y_{B_{1}} < y_{B_{i}} \\ \sqrt{{(x_{B_{1}} - \frac{| x_{B_{1}} - x_{B_{i}} |}{2})}^{2} + {(y_{B_{1}} + \frac{| y_{B_{1}} - y_{B_{i}} |}{2})}^{2}}, & x_{B_{1}} < x_{B_{i}}, y_{B_{1}} \geq y_{B_{i}} \\ \sqrt{{(x_{B_{1}} - \frac{| x_{B_{1}} - x_{B_{i}} |}{2})}^{2} + {(y_{B_{1}} - \frac{| y_{B_{1}} - y_{B_{i}} |}{2})}^{2}}, & x_{B_{1}} < x_{B_{i}}, y_{B_{1}} < y_{B_{i}} \end{cases} \right)

Check the included angle of the synthesized line feature

The angle between the newly synthesized line feature $L_{Z}$ and the horizontal direction is $θ_{Z}$ . If the absolute difference between $θ_{Z}$ and the angle $θ_{1}$ is less than or equal to $\frac{θ_{s}}{2}$ , the composite line feature is considered successfully merged and replaces the original line features $L_{i}$ and $L_{1}$ . If the absolute difference of the angles is greater than $\frac{θ_{s}}{2}$ , the synthesized line feature $L_{Z}$ deviates from the longer line feature $L_{1}$ , so $L_{Z}$ is discarded

G_{L_{Z}} = \{ \forall L_{i} \in G_{L_{1 - i}} : | θ_{Z} - θ_{1} | \leq \frac{θ_{s}}{2} \}

Repeat the above steps until all line features in the group are filtered.
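A minimal sketch of the endpoint selection and the final angle check is given below. It rests on several assumptions that are not stated in the text: each segment is stored as `(head, tail)` with the head being the endpoint nearer the image origin, the "average position" is taken to be the plain midpoint of the corresponding endpoints, and the names `merge_pair` and `accept_merge` are hypothetical.

```python
import math

def _norm(point):
    return math.hypot(point[0], point[1])

def _midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def merge_pair(seg_1, seg_i):
    """Pick the head nearest the origin and the tail farthest from it."""
    (a1, b1), (ai, bi) = seg_1, seg_i
    head = min((a1, ai, _midpoint(a1, ai)), key=_norm)
    tail = max((b1, bi, _midpoint(b1, bi)), key=_norm)
    return head, tail

def angle(seg):
    (xa, ya), (xb, yb) = seg
    return math.atan2(yb - ya, xb - xa)

def accept_merge(seg_1, merged, theta_s):
    """Accept the merged line only if it stays within theta_s / 2 of L1."""
    return abs(angle(merged) - angle(seg_1)) <= theta_s / 2
```

For example, merging `((10, 10), (60, 10))` with `((61, 11), (100, 11))` gives `((10, 10), (100, 11))`, whose angle of about 0.011 rad passes the check for an assumed threshold of $θ_{s} = π / 90$ .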

To verify the performance of the algorithm, LSD and the improved algorithm are used to extract line features in two different scenes, and the numbers of extracted line features are compared, as shown in Figure 7. For scene 1, the number of line features extracted from images taken at different heights is also counted, which demonstrates the effectiveness of the improved algorithm, as shown in Table 1.

Figure 7.

Line features extraction results in different scenes: (a) scene 1, (b) scene 2, (c) LSD extracts line features in scene 1, (d) LSD extracts line features in scene 2, (e) line features extracted by our algorithm in scene 1, and (f) line features extracted by our algorithm in scene 2. LSD: line segment detector.

Table 1.

Number of line features extracted at different heights.

Height from ground (m)	Number of line features (LSD)	Number of line features (improved algorithm)	Reduction relative to LSD (%)
30	1101	889	19.26
60	1248	923	26.04
90	1315	1025	22.05
125	1196	891	24.96
160	1270	924	27.24

LSD: line segment detector.

It can be seen from Figure 7 that scene 1 contains more long line features (roads and riversides) and scene 2 contains dense weeds. In both scenes, the line features are filtered by their geometric characteristics, which merges shorter lines into long line features, and the local density filter removes the short line features around them (such as the boxed areas of Figure 7(e) and (f)). This effectively reduces the number of line features extracted from the image and makes the line features more stable. Some line features remain discontinuous in the image because many dense short line features in those areas were removed, which leaves similar long line features too far apart to be merged. However, the merged long line features are stable enough in the image to be matched successfully, which effectively improves the accuracy of the heading attitude of the airdrop cargo platform.

According to Table 1, the number of line features extracted by the improved algorithm is lower at every height, and the maximum decrease relative to the number extracted by LSD reaches 27.24%. When the airdrop cargo platform collects ground features, the same feature is displayed differently in the image at different landing heights. The higher the airdrop cargo platform is above the ground, the wider the field of view of the airdrop camera and the more ground features are visible. However, some line features then appear smaller in the image, so they are rejected as short line features. As the height decreases, the line features become clearer and more stable in the image and are not eliminated, so the number of line features eliminated at 30 m above the ground is the smallest.

Improved RANSAC to remove mismatching

The accuracy of airdrop cargo platform attitude estimation based on line features depends on the accuracy of features matching. If there are many wrong matches, there are errors in the attitude calculation of the airdrop cargo platform. Therefore, before the attitude calculation of the airdrop cargo platform, the RANSAC^29,30 is used to remove the wrong matching in the image.

RANSAC eliminates outliers through step-by-step iteration. If there are too many wrong matching line features in the observed data, the iteration accuracy decreases. The geometric constraints between the line features are used to prescreen out the line features that could not be matched successfully. The movement of the airdrop camera between two adjacent frames is small, and the constraint is completed by the geometric characteristics between the line features. For example, the position, length, and angle of the line features of two adjacent frames should be approximately the same.

We use the parallel relationship between the two line features, the length approximation and the overlap to eliminate the mismatch of line features, so as to remove the unmatched line features in advance and reduce the number of iterations of the RANSAC.

Matching line features have a parallel relationship

| θ_{u} - θ_{v} | \leq α

Matching line features have length similarity

ρ_{u v} = \frac{min (| L_{u} |, | L_{v} |)}{max (| L_{u} |, | L_{v} |)}

Matching line features have length overlap

η_{u v} = \frac{| \{ (x_{u 1}, y_{u 1}), (x_{u 2}, y_{u 2}), \dots, (x_{u n}, y_{u n}) \} \cap \{ (x_{v 1}, y_{v 1}), (x_{v 2}, y_{v 2}), \dots, (x_{v m}, y_{v m}) \} |}{\max \{ n, m \}}, \quad n, m = 1, 2, \dots, + \infty
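The three geometric checks can be combined into a single prescreening test that runs before RANSAC. The sketch below is illustrative rather than the implementation evaluated in this article: all three thresholds (`alpha`, `rho_min`, `eta_min`) are hypothetical values chosen for the example, and the pixel-set overlap $η_{u v}$ is approximated by the overlap of the two segments after projecting one onto the direction of the other, which is a simplification.

```python
import math

def seg_length(seg):
    (xa, ya), (xb, yb) = seg
    return math.hypot(xb - xa, yb - ya)

def seg_angle(seg):
    (xa, ya), (xb, yb) = seg
    return math.atan2(yb - ya, xb - xa) % math.pi   # undirected angle

def overlap_ratio(seg_u, seg_v):
    """Projected overlap of v on u, divided by the longer segment length."""
    (xa, ya), (xb, yb) = seg_u
    len_u = seg_length(seg_u)
    ux, uy = (xb - xa) / len_u, (yb - ya) / len_u
    # Scalar positions of v's endpoints along u, measured from u's start.
    t = sorted((x - xa) * ux + (y - ya) * uy for x, y in seg_v)
    overlap = max(0.0, min(len_u, t[1]) - max(0.0, t[0]))
    return overlap / max(len_u, seg_length(seg_v))

def is_plausible_match(seg_u, seg_v, alpha=math.radians(5),
                       rho_min=0.8, eta_min=0.5):
    """Parallelism, length similarity and overlap checks for a candidate match."""
    diff = abs(seg_angle(seg_u) - seg_angle(seg_v))
    diff = min(diff, math.pi - diff)
    len_u, len_v = seg_length(seg_u), seg_length(seg_v)
    rho = min(len_u, len_v) / max(len_u, len_v)
    return (diff <= alpha and rho >= rho_min
            and overlap_ratio(seg_u, seg_v) >= eta_min)
```

Candidate pairs returned by descriptor matching that fail this test would be dropped, so RANSAC starts from a set with fewer outliers and needs fewer iterations.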

To test whether the improved algorithm can effectively delete the mismatching, the line features of the image of shooting scene 1 are extracted and matched, as shown in Figure 8.

Figure 8.

Matching results of line features: (a) LBD matching results, (b) matching results after mismatches are removed by RANSAC, and (c) matching results of the improved algorithm in this article. LBD: line band descriptor; RANSAC: random sample consensus.

It can be seen from Figure 8 that the improved algorithm in this article reduces the number of matched line features and effectively reduces mismatches during matching. When LBD is used to directly match the screened effective line features, the matched line features are distributed in all directions, as shown in Figure 8(a). Figure 8(b) shows the result of using RANSAC to remove mismatches. Most mismatches are deleted, but a few mismatches with inconsistent line feature directions remain. The improved algorithm constrains the geometric characteristics of the line features and removes further mismatches. It can be seen from Figure 8(c) that the matched line feature pairs appear in the same direction, which greatly reduces the number of mismatches.

As shown in Table 2, because there are many line features in the scene, the matching accuracy without mismatch screening is 41.18%, and the number of mismatches is large. The matching accuracy after RANSAC removes mismatches is 60.43%, and the matching accuracy after geometric constraint screening reaches 75%. The improved RANSAC algorithm obtains more accurate matching results, which greatly improves the accuracy of the attitude calculation of the airdrop cargo platform.

Table 2.

Statistics of line feature matching results.

	Number of matched pairs	Number of correctly matched pairs	Matching accuracy (%)
Scene 1
LBD algorithm	204	84	41.18
RANSAC algorithm	139	84	60.43
Improved algorithm in this article	112	84	75.00

LBD: line band descriptor; RANSAC: random sample consensus.
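As a check on Table 2, the matching accuracy is the number of correctly matched pairs divided by the number of matched pairs. The worked arithmetic below uses only the values in the table:

```latex
\begin{aligned}
\text{LBD:} \quad & \frac{84}{204} \times 100\% \approx 41.18\% \\
\text{RANSAC:} \quad & \frac{84}{139} \times 100\% \approx 60.43\% \\
\text{Improved algorithm:} \quad & \frac{84}{112} \times 100\% = 75.00\%
\end{aligned}
```

The number of correct matches stays at 84 in all three rows, so the gain in accuracy comes entirely from discarding wrong pairs (204 to 139 to 112), not from finding additional correct ones.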

Attitude calculation

Line features constraint equation

In the image sequence, each frame contains a large number of line features. A constraint equation of line features is established to solve the homography matrix H, and H is then decomposed to obtain the final attitude of the airdrop cargo platform.

In the ith image, the equation of the $a_{i}$ th line feature is expressed as $b x + c y + 1 = 0$ . The line equation is written in matrix form as a constraint equation

[\begin{matrix} b & c & 1 \end{matrix}] [\begin{matrix} x \\ y \\ 1 \end{matrix}] = 0

It can be further expressed as

p^{T} x_{i} = 0, p^{T} = [\begin{matrix} b & c & 1 \end{matrix}]

At the same time, the $d_{i}$ th line feature in the (i + 1)th image is successfully matched with the $a_{i}$ th line feature in the ith image, and its line equation is expressed as $e x + f y + 1 = 0$ . The line equation is written in matrix form as a constraint equation

[\begin{matrix} e & f & 1 \end{matrix}] [\begin{matrix} x \\ y \\ 1 \end{matrix}] = 0

It can be further expressed as

q^{T} x_{i + 1} = 0, q^{T} = [\begin{matrix} e & f & 1 \end{matrix}]

According to homography, the two line features should meet the constraints of $λ x_{i + 1} = H_{L} x_{i}$ . The two ends of the equation are left multiplied by the normal vector $q^{T}$

q^{T} λ x_{i + 1} = q^{T} H_{L} x_{i}

q^{T} x_{i + 1} = λ^{- 1} q^{T} H_{L} x_{i}

According to $q^{T} x_{i + 1} = 0$ , it can be known that formula (25) can be equal to

λ^{- 1} q^{T} H_{L} x_{i} = 0

According to formulas (21) and (26), it can be deduced that

λ^{- 1} q^{T} H_{L} = p^{T}

Transposing both sides of formula (27) and multiplying by $λ$ gives

H_{L}^{T} q = λ p

To eliminate the scale factor $λ$ , take the cross product of p with both sides of formula (28); since $p \times p = 0$ , this gives

p \times H_{L}^{T} q = 0
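The derivation above says that when points transfer as $λ x_{i + 1} = H_{L} x_{i}$ , the matched lines satisfy $H_{L}^{T} q = λ p$ , so the cross product of p with $H_{L}^{T} q$ vanishes. The following sketch checks this numerically with NumPy. The function names and the matrix values in the usage note are made up for illustration; normalising both vectors before the cross product is a choice made here to keep the residual scale-free.

```python
import numpy as np

def transfer_line(h_l, q):
    """Map a line q of frame i+1 back to frame i, up to scale: p ~ H_L^T q."""
    return h_l.T @ q

def line_constraint_residual(p, h_l, q):
    """Cross product of the unit vectors of p and H_L^T q; zero for a true match."""
    p_hat = p / np.linalg.norm(p)
    v = transfer_line(h_l, q)
    return np.cross(p_hat, v / np.linalg.norm(v))
```

A quick way to try it: pick two points `x1`, `x2` in frame i, form the line `p = np.cross(x1, x2)`, map the points with an invertible `h_l`, form `q = np.cross(h_l @ x1, h_l @ x2)`, and confirm that `line_constraint_residual(p, h_l, q)` is numerically zero.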

The homography H directly describes the transformation between the image coordinates $p_{1}$ and $p_{2}$ of the same point P on a plane. Its definition involves the rotation matrix, translation vector, camera internal parameters, and plane parameters. For a point P on the plane $n^{T} P + d = 0$ , so that $- \frac{n^{T} P}{d} = 1$ ,

\begin{array}{l} p_{2} \simeq K (R P + t) \\ \simeq K (R P + t \cdot (- \frac{n^{T} P}{d})) \\ \simeq K (R - \frac{t n^{T}}{d}) P \\ \simeq K (R - \frac{t n^{T}}{d}) K^{- 1} p_{1} \\ \simeq K H K^{- 1} p_{1} \end{array}

where the middle part is denoted as H

H = R - \frac{t n^{T}}{d}
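The plane-induced homography can be verified numerically. The sketch below builds $H = R - \frac{t n^{T}}{d}$ and compares the pixel obtained by projecting a 3D point directly with the pixel obtained through $K H K^{- 1}$ . The helper names are hypothetical, and the camera matrix, motion, and plane in the usage note are invented values for illustration, not the experimental setup.

```python
import numpy as np

def plane_homography(r, t, n, d):
    """H = R - t n^T / d for points on the plane n^T P + d = 0."""
    return r - np.outer(t, n) / d

def to_pixel(v):
    """Normalise a homogeneous image point so that its last entry is 1."""
    return v / v[2]

def transfer_pixel(k, h, p1):
    """p2 ~ K H K^-1 p1."""
    return to_pixel(k @ h @ np.linalg.inv(k) @ p1)
```

For instance, with a ground plane at depth 5 (`n = [0, 0, 1]`, `d = -5`), a small yaw rotation and a small translation, `transfer_pixel` reproduces `to_pixel(k @ (r @ P + t))` for any point P on that plane, which is exactly the chain of equalities in the derivation.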

Here $H_{L} = K H K^{- 1}$ describes the transformation of the same line feature between different frames. Substituting it into formula (28) brings the camera internal parameters into the constraint:

{(K H K^{- 1})}^{T} q = λ p

H^{T} K^{T} q = λ K^{T} p

Let $K^{T} q = q^{'}$ and $K^{T} p = p^{'}$ ; the constraint simplifies to

H^{T} q^{'} = λ p^{'}

Taking the cross product of $p^{'}$ with both sides eliminates $λ$

p^{'} \times H^{T} q^{'} = 0

Expand the matrix $p^{'}$ , write it in vector form $p^{'} = {[p_{1}, p_{2}, 1]}^{T}$ , and expand the matrix $q^{'}$ into vector form $q^{'} = {[q_{1}, q_{2}, 1]}^{T}$ to get the constraint equation of line features

[\begin{matrix} - p_{2} & p_{1} & q_{2} p_{1} & q_{2} p_{2} & q_{2} \\ p_{1} & p_{2} & - q_{1} p_{1} & - q_{1} p_{2} & - q_{1} \\ p_{2} & p_{1} & - q_{1}^{2} & - p_{2}^{2} & q_{2} \\ - p_{1} & p_{2} & - p_{1}^{2} & q_{2}^{2} & q_{1} \\ p_{1} & p_{2} & q_{2}^{2} & p_{2}^{2} & - q_{2} \end{matrix}] [\begin{matrix} h_{1} \\ h_{2} \\ h_{3} \\ h_{4} \\ h_{5} \end{matrix}] = 0

Decompose homography

The homography H is regarded as a vector, which is recovered by solving the linear equation of the vector. The homography H can be decomposed to get the corresponding rotation matrix R and translation vector t. According to the form of homography H, it can be expressed as

H = [\begin{matrix} h_{1} & - h_{2} & h_{3} \\ h_{2} & h_{1} & h_{4} \\ 0 & 0 & h_{5} \end{matrix}]

There are five unknowns. If more than five line features are successfully matched, each element of the homography H can be solved from the above equation, and H can be decomposed into the corresponding rotation matrix R and translation vector t

R_{L} = \begin{bmatrix} h_{1} & - h_{2} & 0 \\ h_{2} & h_{1} & 0 \\ 0 & 0 & 1 \end{bmatrix}

t_{L} = {[\begin{matrix} h_{3} & h_{4} & h_{5} & - 1 \end{matrix}]}^{T}
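The solution step can be sketched as a homogeneous least-squares problem. The code below is a minimal illustration, assuming NumPy: the constraint rows are derived directly from $p^{'} \times H^{T} q^{'} = 0$ for the five-parameter form of H rather than transcribed from the printed coefficient matrix, the sign of the solution is fixed by assuming $h_{5} > 0$, and reading the heading as atan2(h₂, h₁) is an assumption about how the in-plane rotation block would be interpreted. All function names and test values are hypothetical.

```python
import numpy as np

def skew(v):
    """Matrix form of the cross product: skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def h_to_matrix(h):
    """Five-parameter homography in the form used in the text."""
    h1, h2, h3, h4, h5 = h
    return np.array([[h1, -h2, h3], [h2, h1, h4], [0.0, 0.0, h5]])

def constraint_rows(p, q):
    """Rows of p x (H^T q) = 0, linear in h = (h1, ..., h5)."""
    q1, q2, q3 = q
    # H^T q written as M @ h for the structured H above.
    M = np.array([[q1, q2, 0.0, 0.0, 0.0],
                  [q2, -q1, 0.0, 0.0, 0.0],
                  [0.0, 0.0, q1, q2, q3]])
    return skew(p) @ M

def solve_h(ps, qs):
    """Least-squares null vector of the stacked constraints via SVD."""
    A = np.vstack([constraint_rows(p, q) for p, q in zip(ps, qs)])
    h = np.linalg.svd(A)[2][-1]
    return h if h[4] > 0 else -h  # assumption: fix the sign with h5 > 0

# Synthetic matched lines generated from a known H (illustrative only).
rng = np.random.default_rng(0)
angle_true = np.deg2rad(12.0)
h_true = np.array([np.cos(angle_true), np.sin(angle_true), 0.3, -0.2, 1.1])
H_true = h_to_matrix(h_true)
qs = [np.array([x, y, 1.0]) for x, y in rng.uniform(-1.0, 1.0, size=(6, 2))]
ps = [H_true.T @ q for q in qs]
ps = [p / p[2] for p in ps]  # normalize to the form [p1, p2, 1]

h_est = solve_h(ps, qs)
heading_deg = np.degrees(np.arctan2(h_est[1], h_est[0]))
print(heading_deg)
```

With noisy matches the smallest singular vector gives the least-squares estimate, which is why more matched lines than the minimum are useful.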

Experiments

The experiments in this part verify the effectiveness and real-time performance of the method for solving the heading attitude of the airdrop cargo platform based on line features. The improved algorithm in this article is compared with traditional LSD line feature extraction with LBD matching and with a point feature method. The deviation between the estimated value output by the vision algorithm and the real value output by the IMU is then calculated. Finally, the processing time of the vision algorithm is discussed in the section on time performance.

Materials and equipment

Since dropping the airdrop cargo platform requires military assistance and is confidential, the test data cannot be disclosed. Therefore, the DJI Mavic Air 2 UAV is used to simulate the landing process of the airdrop cargo platform after the parachute opens. The DJI MAVIC 2 UAV is equipped with a camera (model FC6310; DJ-Innovations), a GPS/GLONASS dual-mode satellite positioning module, an IMU, and a barometric altimeter. The angular rate sensor integrated in the IMU module is the ADXRS620 (Analog Devices); the most important features of this sensor series are vibration suppression and high impact resistance. Its gyroscope has a full range of 300°/s, which provides accurate three-axis attitude values of the UAV for the experiments. The barometric altimeter and IMU output data are provided by DJI + Assistant software. The position of the IMU in the UAV relative to the camera reference system is represented by the transformation matrix $T_{IMU}^{CAM}$ , as shown in Table 3. Since the integration of the vision algorithm with flight control has not yet been fully implemented and is under further study, the following experimental procedure only captures aerial images with the UAV onboard monocular camera, and the simulation experiments are then run on a computer. All the experiments are conducted on a computer equipped with an AMD Ryzen 7 4800H 4.20 GHz CPU (AMD, California, USA), an NVIDIA GeForce RTX2060 GPU (Nvidia, California, USA), and 16 GB of memory. The CPU programs are executed in a single thread. The image processing is based on OpenCV 3.3.0 and is programmed in Microsoft Visual Studio 2015.

Table 3.

The parameters used in the experiments.

Parameter	Value
Camera intrinsic parameters	Focal length	f_x = 3655.62 pixel, f_y = 3648.68 pixel
	Principal point	C_x = 354.571 pixel, C_y = 408.870 pixel
	Radial distortion	K ₁ = −0.2407, K ₂ = 0.1225
	Camera resolution	20 m: 1280 × 720
	Camera resolution	160 m: 3840 × 2160
	Pixel size	2.41 µm
Camera and IMU transformation matrix	$T_{IMU}^{CAM}$	$[\begin{matrix} 0.0124 & - 0.8864 & 0.0365 & - 0.012 \\ - 0.8873 & - 0.0186 & 0.0135 & - 0.018 \\ - 0.0159 & - 0.0292 & - 0.8878 & - 0.058 \\ 0 & 0 & 0 & 1 \end{matrix}]$

IMU: inertial measurement unit.

In the experiment, the image sequence of scene 1 is evaluated. Scene 1 contains riversides, roads, and house edges, which provide a large number of stable line features. During the landing of the UAV, statistics are collected at different landing heights and the experimental results are recorded. The roll ϕ, pitch θ, and yaw ψ of the UAV calculated by the vision algorithm are used as the estimated values, and the attitude measured by the onboard IMU is used as the true value. The root-mean-square error (RMSE) between the solved attitude and the three-axis attitude of the IMU is then calculated to measure the deviation between the estimated value and the true value. The RMSE is taken as the accuracy criterion, and the average RMSE over five experiments is used as the final angle error of the image sequence

RMSE ({\hat{θ}}_{i}, θ_{i}) = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(trans ({\hat{θ}}_{i}) - trans (θ_{i}))}^{2}}

Equation (39) gives the calculation formula of the RMSE, where n is the number of image frames to be processed, i indexes the frames, $θ_{i}$ represents the real attitude of the airdrop cargo platform, and ${\hat{θ}}_{i}$ represents the three-axis attitude calculated by the vision algorithm. In the formula, trans is the root-mean-square of Euclidean distance.
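The metric can be sketched in a few lines. This is a minimal illustration using only the Python standard library; it treats trans as the identity on a single angle in degrees, which is an assumption, and the function names and sample values are hypothetical rather than experimental data.

```python
import math

def rmse(estimated, reference):
    """Root-mean-square error between two equal-length angle sequences."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))

def mean_rmse(runs):
    """Average the RMSE over repeated runs, each an (estimated, reference) pair."""
    return sum(rmse(est, ref) for est, ref in runs) / len(runs)

# Made-up yaw angles in degrees for three frames (not measured data).
vision_yaw = [1.0, 2.0, 3.0]
imu_yaw = [1.0, 2.0, 5.0]
print(rmse(vision_yaw, imu_yaw))
```

Angles that wrap around ±180° would need to be unwrapped before the difference is taken; the sketch assumes they do not.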

Validity experiment of image geometric transformation

During the landing process of the airdrop cargo platform, the acquired images undergo affine, perspective, and scale transformations. This experiment is set up to show that the attitude calculation remains effective under these geometric transformations. The experimental image is taken at a height of 120 m above the ground, and the affine and perspective transformation functions in OpenCV are used to transform it. The scale-transformed image is taken at a height of 160 m above the ground. The image processing results are shown in Figure 9.

Figure 9.

Line features extraction results in geometric transformation: (a) scene image, (b) perspective transformation, (c) affine transformation, and (d) scale transformation.

It can be seen from Figure 9 that the affine and perspective transformations of the image change the distribution of line features (angle, length, and position in the image). This makes some line features unstable and slightly reduces the number of line features detected in the image. However, the longer line features are not affected by the geometric transformations. Under these transformations, the improved algorithm still detects abundant line features, which is enough to support the attitude calculation of the airdrop cargo platform. This verifies the robustness of the improved line features in the scene and makes the detected line features more reliable.
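The effect of such transformations on a single line segment can be illustrated without any image data. The sketch below assumes NumPy and hand-picked 3 × 3 matrices (they are not the transformations used in the experiment); it maps the endpoints of a segment and reports how its length and angle change. The function names are hypothetical.

```python
import numpy as np

def transform_segment(M, start, end):
    """Map both endpoints of a segment through a 3x3 projective matrix."""
    pts = np.array([[start[0], end[0]], [start[1], end[1]], [1.0, 1.0]])
    out = M @ pts
    out = out[:2] / out[2]  # back to inhomogeneous pixel coordinates
    return out[:, 0], out[:, 1]

def length_and_angle(start, end):
    """Segment length in pixels and orientation in degrees."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    return float(np.hypot(dx, dy)), float(np.degrees(np.arctan2(dy, dx)))

# Hand-picked matrices (assumptions for illustration only).
scale = np.diag([2.0, 2.0, 1.0])
affine = np.array([[1.0, 0.3, 10.0], [0.1, 0.9, -5.0], [0.0, 0.0, 1.0]])
perspective = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1e-3, 5e-4, 1.0]])

segment = ((100.0, 100.0), (300.0, 200.0))
original = length_and_angle(*segment)
for name, M in (("scale", scale), ("affine", affine), ("perspective", perspective)):
    changed = length_and_angle(*transform_segment(M, *segment))
    print(name, original, changed)
```

A uniform scale changes only the length, whereas the affine and perspective matrices change both length and angle, which is the kind of change in line feature distribution described above.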

Attitude estimation experiment based on line features

This experiment verifies the effectiveness and accuracy of the vision algorithm for UAV attitude estimation based on line features. The UAV lands from a height of 160 m above the ground, so the flight distance is long, and the vertical landing speed is about 1.6 m/s. The camera's sampling frequency is low, and the image resolution is 960 × 540. The visually calculated attitude angles are compared over time with the real attitude output by the airborne IMU, and the RMSE values are calculated, as shown in Figure 10.

Figure 10.

(a)–(c) Roll ϕ, pitch θ, and yaw ψ of the UAV calculated based on line features.

Figure 10 illustrates the relationship between the UAV attitude error and the landing time. The red line is the fitted curve of the real-time UAV attitude angle estimated with the improved line features, the blue line is the fitted curve of the real value output by the IMU, and the yellow line is the fitted curve of the UAV attitude angle estimated by LSD and LBD. With the improved line features, the RMSE values of roll, pitch, and yaw of the UAV are 1.715°, 1.698°, and 0.886°, respectively. With LSD and LBD, the RMSE values of roll, pitch, and yaw are 3.524°, 3.165°, and 1.848°, respectively. Compared with the traditional algorithm, the improved algorithm is more accurate and closer to the real value measured by the IMU.

Line features and point features comparison experiment

In scene 1, the UAV is used to collect the image sequence of landing. The angle accuracy is calculated and compared for three methods: the improved algorithm in this article, the traditional line feature extraction (LSD + LBD), and oriented FAST and rotated BRIEF (ORB) feature point extraction³¹ with brute-force matching.³¹ The results of point feature extraction and matching are shown in Figure 11. The RMSE values of roll ϕ, pitch θ, and yaw ψ at different landing heights of the airdrop cargo platform are shown in Figure 12.

Figure 11.

ORB point features extraction and matching.

Figure 12.

RMSE of attitude angle calculated by three algorithms at different heights.

As shown in Figure 12, compared with the algorithm that extracts only point features, the improved line feature extraction algorithm in this article reduces the RMSE by 48.71%. Compared with LSD and LBD, it reduces the RMSE by 22.85%. Calculating the attitude from the geometric characteristics of the line features is therefore clearly more accurate than the other two methods, which shows that the algorithm in this article can be used for attitude calculation during the landing of the airdrop cargo platform.

When the airdrop cargo platform reaches the initial landing height, the images are captured at high altitude, where slight changes of the attitude angle and terrain fluctuation cause a large deviation of the imaging bottom point. This leads to poor point feature extraction and tracking failure, and the error of the solved attitude angle becomes too large to ensure an accurate landing. The airdrop scene also contains a lot of interfering information such as short line features, and at high altitude there are too many line features with inconspicuous gradients, which increases both the mismatching of line features and the amount of calculation. The improved algorithm in this article eliminates short line features in locally dense areas and uses the geometric characteristics of line features to group and merge the remaining ones. This makes the long line features more stable in the image, avoids the interference of short line features in the matching process, and effectively improves the matching accuracy of line features.

Time performance experience

To verify the real-time performance of the improved algorithm, the time performance of the vision algorithm is tested, since the attitude must be available in time to avoid the imbalance of the UAV. The traditional line feature extraction (LSD + LBD) and the improved algorithm are each run on the UAV landing image sequence and timed. The real-time performance of the algorithm is measured by the single-frame pose solving time, as shown in Table 4.

Table 4.

Average attitude solution time

Method	Image resolution	Average single-frame extraction time (ms)	Average single-frame matching time (ms)	Single-frame pose solving time (ms)	Number of frames processed
LSD + LBD	960 × 540	610.17	40.12	663.23	150
Ours	960 × 540	301.56	38.49	348.69	287
LSD + LBD	3840 × 2160	8678.51	50.34	8750.15	11
Ours	3840 × 2160	4009.89	46.10	4076.45	24

LSD: line segment detector; LBD: line band descriptor.

It can be seen from Table 4 that at an image resolution of 960 × 540, the average single-frame line feature extraction takes 301.56 ms, the line feature matching takes 38.49 ms, and the total single-frame attitude angle solution takes 348.69 ms, which includes line feature extraction, matching, and an attitude angle resolution time of 13.7 ms. Compared with the traditional line feature extraction (LSD + LBD), the single-frame solution time is reduced by 314.54 ms, and 287 frames of images can be processed during the landing of the airdrop cargo platform. At an image resolution of 3840 × 2160, the average single-frame line feature extraction takes 4009.89 ms, the line feature matching takes 46.10 ms, and the total single-frame attitude angle solution takes 4076.45 ms, which includes line feature extraction, matching, and an attitude angle resolution time of 20.46 ms. Compared with the traditional line feature extraction (LSD + LBD), the single-frame solution time is reduced by 4673.7 ms, and 24 frames of images can be processed during the landing of the airdrop cargo platform. The improved algorithm in this article therefore solves the attitude from the geometric features of the line features faster and saves a large amount of calculation time. It has reference value for the study of the attitude solution of the landing airdrop cargo platform.
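The time savings quoted above follow directly from the pose solving column of Table 4, as the short sketch below reproduces. It uses only the Python standard library. The descent duration is an assumption derived from the 160 m height and the 1.6 m/s landing speed stated for the attitude estimation experiment; under that assumption the estimated frame counts land within one frame of the counts in the table. The names `POSE_MS`, `saving_ms`, and `frames_in_descent` are hypothetical.

```python
# Single-frame pose solving times (ms) copied from Table 4.
POSE_MS = {
    ("LSD + LBD", "960x540"): 663.23,
    ("Ours", "960x540"): 348.69,
    ("LSD + LBD", "3840x2160"): 8750.15,
    ("Ours", "3840x2160"): 4076.45,
}

def saving_ms(resolution):
    """Per-frame time saved by the improved algorithm at one resolution."""
    return POSE_MS[("LSD + LBD", resolution)] - POSE_MS[("Ours", resolution)]

def frames_in_descent(pose_ms, descent_s):
    """Frames that fit into the descent if each frame takes pose_ms."""
    return descent_s * 1000.0 / pose_ms

# Assumption: a 160 m descent at about 1.6 m/s lasts roughly 100 s.
DESCENT_S = 160.0 / 1.6

for resolution in ("960x540", "3840x2160"):
    print(resolution, "saving (ms):", round(saving_ms(resolution), 2))
for (method, resolution), pose_ms in POSE_MS.items():
    print(method, resolution, "frames:", round(frames_in_descent(pose_ms, DESCENT_S), 1))
```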

Regarding the time spent on line feature matching, this article uses the geometric characteristics of line features between adjacent frames to further eliminate mismatches. Although this does not save much time, it reduces mismatches and improves the accuracy of the attitude solution. Regarding the time spent on line feature extraction, high-resolution images contain more line features to process, and extraction accounts for most of the single-frame attitude solution time. A large number of line features increases the extraction time per frame, which prevents the airdrop system from meeting the real-time requirements. At a resolution of 960 × 540, some short line features are not resolved in the image, which naturally reduces the number of line features. Screening the line features then eliminates a large number of short line features and merges others, which further reduces the number of line features to be processed in the image, shortens the attitude solution time, and improves the real-time performance of the system.

Conclusions and future work

This article proposes a vision algorithm for airdrop cargo platform attitude estimation and autonomous landing based on line features. The line features in the airdrop landing scene are extracted as visual goals, but too many invalid line features in the scene increase the computational complexity of the algorithm. Therefore, this article uses the geometric characteristics of line features to improve the traditional line feature extraction and eliminates the locally dense line features in the image, which greatly reduces the number of line features in the image. Then, the improved RANSAC algorithm is used to remove the mismatching of line features, which improves the accuracy of attitude angle, and makes up for the problems of difficult point features extraction or low matching accuracy in airdrop environment. Finally, a constraint equation is established for the successfully matched line features, and the homography matrix is solved and decomposed to obtain the real-time triaxial attitude angle of the airdrop cargo platform.

To illustrate the performance of the airdrop cargo platform attitude estimation system based on line features, outdoor experiments are conducted and analyzed. The experimental results show that the improved line feature algorithm reduces the number of extracted line features by up to 27.24% compared with LSD, that the matching accuracy after geometric constraint screening can reach 75%, and that line feature extraction and matching are both significantly improved. Based on line features, the RMSE values of roll, pitch, and yaw of the UAV are 1.715°, 1.698°, and 0.886°, respectively, and the average single-frame attitude angle solution time is 348.69 ms, which is 314.54 ms less than that of the traditional line feature extraction (LSD + LBD). Although this article is a preliminary work, it demonstrates the feasibility of vision-based attitude angle calculation during the landing of airdrop cargo platforms.

Future work will address the poor real-time image processing performance on the airdrop cargo platform by optimizing the algorithm and splitting feature detection and pose estimation into two independent threads. A segmentation/line/transformation prediction model based on a convolutional neural network will be studied later. The vision algorithm will also be combined with the flight control system to further verify its reliability and accuracy during airdrop landing.
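One possible shape of the two-thread split is a producer and a consumer connected by a bounded queue. The following is a minimal sketch using only the Python standard library; `detect` and `estimate` are hypothetical placeholders for the feature detection and pose estimation stages, and nothing here reflects an implementation by the authors.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the frame stream

def run_pipeline(frames, detect, estimate, max_pending=2):
    """Run detection and pose estimation in two threads joined by a queue."""
    pending = queue.Queue(maxsize=max_pending)  # bounded to limit latency
    results = []

    def detector():
        # Stage 1: extract features from each frame and hand them over.
        for frame in frames:
            pending.put(detect(frame))
        pending.put(_DONE)

    def estimator():
        # Stage 2: consume features in arrival order and solve the pose.
        while True:
            features = pending.get()
            if features is _DONE:
                break
            results.append(estimate(features))

    threads = [threading.Thread(target=detector), threading.Thread(target=estimator)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results

# Toy stages standing in for the real detection and pose estimation.
print(run_pipeline([1, 2, 3], detect=lambda f: f * 2, estimate=lambda x: x + 1))
```

The bounded queue lets the detector work on the next frame while the estimator solves the previous one without building up a backlog. In CPython, CPU-bound stages share the global interpreter lock, so a real speed-up would depend on the stages releasing it (as native image processing libraries commonly do) or on using processes instead.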

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Hongying Zhang

References

1. Liu, Sun, Wang. Heavyweight airdrop flight control design using feedback linearization and adaptive sliding mode. Trans Inst Meas Control 2016; 38(10): 1155–1164.

2. Desabrais, Riley, Sadeck, et al. Low-cost high-altitude low-opening cargo airdrop systems. J Aircr 2012; 49(1): 349–354.

3. Al-Kaff, Martin, Garcia, et al. Survey of computer vision algorithms and applications for airdrop cargo platform. Expert Syst Appl 2018; 92: 447–463.

4. Kendoul. Survey of advances in guidance, navigation, and control of unmanned rotorcraft systems. J Field Robot 2012; 29(2): 315–378.

5. Meng, Wang, Han, et al. A visual/inertial integrated landing guidance method for UAV landing on the ship. Aerosp Sci Technol 2019; 85: 474–480.

6. Tzoumanikas, Grimm, et al. Fully autonomous micro air vehicle flight and landing on a moving target using visual-inertial estimation and model-predictive control. J Field Robot 2019; 36(1): 49–77.

7. Araar, Aouf, Vitanov. Vision based autonomous landing of multirotor UAV on moving platform. J Intell Robot Syst 2017; 85(2): 369–384.

8. Yang, Sun. A fuzzy complementary Kalman filter based on visual and IMU data for UAV landing. Optik 2018; 173: 279–291.

9. Zhang, Zhai, et al. Infrared-inertial navigation for commercial aircraft precision landing in low visibility and GPS-denied environments. Sensors 2019; 19(2): 408.

10. Wang, Deng. A software platform for vision-based UAV autonomous landing guidance based on markers estimation. Sci China-Technol Sci 2019; 62(10): 1825–1836.

11. Lopez, Garcia, Barea, et al. A multi-sensorial simultaneous localization and mapping (SLAM) system for low-cost micro aerial vehicles in GPS-denied environments. Sensors 2017; 17(4): 802.

12. Meng, Kong, Meng, et al. Camera motion estimation and optimization approach. In: 2019 International conference on advanced mechatronic systems, ICAMechS 2019, Kusatsu, Japan, 26–28 August 2019. pp. 212–217. IEEE Computer Society.

13. Lee, Hwang SS. Elaborate monocular point and line SLAM with robust initialization. In: 17th IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea, 27 October–2 November 2019. Republic of Korea: Institute of Electrical and Electronics Engineers Inc.

14. Yao, et al. Hierarchical line matching based on line-junction-line structure descriptor and local homography estimation. Neurocomputing 2016; 184: 207–220.

15. Zhang, Rui, Yang, et al. LAP-SLAM: a line-assisted point-based monocular VSLAM. Electronics 2019; 8(2): 243.

16. Teng, Luo, et al. Aircraft pose estimation based on geometry structure features and line correspondences. Sensors 2019; 19(9): 2165.

17. Cheng. An efficient and globally optimal method for camera pose estimation using line features. Mach Vis Appl 2020; 31(6): 48.

18. An infrared small target detection method based on local contrast measure and gradient property. In: 2020 Fifth international seminar on computer technology, mechanical and electrical engineering, ISCME 2020, Shenyang, Virtual, China, 30 October–1 November 2020, 2021. IOP Publishing Ltd.

19. Zeng, Huang, et al. A multi-feature fusion slam system attaching semantic invariant to points and lines. Sensors 2021; 21(4): 1196.

20. Zuo, Xie, Liu, et al. Robust visual SLAM with point and line features. In: 2017 IEEE/RSJ international conference on intelligent robots and systems, IROS 2017, Vancouver, BC, Canada, 24–28 September 2017. Institute of Electrical and Electronics Engineers Inc.

21. Pumarola, Vakhitov, Agudo, et al. PL-SLAM: real-time monocular visual SLAM with points and lines. In: 2017 IEEE international conference on robotics and automation, ICRA 2017, Singapore, Singapore, 29 May–3 June 2017. Institute of Electrical and Electronics Engineers Inc.

22. Wang, Wan, et al. Improved point-line feature based visual SLAM method for indoor scenes. Sensors 2018; 18(10): 3559.

23. Zhao, Guo, et al. PL-VIO: tightly-coupled monocular visual-inertial odometry using point and line features. Sensors 2018; 18(4): 1159.

24. Gomez-Ojeda, Gonzalez-Jimenez. Geometric-based line segment tracking for HDR stereo sequences. In: 2018 IEEE/RSJ international conference on intelligent robots and systems, IROS 2018, Madrid, Spain, 1–5 October 2018. Institute of Electrical and Electronics Engineers Inc.

25. Zhou, Zhang, Deng, et al. Improved point-line feature based visual SLAM method for complex environments. Sensors 2021; 21(13): 4604.

26. Grompone von Gioi, Jakubowicz, Morel J-M, et al. LSD: a fast line segment detector with a false detection control. IEEE Trans Pattern Anal Mach Intell 2010; 32(4): 722–732.

27. Zhang, Koch. An efficient and robust line segment matching approach based on LBD descriptor and pairwise geometric consistency. J Vis Commun Image Represent 2013; 24(7): 794–805.

28. Salaun, Marlet, Monasse. The multiscale line segment detector. In: First workshop on reproducible research in pattern recognition, RRPR 2016, Cancun, Mexico, 4 December 2016, 2017. Springer Verlag.

29. Lebeda, Matas, Chum. Fixing the locally optimized RANSAC. In: 2012 23rd British machine vision conference, BMVC 2012, Guildford, Surrey, United Kingdom, 3–7 September 2012. British Machine Vision Association, BMVA.

30. Sattler, Leibe, Kobbelt. SCRAMSAC: improving RANSAC’s efficiency with a spatial consistency filter. In: 12th international conference on computer vision, ICCV 2009, Kyoto, Japan, 29 September–2 October 2009. Institute of Electrical and Electronics Engineers Inc.

31. Rublee, Rabaud, Konolige, et al. ORB: an efficient alternative to SIFT or SURF. In: 2011 IEEE international conference on computer vision, ICCV 2011, Barcelona, Spain, 6–13 November 2011. Institute of Electrical and Electronics Engineers Inc.