Abstract
Deep classification tracking classifies candidate samples as target or background with a classifier that is generally trained on binary labels. However, a binary label only distinguishes samples of different classes and ignores the distinction among samples of the same class, which weakens both the classification and the locating ability. To cope with this problem, this article proposes a soft labeling with quasi-Gaussian structure in place of the binary labeling, which distinguishes samples of different classes and samples of the same class simultaneously. As with the binary label, the signs of the labels for target and background samples are set to plus and minus, respectively, to distinguish samples of different classes. Further, to exploit the difference among samples in the same class, the label values of samples in the same class are designed as a monotonically decreasing quasi-Gaussian function of Intersection over Union (IoU). The corresponding response function is therefore a two-piece, monotonically increasing combination of quasi-Gaussian functions of IoU. Owing to this response function, deep classification tracking trained with the proposed soft labeling achieves better classification and localization performance. To validate this, the proposed soft labeling is integrated into the pipeline of the deep classification tracker SiamFC. Experimental results on the OTB-2015 and VOT benchmarks show that our variant significantly improves on the baseline tracker while maintaining real-time tracking speed and achieves accuracy comparable to recent state-of-the-art trackers.
Introduction
Visual object tracking, one of the most important tasks in many robot applications, has been widely used in fields such as intelligent manufacturing, human–computer interaction, video surveillance, and robotics. 1–3 It is an indispensable part of robots, 4,5 serving as the “eye” through which robots communicate with the world, as shown in Figure 1.

Applications of visual target tracking in robots. (a) Humanoid robot, (b) unmanned surface vessel (USV), and (c) mobile robots with RGB-D cameras.
Visual object tracking mainly contains five components: motion model, feature extractor, observation model, model updater, and ensemble post-processor, of which the feature extractor has the greatest impact on tracking performance. 6 Owing to the powerful representation of deep networks, deep object tracking 7–12 has become a research hotspot and the state of the art in visual tracking. Generally, deep object tracking can be divided into two main categories: deep regression tracking 7,8,13–15 and deep classification tracking. 9–11,16 Deep regression tracking outputs a response map through a regressor that learns a mapping between input deep features and a soft label. Deep classification tracking treats object tracking as a two-class (target versus background) problem based on deep features and classifies samples as target or background through a classifier usually trained with binary labeling. With recent developments, deep classification tracking can now run in real time while preserving tracking performance.
Unlike the Gaussian soft label used in deep regression tracking, the label used in deep classification tracking is the binary label {−1,+1} or {(1,0),(0,1)}. Samples whose Intersection over Union (IoU) values are greater than a threshold are considered target samples, and their labels are set to +1 or (1,0). The others are considered background samples, and their labels are set to −1 or (0,1). Although such a binary label can distinguish samples of different classes, it overlooks the difference among samples in the same class. This drawback makes it difficult for the response map of deep classification tracking to reflect the target location accurately. As shown in Figure 2(a), the deep classification tracker SiamFC 11 trained with the binary labeling can discriminate between target and background samples, but the maximum of its response map does not correspond accurately to the target position, which results in target drift. As tracking proceeds, the drift accumulates and affects subsequent frames. Moreover, neglecting the difference among samples within the same class during training reduces the classification ability of the tracker. As shown in Figure 2(b), SiamFC trained with the binary labeling misjudges target and background samples because of this neglected information.

Search regions and corresponding response score maps of SiamFC trained with the binary labeling and our proposed soft labeling with quasi-Gaussian structure in the CarDark (a) and DragonBaby (b) sequences. The response value of samples in the search region is denoted by a point in the score map with the same color.
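The IoU-thresholded binary labeling described above can be sketched as follows. This is a minimal illustration, not the implementation used in any particular tracker: the `(x, y, width, height)` box format, the function names, and the threshold of 0.5 are assumptions chosen for the example.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x, y, width, height)."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def binary_labels(samples, target, threshold=0.5):
    """+1 for samples whose IoU with the target exceeds the threshold, else -1.

    The threshold value is a hypothetical choice for illustration.
    """
    ious = np.array([iou(s, target) for s in samples])
    return np.where(ious > threshold, 1, -1), ious

target = (50, 50, 40, 40)
samples = [(50, 50, 40, 40), (55, 50, 40, 40), (120, 120, 40, 40)]
labels, ious = binary_labels(samples, target)
```

In this toy case the first two samples receive the same label +1 even though their IoU values differ, which is exactly the within-class information the binary label discards.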
How can a labeling be designed to solve the above problems? IoU characterizes the overlap rate between a sample and the target, so it can represent, to some extent, the probability that a sample is the target. Inspired by this, this article uses IoU values as the design criterion and proposes a novel soft labeling with quasi-Gaussian structure in place of the binary labeling to distinguish samples of different classes and of the same class simultaneously. Deep classification tracking trained with the proposed soft labeling thus achieves better classification and locating ability, as shown in Figure 2.
In the rest of this article, related work is introduced in the second section. The third section describes the proposed soft labeling with quasi-Gaussian structure and applies it to the deep classification tracker SiamFC. Then we compare and analyze the variant with its baseline tracker and the state-of-the-art trackers on popular tracking benchmarks: OTB-2015 and VOT, in the fourth section. Lastly, we conclude this article in the fifth section.
The main contributions of this article are summarized below: (1) A novel soft labeling with quasi-Gaussian structure is proposed to replace the binary labeling and enhance the classification and locating ability of deep classification tracking. The proposed labeling addresses the shortcoming of the binary labeling, which ignores the distinction among samples belonging to the same class. (2) A tracking algorithm is proposed that incorporates the soft labeling with quasi-Gaussian structure into the tracker SiamFC. Compared with the baseline tracker SiamFC, the proposed method achieves significant improvement in both accuracy and robustness on popular benchmarks. (3) Extensive experiments against many state-of-the-art trackers are performed on the OTB-2015 and VOT benchmarks, and the tracking results demonstrate the superiority and efficiency of the proposed tracking algorithm.
Related work
In 2012, AlexNet 17 won the ILSVRC-2012 18 competition and demonstrated the powerful representational capability of deep features. Since then, deep object tracking 7,9,15,16,19 has emerged and brought a leap forward in visual object tracking. Deep object tracking replaces handcrafted features 20,21 with more powerful deep features as the representation and achieves better performance than traditional object tracking. 22–27 According to their nature, deep object trackers can be classified into two main categories: deep regression tracking and deep classification tracking.
Deep regression tracking
Deep regression tracking outputs a response map through a regressor that learns a mapping between input deep features and a soft label. According to the mapping method, deep regression trackers can be mainly divided into DCF-based deep regression trackers, 7,8,13,15,28 deep regression trackers based on convolutional regression networks, 14,29,30 and deep regression trackers based on Siamese networks. 31 DCF-based deep regression trackers directly adopt VGG-M, 32 a convolutional neural network pre-trained on a multi-class classification dataset, as the feature extractor and then output the response map through an online-learned regressor that regresses all circularly shifted versions of the input image to a Gaussian soft label. Deep regression trackers based on convolutional regression networks pre-train the networks end-to-end on a tracking dataset to establish a mapping between the input image and the Gaussian soft label and then fine-tune them online, so that the networks serve as feature extractor and regressor simultaneously. Despite their top performance, these two kinds of trackers cannot run in real time. In contrast, deep regression trackers based on Siamese networks use Siamese networks pre-trained off-line on a tracking dataset as feature extractor and regressor simultaneously and no longer fine-tune the networks during tracking, which allows real-time operation. Although these trackers run at high speed (100 Fps), their performance is not ideal. Overall, existing deep regression tracking cannot achieve a good balance between accuracy and robustness on the one hand and real-time performance on the other.
Deep classification tracking
Deep classification tracking treats object tracking as a two-class (target versus background) problem. It classifies samples as target or background through a classifier usually trained with a binary label. Deep classification tracking mainly includes SVM-based deep classification trackers, 16 deep classification trackers based on multi-domain convolutional neural networks, 9,10,33,34 and deep classification trackers based on Siamese networks. 11,12,35–38 SVM-based deep classification trackers directly adopt R-CNN, 39 a convolutional neural network pre-trained on a multi-class classification dataset, as the feature extractor and classify the samples as target or background through a binary SVM classifier. Unlike SVM-based deep classification trackers, which can hardly benefit from end-to-end training, trackers based on multi-domain convolutional neural networks use the network as feature extractor and binary classifier simultaneously, which makes end-to-end training possible. However, to acquire information about the specific target and scenario, they need to fine-tune the network online, which makes real-time operation difficult. Instead of online fine-tuning, deep classification trackers based on Siamese networks obtain the target-specific information through the Siamese network itself: they map the target and the samples into the same embedding space and then classify samples as target or background by similarity comparison. SINT, 35 an early deep classification tracker based on Siamese networks, has excellent tracking performance, but it is still far from real time because of its fully connected layers and online update.
Distinct from SINT, SiamFC 11 adopts a fully convolutional Siamese network and no longer updates the network online, so its speed (86.5 Fps) ranked first among the deep classification trackers of that time while still guaranteeing a certain tracking accuracy. Therefore, more and more deep classification trackers 12,36–38 have recently been built on SiamFC to achieve high speed while preserving tracking accuracy. In general, with its development, deep classification tracking has been able to achieve a good balance between tracking performance and real-time operation and has achieved state-of-the-art results.
However, we note that the binary labeling for deep classification trackers distinguishes samples of different classes but overlooks the difference among samples within the same class. Neglecting the difference among target samples makes it difficult for their response values to reflect the target position accurately and causes target drift. Moreover, because this information is neglected in the training phase, the classification ability of deep classification tracking weakens and misjudgments arise. To cope with these problems, this article proposes a soft labeling with quasi-Gaussian structure in place of the binary labeling to enhance the classification and locating ability of deep classification trackers. Compared with the binary labeling, the soft labeling with quasi-Gaussian structure adds information about the difference among samples within the same class to the training phase while still accounting for the difference among samples of different classes.
Soft labeling with quasi-Gaussian structure for deep classification tracking
We first describe the problems of the binary labeling and then propose a soft labeling with quasi-Gaussian structure for deep classification tracking. Lastly, we integrate the soft labeling into the pipeline of the deep classification tracker SiamFC to validate it.
Problems in the binary labeling for deep classification tracking
There are two kinds of binary labels for deep classification tracking, namely {−1,+1} and {(1,0),(0,1)}. Deep classification trackers that output only the positive scores of samples 11,12,35–38 generally adopt the {−1,+1} binary label, while those that output a 2-D binary classification score 9,10,33,34 adopt the {(1,0),(0,1)} binary label, as shown in Figure 3(a) and (b). Moreover, as Figure 3(c) shows, these two kinds of binary labels are essentially the same. For simplicity, we adopt the {−1,+1} binary label in the problem description.

(a) Deep classification trackers only outputting positive scores of samples. (b) Deep classification trackers simultaneously outputting positive and negative scores. (c) The conversion method between the deep classification trackers trained with the {−1,+1} and {(1,0),(0,1)} binary label.
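One standard way to see why the two binary label forms are essentially the same is sketched below. It is not necessarily the conversion drawn in Figure 3(c): it assumes that the 2-D output is trained with softmax cross-entropy and that the scalar response is taken as the difference between the positive and negative scores. Under those assumptions the two losses coincide numerically.

```python
import numpy as np

def logistic_loss(y, v):
    """Logistic loss for a {-1,+1} label y and a scalar response v."""
    return float(np.log1p(np.exp(-y * v)))

def softmax_cross_entropy(onehot, scores):
    """Cross-entropy for a one-hot label over (positive, negative) scores."""
    scores = np.asarray(scores, dtype=float)
    log_probs = scores - np.log(np.sum(np.exp(scores)))
    return -float(np.dot(onehot, log_probs))

pos_score, neg_score = 1.3, -0.4   # arbitrary example scores
v = pos_score - neg_score          # assumed scalar response: score difference

# Gap between the two formulations for a target and a background sample.
target_gap = abs(softmax_cross_entropy((1, 0), (pos_score, neg_score))
                 - logistic_loss(+1, v))
background_gap = abs(softmax_cross_entropy((0, 1), (pos_score, neg_score))
                     - logistic_loss(-1, v))
```

Both gaps are zero up to floating-point error, so results derived for the {−1,+1} label carry over to the one-hot form under these assumptions.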
The logistic loss function corresponding to the {−1,+1} binary label is expressed as follows:

$$L(y_i, v_i) = \log\left(1 + \exp(-y_i v_i)\right),$$

where $y_i$ and $v_i$ denote the label value and the response value of the sample $x_i$, respectively. Denoting $t_i = y_i v_i$, theoretical derivation and experiment (see Appendix 1 for details) indicate that $t_i$ will approximately converge to a constant $c$. Hence the response values $v_i$ of target samples and background samples will converge to $c$ and $-c$, respectively.

Diagram of the deep classification trackers trained with the binary labeling.
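The consequence of binary labeling can be checked with a small numerical sketch. It assumes the standard per-sample logistic loss log(1 + exp(−y·v)); the IoU values and responses are hypothetical. Two target samples with different IoU values but the same label and the same current response receive identical gradients, so training has no signal to rank one above the other.

```python
import numpy as np

def logistic_loss(y, v):
    """Per-sample logistic loss log(1 + exp(-y * v))."""
    return np.log1p(np.exp(-y * v))

def logistic_grad_v(y, v):
    """Derivative of log(1 + exp(-y * v)) with respect to the response v."""
    return -y / (1.0 + np.exp(y * v))

ious = np.array([0.9, 0.6])        # two target samples (hypothetical IoU values)
labels = np.array([1.0, 1.0])      # binary labeling gives both the label +1
responses = np.array([0.5, 0.5])   # same current response for both samples

# The gradient depends only on (label, response), never on the IoU value.
grads = logistic_grad_v(labels, responses)
losses = logistic_loss(labels, responses)
```

The IoU array never enters the loss, which is the information gap that an IoU-dependent label is meant to close.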
Soft labeling with quasi-Gaussian structure for deep classification tracking
To overcome the drawbacks of the binary labeling, we propose a soft labeling with quasi-Gaussian structure in its place to enhance the classification and locating ability of deep classification tracking. The proposed soft labeling takes into account the difference among samples of the same class and of different classes simultaneously. As with the binary label, to distinguish samples of different classes, the signs of the labels for positive and negative samples are set to plus and minus, respectively. Further, to exploit the difference among samples in the same class, the label values of different samples belonging to the same class are no longer identical but depend on their IoU values.
As analyzed above,
where
The function curve of soft labeling with the quasi-Gaussian structure and its corresponding response function curve are shown in Figure 5. Like the binary labeling, the response values of target and background samples are always positive and negative respectively so that the difference between them is large enough to distinguish them well. However, different from the binary labeling, this difference between the response values of target and background samples becomes more significant, which will enhance the classification ability of deep classification trackers. More importantly, different from the binary labeling, the response values of samples belonging to the same class are no longer the same, but positively correlated with their IoU values, which makes the target location more accurate.

Function curves of the soft labeling with quasi-Gaussian structure for deep classification tracking (a) and its corresponding response score (b).
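The qualitative behaviour described here can be illustrated with a toy example. The functions below are hypothetical stand-ins and are not the article's labeling equations: they assume that the product of label and response settles near a constant `C`, pick an arbitrary two-piece Gaussian-shaped response that is negative for background and positive for target, and derive the label as `C / response`. The constants `C`, `IOU_THRESHOLD`, and `SIGMA` are illustrative values only.

```python
import numpy as np

C = 1.0               # assumed limit of label * response (hypothetical value)
IOU_THRESHOLD = 0.5   # hypothetical split between background and target
SIGMA = 0.5           # hypothetical width of the Gaussian-shaped bump

def response(iou):
    """Toy two-piece response: negative for background, positive for target,
    and increasing in IoU within each piece."""
    iou = np.asarray(iou, dtype=float)
    # Gaussian bump centred at IoU = 1, hence increasing on [0, 1].
    bump = np.exp(-((iou - 1.0) ** 2) / (2.0 * SIGMA ** 2))
    target = 1.0 + bump        # positive branch
    background = bump - 2.0    # negative branch
    return np.where(iou > IOU_THRESHOLD, target, background)

def soft_label(iou):
    """Label implied by label * response ~ C: it keeps the sign of the
    response and decreases with IoU within each class."""
    return C / response(iou)

ious = np.linspace(0.0, 1.0, 11)
resp = response(ious)
lab = soft_label(ious)
```

With these stand-ins the label is positive for target samples and negative for background samples, decreases with IoU inside each class, and yields a response that increases with IoU inside each class, matching the qualitative description of the curves.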
Figure 6 illustrates deep classification trackers trained with the soft labeling. Unlike the trackers trained with the binary labeling shown in Figure 4, these trackers can exploit the difference among samples of the same class and of different classes simultaneously in the training phase. In the tracking phase, the sample with the maximum IoU value is preferentially regarded as the target, so the tracker can locate the target more accurately. The tracker thus possesses better classification and locating ability. As for tracking speed, we only replace the binary label with the proposed soft labeling in the off-line training phase, which does not affect the amount of computation in the online tracking phase. Therefore, the soft labeling can significantly improve tracking accuracy without affecting tracking speed.

Diagram of deep classification trackers trained with the proposed soft labeling.
SiamFC trained with the soft labeling
The deep classification tracker SiamFC was proposed by Luca Bertinetto et al. in 2016. Because it uses a fully convolutional network and performs no online update, SiamFC was the fastest deep classification tracker at that time. SiamFC casts visual object tracking as a similarity problem in an embedding space learned by a fully convolutional Siamese network. It calculates the similarity between the target image patch and the candidate samples generated by dense sampling and then tracks the object by taking the sample with the highest similarity as the target. In the training phase, SiamFC adopts the {−1,+1} binary label and sets the label values of the samples according to the distances between the samples and the center of the search region, because the IoU values of the samples are negatively correlated with the center distance. Samples are considered positive if they are within the radius R of the center, as shown in Figure 7(a).

(a) The binary labeling for SiamFC. (b) The soft labeling with quasi-Gaussian structure for SiamFC. (c) Response map of SiamFC trained with the binary labeling. (d) Response map of SiamFC trained with the soft labeling.
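The distance-based binary labeling for SiamFC can be sketched as a label map over the response grid. The map size, stride, and radius below are illustrative assumptions, as is the function name; the sketch only shows the rule that positions within radius R of the center (measured in input pixels) are positive and all others negative.

```python
import numpy as np

def radius_label_map(size=17, stride=8, radius=16):
    """Binary label map: +1 where stride * (distance to the map centre)
    is within the radius, -1 elsewhere. All defaults are illustrative."""
    centre = (size - 1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size]
    dist = stride * np.sqrt((ys - centre) ** 2 + (xs - centre) ** 2)
    return np.where(dist <= radius, 1, -1)

labels = radius_label_map()
```

Every position inside the radius gets the same label +1 regardless of how far it is from the center, which is the property the soft labeling changes.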
In order to verify the effectiveness of the proposed soft labeling with quasi-Gaussian structure, we apply it to SiamFC, denoting the variant as SiamFC-label. Since the IoU value of the sample is negatively correlated to the center distance between the sample and the searching region in SiamFC, we set this relationship as
As Figure 7(c) and (d) show, compared with SiamFC, SiamFC-label has the following two advantages: (1) the response values of samples belonging to the same class are no longer the same but positively correlated with their IoU values; (2) the difference among samples of different classes is more significant. Owing to these two advantages, SiamFC-label can locate the target more accurately and classify samples better in the online tracking phase, as Figure 2 shows. Moreover, relative to SiamFC, only the parameter values of the pre-trained network differ in the online tracking phase, so the amount of computation is not affected. Therefore, SiamFC-label can significantly improve tracking accuracy while still running at high real-time speed.
Experiments
In order to evaluate the effectiveness of soft labeling with quasi-Gaussian structure, we compare the SiamFC-label with the baseline tracker and the state-of-the-art trackers on OTB-2015 40 and VOT 41 benchmark datasets. In this section, we firstly introduce the implementation details. Next, we compare the variant SiamFC-label with the baseline tracker on the popular benchmark datasets. Then, we evaluate our proposed method on OTB-2015 and VOT benchmark datasets in comparison with the state-of-the-art trackers. Lastly, we present extensive attribute-based performance analysis to further illustrate the effectiveness of our proposed soft labeling with quasi-Gaussian structure for improving the locating precision and classification ability of the deep classification trackers.
Implementation details
In this article, the experiments are conducted on the popular OTB-2015 and VOT-2016 benchmarks. The OTB-2015 benchmark contains 100 challenging sequences covering various tracking scenarios and challenges. It provides two evaluation metrics: overlap success rate and distance precision (DP). The overlap success plot shows the rate of bounding boxes whose IoU score is larger than a given threshold, and the area under curve (AUC) of this plot is used to rank the trackers. The DP plot shows the DP at different thresholds; usually, the DP at 20 pixels is used to rank the trackers. On the OTB-2015 benchmark, all trackers are evaluated with one-pass evaluation (OPE). The VOT-2016 benchmark, the fourth VOT challenge, includes 60 sequences. The expected average overlap (EAO), accuracy, robustness, average overlap (AO), and equivalent filter operations (EFO) are used to evaluate trackers on VOT-2016. The main metric, EAO, reflects the overall performance of the trackers.
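The two OTB-style metrics can be sketched as below. This is a simplified illustration rather than the benchmark toolkit: the function names are hypothetical, the per-frame errors and overlaps are made-up inputs, and the AUC is approximated as the mean success rate over 21 evenly spaced IoU thresholds, which is an assumption about the sampling.

```python
import numpy as np

def distance_precision(center_errors, threshold=20.0):
    """Fraction of frames whose centre location error (pixels) is within the threshold."""
    errors = np.asarray(center_errors, dtype=float)
    return float(np.mean(errors <= threshold))

def success_auc(ious, thresholds=None):
    """Mean success rate over IoU thresholds, approximating the area under
    the overlap success plot."""
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 21)  # assumed threshold sampling
    ious = np.asarray(ious, dtype=float)
    rates = [float(np.mean(ious > t)) for t in thresholds]
    return float(np.mean(rates))

# Hypothetical per-frame results for a four-frame sequence.
errors = [3.0, 12.0, 25.0, 40.0]
overlaps = [0.8, 0.6, 0.3, 0.0]
dp20 = distance_precision(errors)   # two of four frames are within 20 pixels
auc = success_auc(overlaps)
```

Lowering `threshold` in `distance_precision` gives the stricter precision values used when locating accuracy is examined at thresholds below 20 pixels.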
Our tracker is implemented in MATLAB using MatConvNet. 42 SiamFC with three scales is selected as the baseline tracker since this version runs faster than the one with five scales and performs only slightly worse. We set the parameters of the soft labeling with quasi-Gaussian structure in equation (5) as listed in Table 1. The means of the Gaussian distributions are set to 1 to satisfy the first constraint in equation (4), so that the response values of samples belonging to the same class are no longer the same but positively correlated with their IoU values. To make the difference between the response values of different samples belonging to the same class appropriate, we set the standard deviations of the Gaussian distributions to 0.5 times the response map size. Then the values of the scale factors are set as
Parameters of the soft labeling with quasi-Gaussian structure for SiamFC.
Comparisons with baseline trackers
For a more comprehensive evaluation of the proposed soft labeling with quasi-Gaussian structure, we compare SiamFC-label with its baseline tracker on the OTB-2015 and VOT-2016 benchmarks. Note that SiamFC 11 provides two tracking models, denoted SiamFC-color and SiamFC-colorgray in this article. The difference between them is that SiamFC-colorgray converts 25% of the pairs to grayscale in the training phase to handle gray videos. We replace the binary labeling of these two trackers with the proposed soft labeling in the training phase and denote the variants SiamFC-label-color and SiamFC-label-colorgray, respectively.
SiamFC-label-color differs from SiamFC-color only in its label; all other hyper-parameters are the same. The results in Figure 8 indicate that SiamFC-label-color achieves overall improvements of 1.8% and 1.9% over SiamFC-color in the precision and success metrics, respectively, on the OTB-2015 benchmark. Moreover, SiamFC-label-color performs better than SiamFC-colorgray in both metrics, even without the trick for handling gray videos.

Precision and success plots of SiamFC-label-color, SiamFC-label-colorgray, and the baseline trackers using OPE on the OTB-2015 benchmark. OPE: one-pass evaluation.
To maximize the improvement brought by the proposed soft labeling, we make appropriate adjustments to the hyper-parameters and adopt a different trick for handling gray videos in SiamFC-label-colorgray. (1) Hyper-parameters: As described in the “Soft labeling with quasi-Gaussian structure for deep classification tracking” section, the soft labeling makes the difference between response values for different classes more significant, which is more conducive to classifying samples but slows convergence. Thus, instead of the 50 training epochs used in SiamFC, 11 we train for two more epochs, 52 in total. For the first 50 epochs, the learning rate is decayed geometrically after each epoch from $10^{-2}$ to $10^{-5}$, consistent with SiamFC, 11 while the learning rates of the last two epochs are 9.3260e−06 and 8.1113e−06, respectively. (2) The trick for handling gray videos: We adopt the trick from SiamFC-tri 38 of re-training a dedicated gray network with all pairs converted to grayscale, instead of the trick in SiamFC. 11 For this gray network, we only convert all pairs to grayscale, while the other training hyper-parameters are the same as for the color network. As shown in Figure 8, compared with SiamFC-color, SiamFC-label-colorgray achieves 3.5% and 2.7% improvement in the precision and success metrics, respectively. Further, SiamFC-label-colorgray achieves overall 2% and 0.8% improvement in the precision and success metrics, respectively, in comparison with SiamFC-colorgray.
In addition, we take SiamFC-label-color as the representative of SiamFC-label for comparison with the baseline tracker SiamFC on the VOT-2016 benchmark. As shown in Table 2 and Figure 9, SiamFC-label(-color) performs more favorably than the baseline in terms of EAO, accuracy, robustness, and AO, while operating at almost the same frame rate as SiamFC (86.3 Fps vs. 86.5 Fps).
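The 52-epoch learning-rate schedule can be written down as a short sketch. It assumes that geometric decay means log-uniform spacing with one value per epoch, which is what `np.logspace` produces; the two final values are copied from the text rather than derived, and the variable names are illustrative.

```python
import numpy as np

# Epochs 1-50: geometric decay from 1e-2 to 1e-5 (assumed log-uniform spacing).
base_schedule = np.logspace(-2, -5, num=50)

# Epochs 51-52: the two additional learning rates quoted in the text.
extra_epochs = np.array([9.3260e-06, 8.1113e-06])

learning_rates = np.concatenate([base_schedule, extra_epochs])
decay_ratio = base_schedule[1] / base_schedule[0]  # constant per-epoch factor
```

Under this assumption the schedule stays strictly decreasing across all 52 epochs, so the two extra epochs continue the decay rather than restart it.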

EAO plots of SiamFC-label, the baseline tracker, and the state-of-the-art trackers on VOT-2016 benchmark. EAO: expected average overlap.
Overall performance comparison on VOT-2016 benchmark.
EAO: expected average overlap; AO: average overlap; EFO: equivalent filter operations.
The bold values represent the performance of our method.
Comparisons with state-of-the-art trackers
We compare SiamFC-label-color and SiamFC-label-colorgray with state-of-the-art trackers using OPE with the DP and overlap success metrics of the OTB-2015 benchmark. The compared trackers mainly include LCT, 43 KCF, 44 SRDCF, 45 SAMF, 46 DSST, 47 MEEM, 48 and CFNet. 36 As shown in Figure 10, SiamFC-label-colorgray and SiamFC-label-color achieve the first and fourth best DP (79.1% and 77.4%), respectively, and the second and third best success scores (59.0% and 58.2%). Although SiamFC-label-colorgray and SiamFC-label-color rank slightly lower than SRDCF in the success metric, they run much faster (85.7 Fps and 86.3 Fps) than SRDCF (5 Fps), as shown in Table 3.

Precision and success plots of SiamFC-label-color, SiamFC-label-colorgray, and the state-of-the-art trackers using OPE on the OTB-2015 benchmark. OPE: one-pass evaluation.
Overall performance on the OTB-2015 in comparison to the state-of-the-art trackers.a
DP: distance precision; AUC: area under curve.
a Red is the best, blue is the second, green is the third. DP indicates the representative DPs at 20 pixels for precision plots while AUC indicates the AUC of success plots
Furthermore, experiments on the VOT-2016 benchmark are performed against state-of-the-art trackers, which mainly include MDNet_N, 9 DPT, 49 SiamFC, 11 deepMKCF, 50 DAT, 51 KCF, 44 SAMF, 46 and DSST. 47 As shown in Figure 9 and Table 2, SiamFC-label(-color) performs comparably with the state-of-the-art trackers in terms of EAO, ranking second on the VOT-2016 benchmark. In particular, SiamFC-label(-color) achieves the best accuracy among all compared trackers.
Attribute-based performance analysis
Extensive performance analysis on the locating precision and classification ability is presented to further illustrate the effectiveness of the proposed soft labeling. As with the experiments in the “Comparisons with state-of-the-art trackers” section, we select SiamFC-color as the baseline tracker and compare SiamFC-label-color with SiamFC-color and SiamFC-colorgray on the OTB-2015 dataset to rule out other interference factors.
Locating ability improvement: We select 2, 4, 6, 8, and 10 pixels instead of 20 pixels as the threshold of the precision metric and then compare the overall performance of SiamFC-label-color, SiamFC-color, and SiamFC-colorgray. Figure 11 presents the locating precision improvement percentage of SiamFC-label-color over SiamFC-color and SiamFC-colorgray at different thresholds. The results indicate that the smaller the threshold (i.e. the higher the required locating precision), the larger the improvement percentage SiamFC-label-color achieves. This further illustrates that the proposed soft labeling can enhance the locating ability of deep classification trackers.

The precision improvement percentage of SiamFC-label-color in comparison to the baseline trackers at different thresholds.
More specifically, experiments on the Car4 sequence are presented in Figure 12 to demonstrate the improvement in locating ability brought by the soft labeling. The location error of SiamFC-label-color is, overall, lower than that of SiamFC-color and SiamFC-colorgray, which indicates that SiamFC-label-color locates the target more accurately than either baseline.

The location errors of SiamFC-label-color in comparison to SiamFC-color (a) and SiamFC-colorgray (b) on Car4 sequence.
Classification ability improvement: Besides the locating ability, the proposed quasi-Gaussian combination soft label can also enhance the classification ability, because important information about the difference among samples in the same class is added in the training phase. Qualitative results on four sequences are presented in Figure 13, where SiamFC-color and SiamFC-colorgray both fail when the targets undergo large appearance changes, whereas SiamFC-label-color tracks them robustly.

Qualitative results comparing SiamFC-label-color with the baseline trackers on six challenging sequences in the OTB-2015 benchmark (from top to bottom: Dragonbaby, Box, Girl2, Bolt2, Human4, and Matrix_1).
Conclusions
In this article, we revisit the binary labeling for deep classification trackers and identify its problems through theoretical and experimental analysis. To solve these problems, we propose a soft labeling with quasi-Gaussian structure in place of the binary labeling to enhance the classification and locating ability of deep classification tracking; it takes into account the difference among samples of the same class and of different classes simultaneously. To verify the effectiveness of the proposed soft labeling, we apply it to the deep classification tracker SiamFC and compare the variant with its baseline tracker and with state-of-the-art trackers on the OTB-2015 and VOT benchmarks. We also present an extensive attribute-based performance analysis to further illustrate the validity of the proposed soft labeling. Beyond SiamFC, the proposed soft labeling with quasi-Gaussian structure can be applied to other deep classification tracking algorithms, which we leave as future work. Moreover, in real-world applications such as robots and unmanned surface vessels (USVs), the proposed method can provide more precise and robust tracking.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project is supported by the Key Projects of the National Natural Science Foundation of China (No. 91648119), the National Nature Science Foundation of China (No. 61673254), and the National Nature Science Foundation of China (No. U1613226). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
Appendix 1
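The logistic loss function $L$ with respect to $t_i = y_i v_i$ is

$$L(t_i) = \log\left(1 + \exp(-t_i)\right).$$

We observe that this loss function has the following two significant characteristics: (1) the first derivative with respect to $t_i$ is always less than 0, that is,

$$\frac{\partial L}{\partial t_i} = -\frac{1}{1 + \exp(t_i)} < 0;$$

(2) the second derivative with respect to $t_i$ is always greater than 0, that is,

$$\frac{\partial^2 L}{\partial t_i^2} = \frac{\exp(t_i)}{\left(1 + \exp(t_i)\right)^2} > 0.$$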
The function of the gradient descent is expressed as following
where
Further, since the first derivative about ti is monotonically increasing with respect to ti and is always less than 0, then
Thus, the absolute value of the first derivative about
Equation (1F) indicates that $t_i$ will gradually converge to a constant $c$ until
Moreover, to further validate this theoretical derivation, we conduct experiments on the convergence of gradient descent from different initial values. As Figure 1A shows, $t_i$ will converge to a constant
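The convergence behaviour can be reproduced with a toy numerical experiment. This is a simplified sketch, not the experiment behind Figure 1A: it assumes the loss log(1 + exp(−t)) and applies plain gradient descent directly to t, whereas in training the update acts on the network parameters. The step size, iteration count, and initial values are hypothetical.

```python
import math

def grad(t):
    """Derivative of L(t) = log(1 + exp(-t)) with respect to t."""
    return -1.0 / (1.0 + math.exp(t))

def run_gradient_descent(t0, step=0.1, iters=10000):
    """Plain gradient descent on t from the initial value t0."""
    t = float(t0)
    for _ in range(iters):
        t -= step * grad(t)
    return t

# Final values reached from three different starting points.
finals = [run_gradient_descent(t0) for t0 in (-2.0, 0.0, 2.0)]
spread = max(finals) - min(finals)
```

In this toy setting the gradient is always negative and shrinks as t grows, so the updates become very small and runs from different starting points end at nearly the same value after a fixed number of steps, consistent with the approximate convergence to a constant described above.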
