Abstract
In order to improve the performance of semi-supervised learning, a kind of safe semi-supervised classification algorithm based active learning sampling strategy is proposed. First, an active learning sampling method based on uncertainty and representativenes is designed. The weighted algorithm combining the uncertainty and representativenesss is used to select the unlabeled samples with rich information and representation, providing for semi-supervised learning. Second, a method of label prediction based on grouping verification is designed. Prelabeling is executed on unlabeled sample selected by active learning. The sample with pseudo-label is added into the labeled sample set to carry out grouping, training and testing. The corresponding errors of various pseudo-labels are calculated and the pseudo-label making the accuracy least is selected as the candidate label of the unlabeled sample. Third, a method of security verification is designed. Only the label making the accuracy lower than before is selected as the final label of the unlabeled sample to expand the number of labeled samples. Iterations are repeatedly executed until a certain precision is met. Finally, the classifier is trained using the final labeled set. The experiments are carried out on semi-supervised datasets and UCI datasets, and the results show that the proposed algorithms are effective.
Get full access to this article
View all access options for this article.
