Abstract
Recent advances in deep learning have drawn significant attention to the problem of deep clustering. This paper presents a deep neural model, SCDC (Soft-Center loss for Deep Clustering), built on an autoencoder architecture and designed primarily for data clustering. To this end, a novel loss function, termed soft-center loss, is introduced to drive the training process. The new objective is closely related to the K-means loss function and promotes clustering-specific features in the representation space. Additionally, the model is regularized using a reconstruction loss and enhanced with a clustering-oriented loss. Furthermore, our investigation is linked to the problem of feature quantization and representation, with the aim of efficiently supporting approximate nearest neighbor (ANN) search. To achieve this, we present a general pipeline comprising model training, codebook generation, feature quantization, and searching. Notably, we conduct extensive visual analytics on the learned representations and compact codebooks to assess the discrimination capability of the proposed model. Experimental results show that SCDC is competitive with many modern clustering models on several benchmark datasets and delivers high-quality coding for ANN search.
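As a rough illustration of the codebook generation, feature quantization, and searching stages named in the abstract, the sketch below uses plain K-means (Lloyd's algorithm) on already-extracted feature vectors. It is not the SCDC model: the autoencoder and the soft-center loss are not specified in this abstract, so ordinary K-means stands in for the learned clustering. The function names `build_codebook`, `quantize`, and `search` are hypothetical and chosen only for this example.

```python
import numpy as np


def build_codebook(features, k, iters=20, seed=0):
    """Plain K-means (an assumed stand-in for the trained model); returns (k, d) codewords."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from k distinct data points.
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every point to every codeword: shape (n, k).
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)
        for j in range(k):
            members = features[assign == j]
            if len(members):  # leave an empty cluster's codeword where it is
                centers[j] = members.mean(0)
    return centers


def quantize(features, codebook):
    """Encode each feature vector as the index of its nearest codeword."""
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.argmin(1)


def search(query, codebook, codes, top=5):
    """Approximate search: rank database items by the distance from the
    query to each item's codeword instead of to the item itself."""
    d2_to_codewords = ((codebook - query) ** 2).sum(-1)  # shape (k,)
    approx = d2_to_codewords[codes]                      # shape (n,)
    return np.argsort(approx, kind="stable")[:top]


if __name__ == "__main__":
    # Toy database: two tight groups of 2-D feature vectors.
    feats = np.array([[0.0, 0.0]] * 5 + [[10.0, 10.0]] * 5)
    codebook = build_codebook(feats, k=2)
    codes = quantize(feats, codebook)
    print(search(np.array([9.0, 9.0]), codebook, codes, top=5))
```

Only the compact codes and the small codebook are needed at query time, which is what makes this style of search cheap. The price is that items sharing a codeword become indistinguishable, so the quality of the codebook bounds the quality of the ranking.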
