Distributed matrix completion for large-scale multi-label classification

Abstract

Large-scale multi-label classification has long been of interest to researchers. The main difficulty is the volume of data that must be processed, possibly in multiple passes. Such data does not fit in the memory of a single computer, which is the bottleneck for many large-scale applications. Matrix completion, on the other hand, is a useful tool in many applications, including classification: it can model the data and identify outliers and noise within it. In this paper, we develop a distributed matrix completion method for multi-label classification. We first propose a simple distributed algorithm that minimizes the nuclear norm of a matrix to recover its low-rank representation, and then generalize it to the classification problem. Several synthetic and real datasets are used to verify both the distributed nuclear norm minimization and the distributed matrix completion approach. The results indicate that the proposed algorithm outperforms state-of-the-art methods for large-scale classification.

Keywords

Matrix completion; multi-label classification; distributed optimization; alternating direction method; convex optimization
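
For readers unfamiliar with nuclear norm minimization, the sketch below illustrates the standard single-machine building block: the nuclear norm is the sum of a matrix's singular values, and its proximal operator (singular value thresholding) shrinks each singular value toward zero, which lowers the rank. This is not the distributed algorithm proposed in the paper, whose details are not given in the abstract. The function names `nuclear_norm` and `singular_value_threshold` and the threshold `tau` are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def nuclear_norm(matrix):
    """Return the nuclear norm: the sum of the singular values."""
    return float(np.linalg.svd(matrix, compute_uv=False).sum())


def singular_value_threshold(matrix, tau):
    """Proximal operator of tau * nuclear norm (illustrative helper).

    Each singular value is reduced by tau and clipped at zero, so
    small singular values vanish and the result has lower rank.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of u by the shrunk singular values, then recombine.
    return (u * s_shrunk) @ vt


# Small example: a diagonal matrix with singular values 3 and 1.
example = np.diag([3.0, 1.0])
shrunk = singular_value_threshold(example, tau=2.0)
```

Methods in the alternating direction family commonly apply a thresholding step of this kind once per iteration; how the paper splits that work across machines is not specified here.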
