Abstract
Non-convex optimization problems with multiple local optima are frequently encountered in machine learning. The graduated optimization algorithm (GOA) is a popular method for obtaining the global optima of non-convex problems: it minimizes a sequence of locally strongly convex functions that smooth the original non-convex problem and approximate it increasingly closely. Recently, GradOpt, a GOA-based algorithm, has demonstrated remarkable theoretical and experimental results. However, when a problem consists of both convex and non-convex parts, GradOpt treats the entire objective function as a single non-convex function, which leads to significant gaps between the smoothed and original functions. In this study, we propose two new algorithms: SVRG-GOA and PSVRG-GOA. They gradually smooth the non-convex part of the problem and then minimize the smoothed function using either the stochastic variance reduced gradient (SVRG) method or the proximal SVRG (Prox-SVRG) method. Both algorithms are proven to have lower iteration complexity, O(1/ε), than GradOpt, O(1/ε²). Several techniques, such as a larger shrinkage factor, a projection step, stochastic gradients, and mini-batching, are also applied to accelerate the convergence of the proposed algorithms. Experimental results illustrate that the two new algorithms, which perform similarly to each other, converge to the "global" optima of a non-convex problem comparatively faster than GradOpt or non-convex Prox-SVRG.
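The idea of smoothing the non-convex part and minimizing each smoothed function with SVRG can be illustrated with a minimal one-dimensional sketch. This is not the paper's SVRG-GOA: the toy data (`CURV`, `CENTER`), the sinusoidal non-convex term, the closed-form uniform smoothing, and all step sizes and stage counts are assumptions chosen only so the example runs without extra dependencies.

```python
import math
import random

# Hypothetical toy problem (not from the paper):
#   convex part      f_i(x) = 0.5 * c_i * (x - a_i)^2, averaged over i
#   non-convex part  h(x)   = AMP * sin(FREQ * x)
CURV = [0.25, 1.0, 2.25]
CENTER = [1.8, 0.0, -0.2]
AMP, FREQ = 2.0, 3.0


def smoothed_h_grad(x, delta):
    """Gradient of E_u[h(x + delta * u)] with u ~ Uniform[-1, 1].

    For a sinusoid this smoothing has a closed form: the amplitude is
    damped by sin(FREQ * delta) / (FREQ * delta). delta == 0 means no smoothing.
    """
    damp = 1.0 if delta == 0 else math.sin(FREQ * delta) / (FREQ * delta)
    return AMP * FREQ * damp * math.cos(FREQ * x)


def grad_i(i, x, delta):
    # Gradient of the i-th component: convex term plus smoothed non-convex term.
    return CURV[i] * (x - CENTER[i]) + smoothed_h_grad(x, delta)


def full_grad(x, delta):
    n = len(CURV)
    return sum(grad_i(i, x, delta) for i in range(n)) / n


def svrg(x, delta, rng, step=0.02, epochs=10, inner=50):
    """Plain SVRG on the smoothed objective for a fixed smoothing radius."""
    for _ in range(epochs):
        snapshot, mu = x, full_grad(x, delta)  # full gradient at the snapshot
        for _ in range(inner):
            i = rng.randrange(len(CURV))
            # Variance-reduced stochastic gradient.
            v = grad_i(i, x, delta) - grad_i(i, snapshot, delta) + mu
            x -= step * v
    return x


def graduated_svrg(x0, delta0=1.0, shrink=0.5, stages=6, seed=0):
    """Run SVRG on a sequence of smoothed problems with shrinking radius."""
    rng = random.Random(seed)
    x, delta = x0, delta0
    for _ in range(stages):
        x = svrg(x, delta, rng)  # warm-start from the previous stage
        delta *= shrink
    return x


def objective(x):
    n = len(CURV)
    convex = sum(0.5 * c * (x - a) ** 2 for c, a in zip(CURV, CENTER)) / n
    return convex + AMP * math.sin(FREQ * x)


x_grad = graduated_svrg(3.0)                 # graduated smoothing + SVRG
x_plain = svrg(3.0, 0.0, random.Random(0))   # SVRG on the unsmoothed objective
```

Comparing `objective(x_grad)` with `objective(x_plain)` shows the effect the abstract describes: on this toy problem, starting from the same point, the unsmoothed run settles in a nearby local minimum while the graduated run follows the shrinking smoothing toward the lowest basin.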
