Accelerated first-order optimization methods using inertia and error bounds

by Patrick Royce Johnstone

Institution: University of Illinois Urbana-Champaign
Year: 2017
Keywords: Optimization; Convergence analysis; Accelerated first-order methods; Subgradient methods
Posted: 02/01/2018
Record ID: 2155164
Full text PDF: http://hdl.handle.net/2142/97298


Optimization is an important discipline of applied mathematics with far-reaching applications. Optimization algorithms often form the backbone of practical systems in machine learning, image processing, signal processing, computer vision, data analysis, and statistics. In an age of massive data sets and huge numbers of variables, a deep understanding of optimization is a necessary condition for developing scalable, computationally inexpensive, and reliable algorithms. In this thesis we design and analyze efficient algorithms for solving the large-scale nonsmooth optimization problems arising in modern signal processing and machine learning applications. The focus is on first-order methods which have low per-iteration complexity and can exploit problem structure to a high degree. First-order methods have the capacity to address large-scale problems for which all alternative methods fail. However, first-order methods can take many iterations to reach the desired accuracy. This has led optimization researchers to ask the following question: is it possible to improve the convergence rate of first-order methods without jeopardizing their low per-iteration complexity? In this thesis, we address this question in three areas. Firstly we investigate the use of inertia to accelerate the convergence of proximal gradient methods for convex composite optimization problems. We pay special attention to the famous lasso problem for which we develop an improved version of the well-known Fast Iterative Soft-Thresholding Algorithm. Secondly we investigate the use of inertia for nonconvex composite problems, making use of the Kurdukya-Lojaziewicz inequality in our analysis. Finally, when the objective function satisfies an error bound which is fairly common in practice, we develop stepsize selections for the subgradient method which significantly outperform the classical approach. The overarching message of this thesis is the following: with careful analysis and design, the convergence rate of first-order methods can be significantly improved.Advisors/Committee Members: Moulin, Pierre (advisor), Moulin, Pierre (Committee Chair), Bresler, Yoram (committee member), He, Niao (committee member), Nedich, Angelia (committee member), Srikant, Rayadurgam (committee member).