Proximal Envelopes

June 11, 2018, 9:40 AM - 10:10 AM



Rutgers University

CoRE Building

96 Frelinghuysen Road

Piscataway, NJ 08854


Panos Patrinos, Katholieke Universiteit Leuven

Proximal algorithms are ubiquitous tools in optimization, as they decompose a complex minimization problem into simpler subproblems that scale gracefully with the problem size. This class of algorithms includes, but is not limited to, the proximal gradient method, Douglas-Rachford splitting, and ADMM, all extensively used for a plethora of applications in engineering and computer science such as control, signal processing, image analysis, and machine learning. Splitting algorithms are also particularly well suited for embedded applications, owing to the simplicity of their operations and their amenability to parallelization. The underlying mechanisms ensuring convergence of such methods are well understood when applied to convex problems, where the theory hinges on fixed-point and monotone operator theory, Fejér monotonicity, duality, and so on.
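As one concrete member of this class, the proximal gradient method applied to the lasso problem min 0.5‖Ax − b‖² + λ‖x‖₁ can be sketched as below (a minimal NumPy illustration, not code from the talk; the prox of the ℓ1 norm is soft-thresholding):

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1 (componentwise soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def proximal_gradient_lasso(A, b, lam, gamma, iters=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by proximal gradient (ISTA).

    gamma is the step size; convergence requires gamma <= 1 / ||A^T A||.
    """
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                           # gradient of the smooth part
        x = soft_threshold(x - gamma * grad, gamma * lam)  # prox step on the nonsmooth part
    return x
```

Each iteration only needs a gradient of the smooth term and a cheap prox of the nonsmooth term, which is what makes the subproblems scale gracefully.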

Recently, there has been ever-increasing interest in addressing large-scale, structured nonconvex problems. However, only partial convergence results for proximal algorithms are available in the nonconvex setting, and the theory as a whole lacks a unifying framework. Furthermore, despite the low cost of each iteration, even in the simpler convex case the high sensitivity to ill-conditioning of the problem oftentimes results in prohibitively slow convergence rates.

In response to these issues we introduce "proximal envelopes", a generalization of the Moreau envelope and of its connections with proximal point iterations. Proximal envelopes are continuous, exact, real-valued "Lyapunov" functions for the original problem, with a two-fold value. On the one hand, they serve as a theoretical tool for establishing convergence of splitting algorithms. On the other hand, they can be used to robustify and greatly accelerate convergence at no additional cost beyond scalar products, for instance via a suitable line-search strategy along (limited-memory) BFGS directions. The resulting method is globally convergent and preserves the simplicity of the original splitting algorithm.
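For intuition on the base construction being generalized: the classical Moreau envelope g_γ(z) = min_u { g(u) + ‖u − z‖²/(2γ) } replaces g by a real-valued smooth surrogate with the same minimizers, and it is evaluated through the proximal operator itself. A minimal sketch (my illustration, not the talk's definition) for g = ‖·‖₁, whose Moreau envelope is the Huber function:

```python
import numpy as np

def moreau_envelope_l1(z, gamma):
    """Moreau envelope of g = ||.||_1 at the point z:
        g_gamma(z) = min_u ||u||_1 + ||u - z||^2 / (2*gamma).
    The minimizer u is prox_{gamma*||.||_1}(z), i.e. soft-thresholding;
    substituting it back gives the (smooth) Huber function, a lower
    approximation of ||.||_1 that shares its minimizer at the origin.
    """
    u = np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)  # prox of gamma*||.||_1
    return np.sum(np.abs(u)) + np.sum((u - z) ** 2) / (2 * gamma)
```

In closed form this equals z²/(2γ) for |z| ≤ γ and |z| − γ/2 otherwise (componentwise), which makes the smoothing effect explicit.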


Slides     Video