An Adaptive Alternating Direction Method of Multipliers

Journal of Optimization Theory and Applications
https://doi.org/10.1007/s10957-022-02098-9

Sedi Bartz¹ · Rubén Campoy² · Hung M. Phan¹

Received: 31 December 2021 / Accepted: 19 August 2022
© The Author(s) 2022

¹ Department of Mathematical Sciences, Kennedy College of Sciences, University of Massachusetts Lowell, Lowell, MA, USA
² Department of Statistics and Operational Research, Universitat de València, Valencia, Spain

Communicated by Heinz Bauschke.

Abstract

The alternating direction method of multipliers (ADMM) is a powerful splitting algorithm for linearly constrained convex optimization problems. In view of its popularity and applicability, a growing attention is drawn toward the ADMM in nonconvex settings. Recent studies of minimization problems for nonconvex functions include various combinations of assumptions on the objective function including, in particular, a Lipschitz gradient assumption. We consider the case where the objective is the sum of a strongly convex function and a weakly convex function. To this end, we present and study an adaptive version of the ADMM which incorporates generalized notions of convexity and penalty parameters adapted to the convexity constants of the functions. We prove convergence of the scheme under natural assumptions. To this end, we employ the recent adaptive Douglas–Rachford algorithm by revisiting the well-known duality relation between the classical ADMM and the Douglas–Rachford splitting algorithm, generalizing this connection to our setting. We illustrate our approach by relating and comparing to alternatives, and by numerical experiments on a signal denoising problem.

Keywords Alternating direction method of multipliers · Douglas–Rachford algorithm · Weakly convex function · Comonotonicity · Signal denoising · Firm thresholding

Mathematics Subject Classification 47H05 · 47N10 · 47J25 · 49M27 · 65K15

1 Introduction

By now, the alternating direction method of multipliers (ADMM) is a well-studied and applied splitting algorithm. In particular, it is applied to the problem
\[
\min\ f(x) + g(z) \quad \text{s.t.} \quad Mx = z,\ x \in \mathbb{R}^n,\ z \in \mathbb{R}^m; \tag{P}
\]
where $f : \mathbb{R}^n \to \,]-\infty, +\infty]$ and $g : \mathbb{R}^m \to \,]-\infty, +\infty]$ are proper, lower semicontinuous and convex functions, and $M \in \mathbb{R}^{m \times n}$. The ADMM can be traced back to 1975 in the studies of Glowinski and Marroco [26], and of Gabay and Mercier [23]. It was revisited in the early 1980s in [21, 22]. The ADMM has been successfully applied to a wide range of statistical and learning problems such as sparse regression, signal and image processing, and support vector machines, to name a few. An extensive survey on the ADMM and its applications can be found in [10].

The ADMM can be viewed as an enhanced version of the method of multipliers in the case where the objective function is separable. The augmented Lagrangian associated with (P) is the function
\[
L_\gamma(x, z, y) = f(x) + g(z) + \langle y, Mx - z \rangle + \frac{\gamma}{2}\,\|Mx - z\|^2, \tag{1}
\]
where $\gamma \geq 0$ is the penalty parameter and $y \in \mathbb{R}^m$ is the Lagrange multiplier. By employing the method of multipliers, one solves (P) by iteratively minimizing $L_\gamma(x, z, y)$ over the (primal) variables $x$ and $z$ while updating the Lagrange multiplier $y$ (the dual variable). However, this requires to minimize the Lagrangian jointly in $x$ and $z$. In order to avoid this situation, the ADMM takes advantage of the separability of the objective function and splits the minimization procedure into two separate steps, one for each variable.
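As a minimal sketch of this alternating scheme, consider a toy instance of (P) in which M is the identity, f(x) = ½‖x − b‖² and g(z) = λ‖z‖₁. These choices, the function names `soft_threshold` and `admm_l1_denoise`, the default penalty parameter, and the fixed iteration count are illustrative assumptions and are not taken from the paper. The code implements the classical convex ADMM, not the adaptive variant studied in the paper. With these choices each minimization of the augmented Lagrangian has a closed form, so no inner solver is needed.

```python
def soft_threshold(v, t):
    """Proximal map of t*|.|: shrink the scalar v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def admm_l1_denoise(b, lam, gamma=1.0, iters=200):
    """Classical ADMM for  min 0.5*||x - b||^2 + lam*||z||_1  s.t.  x = z.

    Illustrative assumption: M is the identity, so all updates are elementwise.
    """
    n = len(b)
    x = [0.0] * n
    z = [0.0] * n
    y = [0.0] * n  # Lagrange multiplier (dual variable)
    for _ in range(iters):
        # x-step: minimize the augmented Lagrangian in x with z and y fixed.
        # Setting (x - b) + y + gamma*(x - z) = 0 gives the formula below.
        x = [(bi - yi + gamma * zi) / (1.0 + gamma)
             for bi, yi, zi in zip(b, y, z)]
        # z-step: minimize in z with the new x and the old y fixed.
        # Completing the square reduces this to a soft-thresholding step.
        z = [soft_threshold(xi + yi / gamma, lam / gamma)
             for xi, yi in zip(x, y)]
        # Multiplier step: move y along the constraint residual x - z.
        y = [yi + gamma * (xi - zi) for xi, yi, zi in zip(x, y, z)]
    return x, z, y


if __name__ == "__main__":
    x, z, y = admm_l1_denoise([3.0, 0.5, -2.0], lam=1.0)
    print(x, z, y)
```

For this separable toy problem the minimizer is known in closed form (soft thresholding of b at level λ), so the iterates for b = (3, 0.5, −2) and λ = 1 should approach (2, 0, −1), which gives a simple sanity check on the three updates.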
Specifically, by fixing a positive penalty parameter $\gamma$, the iterative step of the ADMM for solving (P) is
\begin{align}
x^{k+1} &= \operatorname*{argmin}_{x \in \mathbb{R}^n} L_\gamma(x, z^k, y^k), \tag{2a} \\
z^{k+1} &= \operatorname*{argmin}_{z \in \mathbb{R}^m} L_\gamma(x^{k+1}, z, y^k), \tag{2b} \\
y^{k+1} &= y^k + \gamma\,(M x^{k+1} - z^{k+1}). \tag{2c}
\end{align}
Convergence of this scheme is well established in the case where $f$ and $g$ are convex, see, e.g., [10, § 3.2]. In nonconvex cases, it has been studied in, e.g., [28, 30, 41, 42], under various combinations of assumptions which include, in particular, a Lipschitz continuity assumption on the gradient of $f$ and/or $g$.

In the present study, we consider the case where $f$ is strongly convex and $g$ is weakly convex such that the objective of (P) is convex on the constraint. We introduce an adaptive alternating direction method of multipliers (aADMM) for which we incorporate a flexible range of penalty parameters adapted to the convexity constants of the functions $f$ and $g$. To this end, we revisit the well-known relation between the classical ADMM and the Douglas–Rachford (DR) splitting algorithm [18, 31]. This duality relation was first observed in [22, § 5.1] and later revisited by other authors, see, e.g., [1, Appendix A] or [6, Remark 3.14]. A more detailed discussion regarding the ADMM is available in [20] while [34] is a recent survey on equivalences and other relations between splitting algorithms. We provide an analogous relation between our aADMM and the recent adaptive Douglas–Rachford (aDR) algorithm [3, 16]. We then employ this relation in order to derive convergence of our aADMM from the convergence of the aDR.

We point out (see Remark 4.1) that in our strongly-weakly convex setting, the functions $f$ and $g$ in problem (P) can be augmented into convex functions which transform the problem into a convex one, admissible for the classical ADMM, with the same minimizers, optimal values and computational difficulty level.
However, the ADMM for the augmented problem corresponds to a Douglas–Rachford algorithm which is not in a direct duality relation with the original strongly-weakly problem. Consequently, this theoretical aspect is lacking. Instead, we preserve and analyze the original problem. Our approach does yield a natural duality relation with a corresponding aDR algorithm which is instrumental in our convergence analysis. An additional benefit of our approach is that we relax and improve previously imposed assumptions on the strongly-weakly convex scenario such as in [33, 44] (see Remarks 5.1 and 5.2). Finally, although augmentation is a viable option in the strongly-weakly convex setting, application of the adaptive algorithms to the original problem has its own merit and has by now been studied in a number of recent publications such as [2, 3, 16, 17, 25, 27, 32, 44, 45], to name a few.

Our main result is Theorem 5.1, where we provide convergence of the aADMM. Moreover, in order to show how our framework generalizes and relaxes the framework of the classical convex ADMM, we incorporate in our convergence analysis the relaxed assumptions regarding the convexity of f and g while not imposing anything further on top of the traditional constraint qualifications of the ADMM. To this end, we revisit and incorporate in our analysis some of the most commonly imposed assumptions and conditions on the classical ADMM. For the sake of accessibility (...truncated)