An Adaptive Alternating Direction Method of Multipliers
Journal of Optimization Theory and Applications
https://doi.org/10.1007/s10957-022-02098-9
Sedi Bartz¹ · Rubén Campoy² · Hung M. Phan¹
Received: 31 December 2021 / Accepted: 19 August 2022
© The Author(s) 2022
Abstract
The alternating direction method of multipliers (ADMM) is a powerful splitting algorithm for linearly constrained convex optimization problems. In view of its popularity
and applicability, growing attention has been drawn toward the ADMM in nonconvex
settings. Recent studies of minimization problems for nonconvex functions include
various combinations of assumptions on the objective function including, in particular,
a Lipschitz gradient assumption. We consider the case where the objective is the sum
of a strongly convex function and a weakly convex function. To this end, we present
and study an adaptive version of the ADMM which incorporates generalized notions
of convexity and penalty parameters adapted to the convexity constants of the functions. We prove convergence of the scheme under natural assumptions. To this end, we
employ the recent adaptive Douglas–Rachford algorithm by revisiting the well-known
duality relation between the classical ADMM and the Douglas–Rachford splitting
algorithm, generalizing this connection to our setting. We illustrate our approach by
relating and comparing to alternatives, and by numerical experiments on a signal
denoising problem.
Keywords Alternating direction method of multipliers · Douglas–Rachford
algorithm · Weakly convex function · Comonotonicity · Signal denoising · Firm
thresholding
Communicated by Heinz Bauschke.
✉ Rubén Campoy
Sedi Bartz
Hung M. Phan
¹ Department of Mathematical Sciences, Kennedy College of Sciences, University of Massachusetts Lowell, Lowell, MA, USA
² Department of Statistics and Operational Research, Universitat de València, Valencia, Spain
Mathematics Subject Classification 47H05 · 47N10 · 47J25 · 49M27 · 65K15
1 Introduction
By now, the alternating direction method of multipliers (ADMM) is a well-studied and
applied splitting algorithm. In particular, it is applied to the problem
\[
\begin{aligned}
\min\ & f(x) + g(z) \\
\text{s.t.}\ & Mx = z, \\
& x \in \mathbb{R}^n,\ z \in \mathbb{R}^m,
\end{aligned}
\tag{P}
\]
where $f\colon \mathbb{R}^n \to {]-\infty, +\infty]}$ and $g\colon \mathbb{R}^m \to {]-\infty, +\infty]}$ are proper, lower
semicontinuous and convex functions, and $M \in \mathbb{R}^{m \times n}$. The ADMM can be traced
back to 1975 in the studies of Glowinski and Marroco [26], and of Gabay and Mercier
[23]. It was revisited in the early 1980s in [21, 22]. The ADMM has been successfully
applied to a wide range of statistical and learning problems such as sparse regression,
signal and image processing, and support vector machines, to name a few. An extensive
survey on the ADMM and its applications can be found in [10].
The ADMM can be viewed as an enhanced version of the method of multipliers
in the case where the objective function is separable. The augmented Lagrangian
associated with (P) is the function
\[
L_\gamma(x, z, y) = f(x) + g(z) + \langle y, Mx - z \rangle + \frac{\gamma}{2}\|Mx - z\|^2, \tag{1}
\]
where γ ≥ 0 is the penalty parameter and y ∈ Rm is the Lagrange multiplier.
By employing the method of multipliers, one solves (P) by iteratively minimizing
L γ (x, z, y) over the (primal) variables x and z while updating the Lagrange multiplier
y (the dual variable). However, this requires minimizing the Lagrangian jointly in x
and z. In order to avoid this situation, the ADMM takes advantage of the separability
of the objective function and splits the minimization procedure into two separate
steps, one for each variable. Specifically, by fixing a positive penalty parameter γ , the
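iterative step of the ADMM for solving (P) is
\begin{align}
x^{k+1} &= \operatorname*{argmin}_{x \in \mathbb{R}^n} L_\gamma(x, z^k, y^k), \tag{2a} \\
z^{k+1} &= \operatorname*{argmin}_{z \in \mathbb{R}^m} L_\gamma(x^{k+1}, z, y^k), \tag{2b} \\
y^{k+1} &= y^k + \gamma (M x^{k+1} - z^{k+1}). \tag{2c}
\end{align}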
Convergence of this scheme is well established in the case where f and g are convex,
see, e.g., [10, § 3.2]. In nonconvex cases, it has been studied in, e.g., [28, 30, 41, 42],
under various combinations of assumptions which include, in particular, a Lipschitz
continuity assumption on the gradient of f and/or g.
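As a minimal illustration of the iteration (2a), (2b), (2c), the following sketch runs the classical ADMM on a toy convex instance chosen here for convenience (it is not taken from this paper): f(x) = ½‖x − b‖², g(z) = λ‖z‖₁ and M = I, for which both argmin steps have closed forms. The names `admm_toy`, `soft_threshold`, `b`, `lam` and the iteration count are illustrative assumptions.

```python
def soft_threshold(v, t):
    """Proximal map of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def admm_toy(b, lam, gamma=1.0, iters=200):
    """Classical ADMM for min 0.5*||x-b||^2 + lam*||z||_1  s.t. x = z (M = I).

    Toy instance assumed for illustration; the problem is separable, so
    each coordinate is updated independently.
    """
    n = len(b)
    x = [0.0] * n
    z = [0.0] * n
    y = [0.0] * n
    for _ in range(iters):
        # (2a): minimize L_gamma over x; setting the gradient to zero gives
        # (x - b) + y + gamma*(x - z) = 0.
        x = [(b[i] - y[i] + gamma * z[i]) / (1.0 + gamma) for i in range(n)]
        # (2b): minimize L_gamma over z; this reduces to soft-thresholding.
        z = [soft_threshold(x[i] + y[i] / gamma, lam / gamma) for i in range(n)]
        # (2c): multiplier update with step gamma.
        y = [y[i] + gamma * (x[i] - z[i]) for i in range(n)]
    return x, z, y


b = [3.0, -0.5, 1.2]
x, z, y = admm_toy(b, lam=1.0)
# For this toy problem the minimizer is known in closed form.
expected = [soft_threshold(bi, 1.0) for bi in b]
print(x, expected)
```

The closed-form comparison is possible only because M = I and f is a simple quadratic; for a general M the x-update requires solving a linear system or an inner minimization.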
In the present study, we consider the case where f is strongly convex and g is
weakly convex such that the objective of (P) is convex on the constraint. We introduce
an adaptive alternating direction method of multipliers (aADMM) for which we
incorporate a flexible range of penalty parameters adapted to the convexity constants
of the functions f and g. To this end, we revisit the well-known relation between the
classical ADMM and the Douglas–Rachford (DR) splitting algorithm [18, 31]. This
duality relation was first observed in [22, § 5.1] and later revisited by other authors,
see, e.g., [1, Appendix A] or [6, Remark 3.14]. A more detailed discussion regarding
the ADMM is available in [20] while [34] is a recent survey on equivalences and other
relations between splitting algorithms. We provide an analogous relation between our
aADMM and the recent adaptive Douglas–Rachford (aDR) algorithm [3, 16]. We
then employ this relation in order to derive convergence of our aADMM from the
convergence of the aDR.
We point out (see Remark 4.1) that in our strongly-weakly convex setting, the functions f and g in problem (P) can be augmented into convex functions which transform
the problem into a convex one, admissible for the classical ADMM, with the same
minimizers, optimal values and computational difficulty level. However, the ADMM
for the augmented problem corresponds to a Douglas–Rachford algorithm which is not
in direct duality relations with the original strongly-weakly convex problem. Consequently,
the augmented approach lacks this theoretical connection. Instead, we preserve and analyze the original problem. Our approach does yield a natural duality relation with a corresponding aDR
algorithm which is instrumental in our convergence analysis. An additional benefit
of our approach is that we relax and improve previously imposed assumptions on
the strongly-weakly convex scenario such as in [33, 44] (see Remarks 5.1 and 5.2).
Finally, although augmentation is a viable option in the strongly-weakly convex setting, application of the adaptive algorithms to the original problem has its own merit
and has by now been studied in a number of recent publications such as [2, 3, 16, 17, 25,
27, 32, 44, 45], to name a few.
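To make the augmentation idea concrete, here is a hedged sketch of one way it can work. The constants α and β are notation assumed here for illustration: α is a strong convexity modulus of f, meaning f − (α/2)‖·‖² is convex, and β is a weak convexity modulus of g, meaning g + (β/2)‖·‖² is convex. The exact construction in Remark 4.1 may differ from this one.

```latex
% On the constraint Mx = z we have \|z\|^2 = \|Mx\|^2, so for feasible (x, z):
\[
f(x) + g(z)
= \underbrace{\Big(f(x) - \tfrac{\beta}{2}\|Mx\|^2\Big)}_{\tilde f(x)}
+ \underbrace{\Big(g(z) + \tfrac{\beta}{2}\|z\|^2\Big)}_{\tilde g(z)}.
\]
% \tilde g is convex by \beta-weak convexity of g. Writing
\[
\tilde f(x) = \Big(f(x) - \tfrac{\alpha}{2}\|x\|^2\Big)
+ \tfrac{1}{2}\, x^\top \big(\alpha I - \beta M^\top M\big) x
\]
% shows that \tilde f is convex whenever \alpha I - \beta M^\top M \succeq 0,
% i.e. whenever \alpha \ge \beta \|M\|^2 (spectral norm).
```

Under that sufficient condition, problem (P) with the pair (f̃, g̃) is convex, has the same feasible set, and takes the same objective values on it, which is the sense in which minimizers and optimal values coincide.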
Our main result is Theorem 5.1, where we provide convergence of the aADMM.
Moreover, in order to show how our framework generalizes and relaxes the framework
of the classical convex ADMM, we incorporate in our convergence analysis the relaxed
assumptions regarding the convexity of f and g while imposing nothing beyond
the traditional constraint qualifications of the ADMM. To this end, we revisit and
incorporate in our analysis some of the most commonly imposed assumptions and
conditions on the classical ADMM. For the sake of accessibility (...truncated)