Hybrid ADMM: a unifying and fast approach to decentralized optimization
Ma et al. EURASIP Journal on Advances in Signal Processing (2018) 2018:73
https://doi.org/10.1186/s13634-018-0589-x
RESEARCH
Open Access
Meng Ma1, Athanasios N. Nikolakopoulos2 and Georgios B. Giannakis1,2*
Abstract
The present work introduces the hybrid consensus alternating direction method of multipliers (H-CADMM), a novel
framework for optimization over networks which unifies existing distributed optimization approaches, including the
centralized and the decentralized consensus ADMM. H-CADMM provides a flexible tool that leverages the underlying
graph topology in order to achieve a desirable sweet spot between node-to-node communication overhead and rate
of convergence—thereby alleviating known limitations of both C-CADMM and D-CADMM. A rigorous analysis of the
novel method establishes a linear convergence rate and also guides the choice of parameters to optimize this rate. The
novel hybrid update rules of H-CADMM lend themselves to “in-network acceleration” that is shown to effect
a considerable, and essentially “free-of-charge”, performance boost over the fully decentralized ADMM.
Comprehensive numerical tests validate the analysis and showcase the potential of the method in efficiently tackling
widely useful learning tasks.
Keywords: ADMM, Distributed optimization, Decentralized learning, Hybrid, Consensus
1 Introduction
Recent advances in machine learning, signal processing,
and data mining have led to important problems that
can be formulated as distributed optimization over networks. Such problems entail parallel processing of data
acquired by interconnected nodes and arise frequently
in several applications, including data fusion and processing using sensor networks [1–4], vehicle coordination
[5, 6], power state estimation [7], clustering [8], classification [9], regression [10], filtering [11], and demodulation
[12, 13], to name a few. Among the candidate solvers
for such problems, the alternating direction method of
multipliers (ADMM) [14, 15] stands out as an efficient
and easily implementable algorithm of choice that has
attracted much interest in recent years [16–19], thanks to
its simplicity, fast convergence, and easily decomposable
structure.
*Correspondence:
1 Department of Electrical and Computer Engineering, University of Minnesota, 200 Union Street SE, 55455 Minneapolis, USA
2 Digital Technology Center, University of Minnesota, 117 Pleasant ST, 55455 Minneapolis, USA

Many distributed optimization problems can be formulated in a consensus form and solved efficiently by
ADMM [15, 20]. The solver involves two basic steps:
(i) a communication step for exchanging information
with a central processing unit, the so-called fusion center (FC), and (ii) an update step for updating the local
variables at each node. By alternating between the two,
local iterates eventually converge to the global solution.
This approach is referred to as centralized consensus
ADMM (C-CADMM), and although it has been successfully applied in various settings, it may not always be
the preferable solver. In large-scale systems, for instance,
the cost of connecting each node to the FC may become
prohibitive as the overhead of communicating data to
the FC may be overwhelming and the related storage
requirement could surpass the capacity of a single FC. Furthermore, having one dedicated FC can lead to a single
point of failure. In addition, there might be privacy-related
issues that restrict access to private data.
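
The two alternating steps of C-CADMM can be illustrated with a minimal sketch. The recursion below is the textbook global-consensus ADMM, applied to a toy scalar least-squares problem in which node i holds the cost 0.5*(a[i]*x - b[i])**2. The problem instance, the function name c_cadmm, and the parameter rho are assumptions made for illustration and are not taken from the paper.

```python
def c_cadmm(a, b, rho=1.0, iters=200):
    """Sketch of centralized consensus ADMM (hypothetical helper).

    Minimizes sum_i 0.5 * (a[i] * x - b[i])**2 over a scalar x,
    where node i knows only its own pair (a[i], b[i]).
    """
    n = len(a)
    x = [0.0] * n  # local copies of the decision variable, one per node
    y = [0.0] * n  # dual variables, one per node
    z = 0.0        # global variable maintained at the fusion center (FC)
    for _ in range(iters):
        # Update step: every node solves its local subproblem in closed form.
        x = [(a[i] * b[i] - y[i] + rho * z) / (a[i] ** 2 + rho)
             for i in range(n)]
        # Communication step: the FC gathers the local quantities and averages.
        z = sum(x[i] + y[i] / rho for i in range(n)) / n
        # The FC broadcasts z; each node then updates its multiplier.
        y = [y[i] + rho * (x[i] - z) for i in range(n)]
    return z, x
```

In this sketch every iteration requires all nodes to send data to, and receive data from, the FC, which is precisely the communication pattern whose cost becomes problematic in large-scale systems.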
Decentralized optimization, on the other hand, dispenses with the FC by exchanging information only among
single-hop neighbors. As long as the network is connected, local iterates can reach consensus on the globally optimal
decision variable, thanks to the aforementioned information exchange. This method, referred to as decentralized
consensus ADMM (D-CADMM), has attracted considerable interest; see, e.g., [20] for a review of applications in
communications and networking. In large-scale networks, D-CADMM’s convergence slows down as the per-node
information experiences large delays to reach remote destinations through multiple neighbor-to-neighbor communications.

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made.
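
The neighbor-only exchange of D-CADMM described above can be sketched in the same toy setting, where node i holds the scalar cost 0.5*(a[i]*x - b[i])**2. The recursion used here is the commonly reported neighbor-averaging form of decentralized consensus ADMM; since the exact update rules are not stated in this part of the text, the recursion, the function name d_cadmm, the adjacency-list argument neighbors, and the parameter rho should all be read as illustrative assumptions.

```python
def d_cadmm(a, b, neighbors, rho=1.0, iters=500):
    """Sketch of decentralized consensus ADMM (hypothetical helper).

    Minimizes sum_i 0.5 * (a[i] * x - b[i])**2 over a scalar x.
    neighbors[i] lists the single-hop neighbors of node i in an
    undirected, connected graph; no fusion center is involved.
    """
    n = len(a)
    x = [0.0] * n      # local iterates, one per node
    alpha = [0.0] * n  # local dual variables, one per node
    for _ in range(iters):
        x_new = []
        for i in range(n):
            d = len(neighbors[i])                      # degree of node i
            nbr_sum = sum(x[j] for j in neighbors[i])  # neighbors' previous iterates
            # Closed-form minimizer of the local augmented cost.
            x_new.append((a[i] * b[i] - alpha[i] + rho * (d * x[i] + nbr_sum))
                         / (a[i] ** 2 + 2.0 * rho * d))
        x = x_new
        # Dual update uses only the fresh iterates of single-hop neighbors.
        for i in range(n):
            alpha[i] += rho * sum(x[i] - x[j] for j in neighbors[i])
    return x
```

For example, on a path graph with neighbors = [[1], [0, 2], [1, 3], [2]], information from node 0 needs three iterations to influence node 3, which hints at why convergence slows down as the network diameter grows.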
1.1 Our contributions
To address the aforementioned limitations, the present
paper puts forth a novel decentralized framework that we
term hybrid consensus ADMM (H-CADMM), which unifies and markedly broadens C-CADMM and D-CADMM.
Our contributions are in five directions:
(i) H-CADMM features hybrid updates
accommodating communications with both the FCs
and single-hop neighbors, thus bridging centralized
with fully decentralized updates. This makes
H-CADMM appealing for large-scale networks with
multiple local FCs—a situation none of the existing
approaches is designed to handle.
(ii) A novel formulation of D-CADMM without
duplicate constraints (dual variables commonly
adopted by decentralized learning [7, 20, 21])
emerges simply by specializing the hybrid constraints
to coincide with those arising from the purely
neighborhood-based formulation.
(iii) Linear convergence is established, along with a rate
bound that specializes to C-CADMM and
D-CADMM. The parameter setting that achieves the
optimal bound is also provided.
(iv) H-CADMM offers the flexibility to deploy FCs as needed to
maximize performance gains, thus striking a
desirable trade-off between the number of FCs
deployed and the convergence gain sought.
(v) The capability of handling hybrid constraints not
only deals with mixed updates but also effects
“in-network acceleration” in decentralized operation
without incurring a noticeable increase in the overall
complexity.
programs is established in [35]; see also [18] where the
cost is a sum of component costs. Global linear convergence of a more general form of ADMM is reported in
[19], and linear convergence for a generalized formulation
of consensus ADMM using the so-called “communication
matrix” in [31].
Though D-CADMM has been applied to various problems [3, 4, 10, 12, 13, 36], its linear convergence remained
an open question until recently [21] (see [37] for the weighted counterpart). A successive orthogonal projection approach for
distributed learning over networked nodes is introduced
in [38], where nodes cannot communicate, but each node
can access only limited amounts of data, and agreement
is enforced across nodes sharing the same data. A distributed ADMM algorithm that deals with node clusters
was pr (...truncated)