Fast Optimistic Gradient Descent Ascent (OGDA) Method in Continuous and Discrete Time
Foundations of Computational Mathematics
https://doi.org/10.1007/s10208-023-09636-5
Radu Ioan Boţ¹ · Ernö Robert Csetnek¹ · Dang-Khoa Nguyen¹,²
Received: 25 March 2022 / Revised: 16 June 2023 / Accepted: 11 September 2023
© The Author(s) 2023
Communicated by Jérôme Bolte.
Radu Ioan Boţ: Research partially supported by FWF (Austrian Science Fund), Projects W 1260 and P 34922-N.
Ernö Robert Csetnek: Research partially supported by FWF (Austrian Science Fund), Project P 29809-N32.
Dang-Khoa Nguyen: Research supported by FWF (Austrian Science Fund), Project P 34922-N.
1 Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Vienna, Austria
2 Faculty of Mathematics and Computer Science, University of Science, Vietnam National University, Ho Chi Minh City 700000, Vietnam
Abstract
In the framework of real Hilbert spaces, we study continuous in time dynamics as well
as numerical algorithms for the problem of approaching the set of zeros of a single-valued
monotone and continuous operator $V$. The starting point of our investigations
is a second-order dynamical system that combines a vanishing damping term with
the time derivative of $V$ along the trajectory, which can be seen as an analogue of
the Hessian-driven damping in case the operator originates from a potential. Our
method exhibits fast convergence rates of order $o\left(\frac{1}{t\beta(t)}\right)$ for $\|V(z(t))\|$, where $z(\cdot)$
denotes the generated trajectory and $\beta(\cdot)$ is a positive nondecreasing function satisfying
a growth condition, and also for the restricted gap function, which is a measure
of optimality for variational inequalities. We also prove the weak convergence of the
trajectory to a zero of $V$. Temporal discretizations of the dynamical system generate
implicit and explicit numerical algorithms, both of which can be seen as accelerated
versions of the Optimistic Gradient Descent Ascent (OGDA) method for monotone
operators, and for which we prove that the generated sequence of iterates $(z^k)_{k \geq 0}$ shares
the asymptotic features of the continuous dynamics. In particular, we show for the
implicit numerical algorithm convergence rates of order $o\left(\frac{1}{k\beta_k}\right)$ for $\|V(z^k)\|$ and the
restricted gap function, where $(\beta_k)_{k \geq 0}$ is a positive nondecreasing sequence satisfying
a growth condition. For the explicit numerical algorithm, we show, by additionally
assuming that the operator $V$ is Lipschitz continuous, convergence rates of order $o\left(\frac{1}{k}\right)$
for $\|V(z^k)\|$ and the restricted gap function. All convergence rate statements are last
iterate convergence results; in addition to these, we prove for both algorithms the
convergence of the iterates to a zero of $V$. To our knowledge, our study exhibits the
best-known convergence rate results for monotone equations. Numerical experiments
indicate the overwhelming superiority of our explicit numerical algorithm over other
methods designed to solve monotone equations governed by monotone and Lipschitz
continuous operators.
Keywords Monotone equation · Variational inequality · Optimistic Gradient Descent
Ascent (OGDA) method · Extragradient method · Nesterov’s accelerated gradient
method · Lyapunov analysis · Convergence rates · Convergence of trajectories ·
Convergence of iterates
Mathematics Subject Classification 47J20 · 47H05 · 65K10 · 65K15 · 65Y20 ·
90C30 · 90C52
1 Introduction
Let $H$ be a real Hilbert space and $V \colon H \to H$ a monotone and continuous operator.
We are interested in developing fast converging methods aimed at finding a zero of $V$, or
in other words, at solving the monotone equation
$$V(z) = 0, \tag{1}$$
for which we assume that it has a nonempty solution set $Z$. The monotonicity and the
continuity of $V$ imply that $z_*$ is a solution of (1) if and only if it is a solution of the
following variational inequality
$$\langle z - z_*, V(z) \rangle \geq 0 \quad \forall z \in H. \tag{2}$$
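A brief sketch of the standard argument behind the equivalence of (1) and (2), using only the monotonicity and continuity of $V$. The direction $u$ and the scalar $t$ are auxiliary symbols introduced here for illustration.

```latex
% (1) => (2): if V(z_*) = 0, monotonicity gives, for every z in H,
\langle z - z_*, V(z) \rangle
  = \langle z - z_*, V(z) - V(z_*) \rangle \geq 0 .

% (2) => (1): insert z = z_* + t u with u in H and t > 0 into (2), divide by t,
\langle u, V(z_* + t u) \rangle \geq 0 \quad \forall t > 0 ,
% and let t -> 0^+; the continuity of V yields
\langle u, V(z_*) \rangle \geq 0 \quad \forall u \in H .
% Choosing u = -V(z_*) gives -\| V(z_*) \|^2 \geq 0, hence V(z_*) = 0.
```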
One of the main motivations to study (1) comes from minimax problems. More precisely,
consider the problem
$$\min_{x \in X} \max_{y \in Y} \Phi(x, y), \tag{3}$$
where $X$ and $Y$ are real Hilbert spaces and $\Phi \colon X \times Y \to \mathbb{R}$ is a continuously
differentiable and convex–concave function, i.e., $\Phi(\cdot, y)$ is convex for every $y \in Y$ and
$\Phi(x, \cdot)$ is concave for every $x \in X$. A solution of (3) is a saddle point $(x_*, y_*) \in X \times Y$
of $\Phi$, which means that it fulfills
$$\Phi(x_*, y) \leq \Phi(x_*, y_*) \leq \Phi(x, y_*) \quad \forall (x, y) \in X \times Y$$
or, equivalently,
$$\begin{cases} \nabla_x \Phi(x_*, y_*) = 0, \\ -\nabla_y \Phi(x_*, y_*) = 0. \end{cases} \tag{4}$$
Taking into account that the mapping
$$(x, y) \mapsto \left( \nabla_x \Phi(x, y), -\nabla_y \Phi(x, y) \right) \tag{5}$$
is monotone [43], it means that the problem of finding a saddle point of $\Phi$ eventually
brings us back to the problem (1).
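As a small numerical illustration of the monotonicity of the mapping in (5), the sketch below takes a bilinear coupling $\Phi(x, y) = x^\top A y$ and checks the inequality $\langle V(z) - V(w), z - w \rangle \geq 0$ on random pairs. The bilinear example, the matrix `A`, the dimensions and the helper name `saddle_operator` are assumptions made for illustration only; they do not come from the paper.

```python
import numpy as np

def saddle_operator(A, z, n):
    """V(x, y) = (grad_x Phi, -grad_y Phi) for the assumed Phi(x, y) = x^T A y."""
    x, y = z[:n], z[n:]
    return np.concatenate([A @ y, -A.T @ x])

rng = np.random.default_rng(0)
n, m = 3, 4                       # hypothetical dimensions of x and y
A = rng.standard_normal((n, m))   # hypothetical coupling matrix

# Monotonicity requires <V(z) - V(w), z - w> >= 0 for all z, w.
gaps = []
for _ in range(1000):
    z = rng.standard_normal(n + m)
    w = rng.standard_normal(n + m)
    diff = saddle_operator(A, z, n) - saddle_operator(A, w, n)
    gaps.append(float(np.dot(diff, z - w)))

min_gap = min(gaps)
max_abs_gap = max(abs(g) for g in gaps)
print("smallest inner product:", min_gap)
print("largest absolute inner product:", max_abs_gap)
```

In this bilinear case the operator is linear and skew-symmetric, so the inner product vanishes up to rounding error: the operator is monotone but not strongly monotone, which is the regime where plain gradient descent ascent is known to struggle.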
Both (1) and (3) are fundamental models in various fields such as optimization, economics, game theory and partial differential equations. They have recently regained
significant attention, in particular in the machine learning and data science community, due to the fundamental role they play, for instance, in multi-agent reinforcement
learning [37], robust adversarial learning [32] and generative adversarial networks
(GANs) [18, 24].
In this paper, we develop fast continuous in time dynamics as well as numerical
algorithms for solving (1) and investigate their asymptotic/convergence properties. First
we formulate a second-order dynamical system that combines a vanishing damping
term with the time derivative of $V$ along the trajectory, which can be seen as an analogue
of the Hessian-driven damping in case the operator originates from a potential.
A continuously differentiable and nondecreasing function $\beta \colon [t_0, +\infty) \to (0, +\infty)$,
which appears in the system, plays an important role in the analysis. If $\beta$ satisfies a
specific growth condition, which is for instance satisfied by polynomials including
constant functions, then the method exhibits convergence rates of order $o\left(\frac{1}{t\beta(t)}\right)$ for
$\|V(z(t))\|$, where $z(t)$ denotes the generated trajectory, and for the restricted gap
function associated with (2). In addition, $z(t)$ converges weakly to a solution
of (1) as $t \to +\infty$.
By considering a temporal discretization of the dynamical system, we obtain an
implicit numerical algorithm which exhibits convergence rates of order $o\left(\frac{1}{k\beta_k}\right)$ for
$\|V(z^k)\|$ and the restricted gap function associated with (2), where $(\beta_k)_{k \geq 0}$ is a
nondecreasing sequence and $(z^k)_{k \geq 0}$ is the generated sequence of iterates. For the latter, we
also prove that it converges weakly to a solution of (1).
By a further, more involved discretization of the dynamical system, we obtain an
explicit numerical algorithm, which, under the additional assumption that $V$ is Lipschitz
continuous, exhibits convergence rates of order $o\left(\frac{1}{k}\right)$ for $\|V(z^k)\|$ and the
restricted gap function associated with (2), where $(z^k)_{k \geq 0}$ is the generated sequence
of iterates, which is also shown to converge weakly to a solution of (1).
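For orientation, the sketch below runs the classical, non-accelerated OGDA iteration $z^{k+1} = z^k - 2s V(z^k) + s V(z^{k-1})$ next to plain gradient descent ascent on the toy saddle problem $\Phi(x, y) = xy$. It is not the accelerated explicit algorithm developed in the paper; the toy problem, the step size `step = 0.3`, the iteration count and the initialization convention $z^{-1} = z^0$ are all assumptions chosen for illustration.

```python
import numpy as np

def V(z):
    # Saddle operator of the assumed toy problem Phi(x, y) = x * y:
    # V(x, y) = (y, -x), which is monotone and 1-Lipschitz.
    x, y = z
    return np.array([y, -x])

def gda(z0, step, iters):
    """Plain gradient descent ascent: z^{k+1} = z^k - s V(z^k)."""
    z = np.array(z0, dtype=float)
    for _ in range(iters):
        z = z - step * V(z)
    return z

def ogda(z0, step, iters):
    """Classical OGDA: z^{k+1} = z^k - 2 s V(z^k) + s V(z^{k-1})."""
    z = np.array(z0, dtype=float)
    v_prev = V(z)  # assumed convention z^{-1} = z^0
    for _ in range(iters):
        v = V(z)
        z = z - 2.0 * step * v + step * v_prev
        v_prev = v
    return z

z0 = (1.0, 1.0)
step, iters = 0.3, 500  # hypothetical choices; step is below 1 / (2 L) with L = 1

gda_residual = float(np.linalg.norm(V(gda(z0, step, iters))))
ogda_residual = float(np.linalg.norm(V(ogda(z0, step, iters))))
print("GDA  residual ||V(z^k)||:", gda_residual)
print("OGDA residual ||V(z^k)||:", ogda_residual)
```

On this example each plain GDA step multiplies the norm of the iterate by $\sqrt{1 + s^2}$, so the residual grows, whereas the correction term $s V(z^{k-1})$ in OGDA makes the last iterate contract toward the unique zero $(0, 0)$.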
The resulting numerical schem (...truncated)