Lattice Agreement in Message Passing Systems
DISC
Xiong Zheng, University of Texas at Austin, Austin, TX 78712, USA
Changyong Hu, University of Texas at Austin, Austin, TX 78712, USA
Vijay K. Garg, University of Texas at Austin, Austin, TX 78712, USA
This paper studies the lattice agreement problem and the generalized lattice agreement problem in distributed message passing systems. In the lattice agreement problem, given input values from a lattice, processes must non-trivially decide output values that lie on a chain. We consider the lattice agreement problem in both synchronous and asynchronous systems. For synchronous lattice agreement, we present two algorithms which run in log f and min{O(log² h(L)), O(log² f)} rounds, respectively, where h(L) denotes the height of the input sublattice L, f < n is the number of crash failures the system can tolerate, and n is the number of processes in the system. These algorithms have significantly better round complexity than previously known algorithms: the algorithm by Attiya et al. [Attiya et al., DISC, 1995] takes log n synchronous rounds, and the algorithm by Mavronicolas [Mavronicolas, 2018] takes min{O(h(L)), O(√f)} rounds. For asynchronous lattice agreement, we propose an algorithm with time complexity of 2 · min{h(L), f + 1} message delays, which improves on the previously known time complexity of O(n) message delays. The generalized lattice agreement problem, defined by Faleiro et al. in [Faleiro et al., PODC, 2012], is a generalization of the lattice agreement problem applied to replicated state machines. We propose an algorithm which guarantees liveness when a majority of the processes are correct in asynchronous systems. Our algorithm requires min{O(h(L)), O(f)} units of time in the worst case, which is better than the O(n) units of time required by the algorithm in [Faleiro et al., PODC, 2012].

2012 ACM Subject Classification Theory of computation → Distributed algorithms

Related Version A full version of the paper is available at https://arxiv.org/abs/1807.11557.

Funding Supported by NSF CNS-1812349, NSF CNS-1563544, NSF CNS-1346245, Huawei Inc., and the Cullen Trust for Higher Education Endowed Professorship.
Acknowledgements We want to thank John Kaippallimalil for providing some useful application cases for CRDT and generalized lattice agreement.
Keywords and phrases Lattice Agreement; Replicated State Machine; Consensus

1 Introduction
Lattice agreement, introduced in [2] to solve the atomic snapshot problem [1] in shared
memory, is an important decision problem in distributed systems. In this problem, processes
start with input values from a lattice and need to decide values which are comparable to
each other. The lattice agreement problem is a weaker decision problem than consensus. In
synchronous systems, consensus cannot be solved in fewer than f + 1 rounds [6], but lattice
agreement can be solved in log f rounds (shown by an algorithm we propose). In asynchronous
systems, the consensus problem cannot be solved even with one failure [8], whereas the lattice
agreement problem can be solved in asynchronous systems when a majority of processes is
correct [7].
In synchronous message passing systems, a log n-round recursive algorithm based on
a "branch-and-bound" approach is proposed in [2] to solve the lattice agreement problem with
message complexity of O(n²). It can tolerate at most n − 1 process failures. Later, [12]
gave an algorithm with round complexity of min{1 + h(L), ⌊(3 + √(8f + 1))/2⌋}, for any
execution where at most f < n processes may crash. Their algorithm has the early-stopping
property and is the first algorithm with round complexity that depends on the actual height
of the input lattice. Our first algorithm for synchronous lattice agreement, LAα, requires
log h(L) rounds. It assumes that the height of the input lattice is known to all processes.
By applying this algorithm as a building block, we give an algorithm, LAβ, which requires
only log f rounds without the height assumption of LAα. Instead of directly trying to
decide on comparable output values drawn from a lattice with an unknown height,
this algorithm first performs lattice agreement on the failure set known by each process by
using LAα. Then each process removes the values from the faulty processes it knows of and outputs
the join of all the remaining values. Our third algorithm, LAγ, has round complexity of
min{O(log² h(L)), O(log² f)}, which depends on the height of the input lattice but does not
assume that the height is known. This algorithm iteratively guesses the actual height of the
input lattice and applies LAα with the guessed height as input, until all processes terminate.
Lattice agreement in asynchronous message passing systems is useful due to its
applications in atomic snapshot objects and fault-tolerant replicated state machines. Efficient
implementation of atomic snapshot objects in crash-prone asynchronous message passing
systems is important because they can make the design of algorithms in such systems easier
(examples of algorithms in message passing systems based on snapshot objects can be found
in [18], [13] and [4]). As shown in [2], any algorithm for lattice agreement can be applied
to solve the atomic snapshot problem in a shared memory system. We note that [3] does
not directly use lattice agreement to solve the atomic snapshot problem, but their idea of
producing comparable views for processes is essentially lattice agreement. Thus, by using the
same transformation techniques as in [2] and [3], algorithms for the lattice agreement problem can
be directly applied to implement atomic snapshot objects in crash-prone message passing
systems. We give an algorithm for the asynchronous lattice agreement problem which requires
min{O(h(L)), O(f)} message delays. Then, by applying the technique in [3], our algorithm
can be used to implement atomic snapshot objects on top of crash-prone asynchronous
message passing systems and achieve a time complexity of O(f) message delays in the worst case.
Our result significantly improves on the message delays of the previous work by Delporte-Gallet,
Fauconnier et al. [5]. The algorithm in [5] directly implements an atomic snapshot object on
top of crash-prone message passing systems and requires O(n) message delays in the worst
case.
Another related work for lattice agreement in asynchronous systems is by Faleiro et
al. [7]. They solve the lattice agreement problem in asynchronous systems by giving a Paxos-style
protocol [10, 11], in which each proposer keeps proposing a value until it gets accept
messages from a majority of acceptors. An acceptor only accepts a proposal when the
proposal has a bigger value than its accepted value. Their algorithm requires O(n) message
delays. Our asynchronous lattice agreement algorithm is not Paxos-style. Instead, it
runs in round-trips. Each round-trip is composed of sending a message to all processes and getting
n − f acknowledgements back. Our algorithm guarantees termination in min{O(h(L)), O(f)}
message delays, which is a significant improvement over O(n) message delays.
The generalized lattice agreement problem, defined in [7], is a generalization of the lattice
agreement problem in asynchronous systems. It is applied to implement a specific class of
replicated state machines. In the conventional replicated state machine approach [14], a
consensus-based mechanism is used to implement strong consistency. For performance reasons,
many systems relax the strong consistency requirement and support eventual consistency [17],
i.e., all copies are eventually consistent. However, there is no guarantee on when this eventual
consistency happens. Also, different copies could be in an inconsistent state before this
eventual situation happens. A conflict-free replicated data type (CRDT) [15, 16] is a data
structure which supports such eventual consistency. In a CRDT, all operations are designed
to be commutative such that they can be concurrently executed without coordination. As
shown in [7], by applying generalized lattice agreement on top of a CRDT, the states of any
two copies can be made comparable, which provides a linearizability guarantee [9] for the CRDT.
The following example from [7] motivates generalized lattice agreement. Consider a
replicated set data structure which supports adds and reads. Suppose there are two concurrent
updates, add(a) and add(b), and two concurrent reads on copies one and two, respectively.
With a plain CRDT, it could happen that the two reads return {a} and {b}, respectively. This
execution is not linearizable [9], because if add(a) appears before add(b) in the linear order,
then no read can return {b}. On the other hand, if we use the conventional consensus replicated
state machine technique, then all operations would be coordinated, including the two reads.
This greatly impacts the throughput of the system. By applying generalized lattice agreement
on top of a CRDT, all operations can be concurrently executed and any two reads always
return comparable views of the system. In the above example, the two reads return either
(i) {a} and {a, b} or (ii) {b} and {a, b}, both of which are linearizable. Therefore, generalized lattice
agreement can be applied on top of a CRDT to provide a stronger consistency guarantee than
the CRDT alone and better availability than the conventional replicated state machine technique.
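The replicated-set example above can be sketched as follows. This is an illustrative sketch, not code from the paper: the class and method names are assumptions, the replica is modeled as a grow-only set CRDT whose merge is set union, and comparability is set inclusion.

```python
# Sketch of the replicated-set example: a grow-only set CRDT whose
# replica states are merged by union. All names are illustrative.

class GSet:
    def __init__(self):
        self.elems = set()

    def add(self, x):
        # Updates are commutative, so they need no coordination.
        self.elems.add(x)

    def merge(self, other):
        # Join of two replica states is set union.
        self.elems |= other.elems

    def read(self):
        return frozenset(self.elems)

r1, r2 = GSet(), GSet()
r1.add('a')                    # concurrent add(a) at replica 1
r2.add('b')                    # concurrent add(b) at replica 2
v1, v2 = r1.read(), r2.read()  # plain CRDT reads: {a} and {b}
# Incomparable views: this execution is not linearizable.
assert not (v1 <= v2 or v2 <= v1)

# With (generalized) lattice agreement layered on top, reads would
# return comparable views such as {a} and {a, b}:
r2.merge(r1)
assert v1 <= r2.read()         # {a} <= {a, b}: comparable views
```

The key point the sketch illustrates is that once all returned views lie on a chain, a linear order of the operations consistent with every read always exists.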
Since the generalized lattice agreement problem has applications in building replicated
state machines, it is important to reduce the message delays for a value to be learned. Faleiro
et al. [7] propose an algorithm for the generalized lattice agreement by using their algorithm
for the lattice agreement problem as a building block. Their generalized lattice agreement
algorithm satisfies safety and liveness assuming f < ⌈n/2⌉. A value is eventually learned in
their algorithm after O(n) message delays in the worst case. Our algorithm guarantees that
a value is learned in min{O(h(L)), O(f )} message delays.
In summary, this paper makes the following contributions:
We present an algorithm, LAα, to solve lattice agreement in synchronous systems
in log h(L) rounds, assuming h(L) is known. Using LAα, we propose an algorithm,
LAβ, to solve the standard lattice agreement problem in log f rounds. This bound
is significantly better than the previously known upper bounds of log n by [3] and
min{1 + h(L), ⌊(3 + √(8f + 1))/2⌋} by [12] (and solves the open problem posed there). We
also give an algorithm, LAγ, which runs in min{O(log² h(L)), O(log² f)} rounds.
For the lattice agreement problem in asynchronous systems, we give an algorithm, LAδ,
which requires 2 · min{h(L), f + 1} message delays, improving on the O(n) bound of [7].
Based on the asynchronous lattice agreement algorithm, we present an algorithm, GLAα,
to solve generalized lattice agreement with a time complexity of min{O(h(L)), O(f)}
message delays, which improves on the O(n) bound of [7].
Related previous work and our results are summarized in Table 1. LA sync and LA async
represent lattice agreement in synchronous systems and asynchronous systems, respectively.
GLA async represents generalized lattice agreement in asynchronous systems. LAα is designed
to solve the lattice agreement problem with the assumption that the height of the input
lattice is given. It serves as a building block for LAβ and LAγ. For synchronous systems, the
time complexity is given in terms of synchronous rounds. For asynchronous systems, the time
complexity is given in terms of message delays. The message column represents the total
number of messages sent by all processes in one execution. For generalized lattice agreement
problem, the message complexity is given in terms of the number of messages needed for a
value to be learned.
2 System Model and Problem Definitions

2.1 System Model
We assume a distributed message passing system with n processes in a completely connected
topology, denoted as p1, ..., pn. We consider both synchronous and asynchronous systems.
Synchronous means that message delays and the duration of the operations performed by a
process have an upper bound. Asynchronous means that there is no upper bound
on the time for a message to reach its destination. The model assumes that processes may
have crash failures but no Byzantine failures. The model parameter f denotes the maximum
number of processes that may crash in a run. We assume that the underlying communication
system is reliable but the message channel may not be FIFO. We say a process is faulty in a
run if it crashes and correct or nonfaulty otherwise. In our following algorithms, when a
process sends a message to all, it also sends this message to itself.
2.2 Lattice Agreement
Let (X, ≤, ⊔) be a finite join semilattice with a partial order ≤ and join ⊔. Two values u
and v in X are comparable iff u ≤ v or v ≤ u. The join of u and v is denoted as ⊔{u, v}. X
is a join semilattice if a join exists for every nonempty finite subset of X. As is customary in
this area, we use the term lattice instead of join semilattice in this paper for simplicity.
In the lattice agreement problem [2], each process pi can propose a value xi in X and
must decide on some output yi also in X. An algorithm is said to solve the lattice agreement
problem if the following properties are satisfied:
Downward-Validity: For all i ∈ [1..n], x_i ≤ y_i.
Upward-Validity: For all i ∈ [1..n], y_i ≤ ⊔{x_1, ..., x_n}.
Comparability: For all i ∈ [1..n] and j ∈ [1..n], either y_i ≤ y_j or y_j ≤ y_i.
In this paper, all the algorithms that we propose apply the join operation to some subset of
input values. Therefore, it is sufficient to focus on the join-closed subset of X that includes
all input values. Let L be the join-closed subset of X that includes all input values. L is
also a join semilattice. We call L the input sublattice of X. All algorithms proposed in this
paper are based on L. Since the complexity of our algorithms depends on the height of the
lattice L, we give the formal definitions below:
▶ Definition 1. The height of a value v in a lattice X is the length of the longest path from
any minimal value to v, denoted as h_X(v), or h(v) when the lattice is clear from context.
▶ Definition 2. The height of a lattice X is the height of its largest value, denoted as h(X).
Each process proposes a value from a boolean lattice. Thus, the largest value in this
lattice is the set consisting of all n values. From Definition 2, we have h(L) ≤ n.
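To make these definitions concrete, here is a small sketch (not from the paper) in which values are finite sets ordered by inclusion, so the join is union and the height of a value in the boolean lattice is its size; the helper names leq, join, and height are illustrative assumptions.

```python
# Sketch: a join semilattice of finite sets ordered by inclusion.
# Join is set union; the height of a value, measured from the empty
# set in the boolean lattice, is its cardinality.

def leq(u, v):
    """Partial order: u <= v iff u is a subset of v."""
    return u <= v  # frozenset subset test

def join(values):
    """Join of a nonempty finite collection of values (set union)."""
    out = frozenset()
    for v in values:
        out |= v
    return out

def height(v):
    """Height of v in the boolean lattice: length of the longest
    chain from the empty set up to v, which is |v|."""
    return len(v)

x1, x2, x3 = frozenset({1}), frozenset({2}), frozenset({3})
top = join([x1, x2, x3])               # the largest value {1, 2, 3}
assert height(top) == 3                # consistent with h(L) <= n for n = 3
assert leq(x1, top)                    # every input is below the top
assert not (leq(x1, x2) or leq(x2, x1))  # x1 and x2 are incomparable
```

In this modeling, an input sublattice L is simply the family of all unions of subsets of the inputs, which is join-closed as the text requires.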
2.3 Generalized Lattice Agreement
In the generalized lattice agreement problem, each process may receive a possibly infinite
sequence of values as inputs that belong to a lattice, at any point of time. Let x_p^i denote
the i-th value received by process p. The aim is for each process p to learn a sequence of
output values y_j^p which satisfies the following conditions:
Validity: Any learned value y_j^p is a join of some set of received input values.
Stability: The value learned by any process p is non-decreasing: j < k =⇒ y_j^p ≤ y_k^p.
Comparability: Any two values y_j^p and y_k^q learned by any two processes p and q are comparable.
Liveness: Every value x_p^i received by a correct process p is eventually included in some
learned value y_k^q of every correct process q, i.e., x_p^i ≤ y_k^q.
3 Lattice Agreement in Synchronous Systems

3.1 Lattice Agreement with Known Height
In this section, we first consider a simpler version of the standard lattice agreement problem
by assuming that the height of the input sublattice L is known in advance, i.e., h(L) is given.
We propose an algorithm, LAα, to solve this problem in log h(L) synchronous rounds. In
Section 3.2, we use this algorithm to solve the lattice agreement problem when the height is
not given.
Algorithm LAα runs in synchronous rounds. At each round, by calling a Classifier
procedure (described below), processes within the same group (to be defined later) are classified
into different groups. The algorithm guarantees that, at the end, any two processes within the
same group have equal values and any two processes in different groups have comparable values.
Thus, the values of all processes are comparable to each other at the end. We present
the algorithm by first introducing the fundamental Classifier procedure.
3.1.1 The Classifier Procedure
The Classifier procedure is inspired by the Classifier procedure given by Attiya and Rachman
in [3], called the AR-Classifier, where it is applied to solve the atomic snapshot problem in the
shared memory system. The intuition behind the Classifier procedure is to classify processes
into masters and slaves, and to ensure that all master processes have values greater than all
slave processes.
The pseudocode for Classifier is given in Figure 1. It takes two parameters: the input
value v and the threshold value k. The output is composed of three items: the output
value, the classification result, and the decision status. The process which calls the Classifier
procedure should update its value to the output value. The classification result is either
master or slave. The decision status is a Boolean value which indicates whether
the invoking process can decide on the output value or not. The main functionality of the
Classifier procedure is either to tell the invoking process to decide, or to classify the invoking
process as a master or a slave. The details of the Classifier procedure are as follows:
Lines 1–3: The invoking process sends a message with its input value v and the threshold
value k to all. It then collects all the received values associated with the threshold value k in
a set U.
Lines 5–6: It checks whether all values in U are comparable to the input value. If they are
comparable, it terminates the Classifier procedure and returns the input value as the output
value and true as the decision status.
Lines 8–12: It performs classification based on the received values. Let w be the join of all
received values associated with the threshold value k. If the height of w in lattice L is greater
than the threshold value k, then the Classifier returns w as the output value, master as the
classification result, and false as the decision status. Otherwise, it returns the input value
as the output value, slave as the classification result, and false as the decision status. From
the classification steps, it is easy to see that the processes classified as masters have values
greater than those classified as slaves, because w is the join of all values in U.
There are four main differences between the AR-Classifier and our Classifier: 1) The
AR-Classifier is based on the shared memory model, whereas our algorithm is based on
synchronous message passing. 2) The AR-Classifier does not allow early termination. 3)
Each process in the AR-Classifier needs values from all processes, whereas our Classifier uses
values only from processes within its group. 4) The AR-Classifier procedure requires the
invoking process to read the values of all processes again if it is classified as a master,
whereas our algorithm needs to receive values from all processes only once.
3.1.2 Algorithm LAα
Algorithm LAα (shown in Figure 2) runs in at most log h(L) rounds. It assumes knowledge
of H = h(L), the height of the input lattice. Let x_i denote the initial input value of process
i, v_i^r denote the value held by process i at the beginning of round r, and class denote the
classification result of the Classifier procedure. The class indicates whether the process
is classified as a master or a slave. The decided variable shows whether the process has
decided or not. Each process i has a label denoted as l_i. This label is updated at each round.
Processes which have the same label l are said to be in the same group with label l. The
definitions of label and group are formally given as:
▶ Definition 3 (label). Each process has a label, which serves as a knowledge threshold and
is passed as the threshold value k whenever the process calls the Classifier procedure.
▶ Definition 4 (group). A group is a set of processes which have the same label. The label
of a group is the label of the processes in this group.
Classifier(v, k):
v: input value, k: threshold value
1: Send (v, k) to all
2: Receive messages of the form (-, k)
3: Let U be the set of values contained in the received messages
4: /* Early Termination */
5: if |U| = 0 or ∀u ∈ U : v ≤ u ∨ u ≤ v
6:   return (v, ⊥, true)
7: /* Classification */
8: Let w := ⊔{u : u ∈ U}
9: if h(w) > k
10:   return (w, master, false)
11: else
12:   return (v, slave, false)
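For illustration, the decision logic of the procedure above can be sketched locally as follows, modeling values as frozensets so that join is union and height is set size; the function names and the sample data are assumptions for this sketch, not part of the paper's pseudocode.

```python
# Sketch of the Classifier decision logic (Figure 1), run locally on
# the set U of values received with the same threshold k. Values are
# modeled as frozensets: join is union, height is cardinality.

def join(values):
    out = frozenset()
    for u in values:
        out |= u
    return out

def height(v):
    return len(v)

def classifier(v, k, U):
    """Returns (output value, classification, decision status)."""
    # Lines 4-6: early termination if every received value is
    # comparable with the input value v (or nothing was received).
    if not U or all(u <= v or v <= u for u in U):
        return (v, None, True)
    # Lines 7-12: classify around the threshold k.
    w = join(U)
    if height(w) > k:
        return (w, 'master', False)  # adopt the join, move up
    return (v, 'slave', False)       # keep own value, move down

# Incomparable values in U push the caller to master with the join:
U = {frozenset({1}), frozenset({2})}
out, cls, decided = classifier(frozenset({1}), 1, U)
assert (out, cls, decided) == (frozenset({1, 2}), 'master', False)
```

Note how the sketch makes the key invariant visible: a master's output is the join of everything received, so it dominates every slave's output in the same group.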
A process has decided if it has set its decision status to true. Otherwise, it is undecided.
At each round r, an undecided process invokes the Classifier procedure with its current value
and its current label li as parameters v and k, respectively. Since each process passes its
label as the threshold value k when invoking the Classifier procedure, line 2 of the Classifier
is equivalent to receiving messages from processes within the same group; that is, at each
round, a process performs the Classifier procedure within its group. Processes which are in
different groups do not affect each other. At round r, by invoking the Classifier procedure,
each process i sets v_i^{r+1}, class, and decided to the returned output value, classification
result, and decision status, respectively. Each process first checks the value of decided. If it is
true, process i decides on v_i^{r+1} and terminates the algorithm. Otherwise, if it is classified as
a master, it increases its label by H/2^{r+1}. If it is classified as a slave, it decreases its label
by H/2^{r+1}.
Now we show how the Classifier procedure combined with this label update mechanism
makes any two processes have comparable values at the end.
Let G be a group of processes at round r. Let M(G) and S(G) be the groups of processes
which are classified as master and slave, respectively, when they run the Classifier procedure
in group G. We say that G is the parent of M(G) and S(G). Thus, M(G) and S(G) are
both groups at round r + 1. Process i ∈ M(G) or i ∈ S(G) indicates that i does not decide
in group G at round r. Initially, all processes have the same label H/2 and are in the same
group with label H/2. When they execute the Classifier, they will be classified into different groups.
We can view the execution as processes traversing through a binary tree. Initially, all of
them are at the root of the tree. As the program executes, if they are classified as master,
then they go to the right child. Otherwise, they go to the left child.
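The tree traversal described above can be sketched as a crash-free, single-machine simulation. This is an illustrative sketch under the same set-based modeling assumptions as before (values are frozensets, join is union, height is cardinality); the function and variable names are not the paper's.

```python
# Self-contained sketch of LA_alpha's round structure for a crash-free
# run: values are frozensets, H is the known height of the input
# lattice, and "receiving from the group" is simulated by grouping
# processes with equal labels.
from math import ceil, log2

def join(vals):
    out = frozenset()
    for v in vals:
        out |= v
    return out

def la_alpha(inputs, H):
    n = len(inputs)
    vals = list(inputs)
    labels = [H / 2] * n                 # everyone starts at the root, label H/2
    decided = [False] * n
    for r in range(1, ceil(log2(H)) + 2):    # at most log H + 1 rounds
        groups = {}
        for i in range(n):
            if not decided[i]:
                groups.setdefault(labels[i], []).append(i)
        for k, members in groups.items():
            U = [vals[i] for i in members]   # values received within the group
            w = join(U)
            for i in members:
                if all(u <= vals[i] or vals[i] <= u for u in U):
                    decided[i] = True                    # early termination
                elif len(w) > k:
                    vals[i] = w                          # master: right child
                    labels[i] = k + H / 2 ** (r + 1)
                else:
                    labels[i] = k - H / 2 ** (r + 1)     # slave: left child
    return vals

out = la_alpha([frozenset({1}), frozenset({2}), frozenset({3})], H=3)
# Comparability: all outputs lie on a chain.
assert all(a <= b or b <= a for a in out for b in out)
```

In this particular run all three processes are classified as masters in round one and then terminate early with equal values, matching the guarantee that same-group processes end with equal values.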
Before we prove the correctness of the given algorithm, we first give some useful properties
satisfied by the Classifier procedure. Although Lemma 5 is similar to a lemma given in [5],
it is discussed here in message passing systems and the proofs are different.
▶ Lemma 5. Let G be a group at round r with label k. Let L and R be two non-negative
integers such that L ≤ k ≤ R. If L < h(v_i^r) ≤ R for every process i ∈ G, and
h(⊔{v_i^r : i ∈ G}) ≤ R, then
(p1) for each process i ∈ M(G), k < h(v_i^{r+1}) ≤ R
(p2) for each process i ∈ S(G), L < h(v_i^{r+1}) ≤ k
(p3) h(⊔{v_i^{r+1} : i ∈ M(G)}) ≤ R, and
(p4) h(⊔{v_i^{r+1} : i ∈ S(G)}) ≤ k
(p5) for each process i ∈ M(G), v_i^{r+1} ≥ ⊔{v_i^{r+1} : i ∈ S(G)}
Proof.
(p1)–(p3): Immediate from the Classifier procedure.
(p4): Since S(G) is a group of processes which are at round r + 1, all processes in S(G) are
correct (non-faulty) at round r. So, all processes in S(G) must have received the values of
each other in the Classifier procedure at round r in group G. Thus, h(⊔{v_i^{r+1} : i ∈ S(G)}) ≤ k;
otherwise, all of them would be in group M(G) instead of S(G), according to the condition
at line 9 of the Classifier procedure.
(p5): Since all processes in S(G) are correct at round r, all processes in M(G) must have
received the values of all processes in S(G) in the Classifier procedure at round r. Any
process which proceeds to group M(G) takes the join of all received values at round r,
according to line 10. Thus, for every process i ∈ M(G), v_i^{r+1} ≥ ⊔{v_i^{r+1} : i ∈ S(G)}. ◀
▶ Lemma 6. Let x be a value from a lattice L, and V be a set of values from L. Let U be
any subset of V. If x is comparable with every v ∈ V, then x is comparable with ⊔{u | u ∈ U}.
Proof. If ∀u ∈ U : u ≤ x, then ⊔{u | u ∈ U} ≤ x. Otherwise, ∃y ∈ U : x ≤ y. Since
y ≤ ⊔{u | u ∈ U}, we have x ≤ ⊔{u | u ∈ U}. ◀
▶ Lemma 7. If process i decides at round r on value y_i, then y_i is comparable with v_j^r for
any correct process j.
Proof. Let process i decide in group G at round r. Consider the two cases below:
Case 1: j ∉ G. Let G′ be a group at the maximum round r′ such that both i and j belong
to G′. Then, either i ∈ M(G′) ∧ j ∈ S(G′) or j ∈ M(G′) ∧ i ∈ S(G′). We only consider the
case i ∈ M(G′) ∧ j ∈ S(G′); the other case is similar. From (p5) of Lemma 5, we have
⊔{v_p^r : p ∈ S(G′)} ≤ y_i. Since j ∈ S(G′), v_j^r ≤ ⊔{v_p^r : p ∈ S(G′)}. Thus, v_j^r ≤ y_i. For
the other case, we have y_i ≤ v_j^r. Therefore, y_i is comparable with v_j^r.
Case 2: j ∈ G. Since process j is correct, i must have received v_j^r at round r. Thus,
by line 5 of the Classifier procedure, y_i is comparable with v_j^r. ◀
Now we show that any two processes decide on comparable values.
▶ Lemma 8 (Comparability). Let processes i and j decide on y_i and y_j, respectively. Then
y_i and y_j are comparable.
Proof. Let processes i and j decide at rounds r_i and r_j, respectively. Without loss of generality,
assume r_i ≤ r_j. At round r_i, from Lemma 7, y_i is comparable with v_k^{r_i} for any correct
undecided process k. Let V = {v_k^{r_i} | process k is undecided and correct}. Since r_j ≥ r_i,
y_j is at most the join of a subset of V. Thus, from Lemma 6, y_i and y_j are comparable. ◀
Now we prove that all processes decide within log H + 1 rounds by showing that all processes
in the same group at the beginning of round log H + 1 have equal values, given by Lemma 9
and Lemma 10. Since Lemma 9 and Lemma 10 and the corresponding proofs are similar to
the ones given in [3], the proofs are omitted here and can be found in the full paper. The
proof of Lemma 9 is based on (p1)–(p4) of Lemma 5, by induction. The proof of Lemma 10
is based on Lemma 9.
▶ Lemma 9. Let G be a group of processes at round r with label k. Then
(1) for each process i ∈ G, k − H/2^r < h(v_i^r) ≤ k + H/2^r
(2) h(⊔{v_i^r : i ∈ G}) ≤ k + H/2^r
▶ Lemma 10. Let i and j be two processes that are within the same group G at the beginning
of round r = log H + 1. Then v_i^r and v_j^r are equal.
▶ Lemma 11. All processes decide within log H + 1 rounds.
Proof. From Lemma 10, we know that any two processes which are in the same group at the
beginning of round log H + 1 have equal values. Then, the condition in line 5 of the Classifier
procedure is satisfied. Thus, all undecided processes decide at round log H + 1. ◀
▶ Remark 12. Since at the beginning of round log H + 1 all undecided processes have
comparable values, LAα only needs log H rounds. For simplicity, one more round is executed
to make all processes decide at line 5 of the Classifier procedure.
▶ Theorem 13. Algorithm LAα solves the lattice agreement problem in log H rounds and
can tolerate f < n failures.
Proof. Downward-Validity follows from the fact that the value of each process is
non-decreasing at each round. For Upward-Validity, according to the Classifier procedure, each
process either keeps its value unchanged or takes the join of the values proposed by other
processes, which can never be greater than ⊔{x_1, ..., x_n}. For Comparability, from Lemma 8,
we know that for any two processes i and j, if they decide, then their decision values must be
comparable. From Lemma 11, we know all processes decide. Thus, comparability holds. ◀
Complexity. The time complexity is log H rounds. For message complexity, since each process
sends n messages per round, log H rounds result in n² log H messages in total. Notice that
the number of messages can be further reduced by having each process keep a set of processes
which are not in its group. If a process p receives a message from process q with a threshold
value different from its own threshold value, it knows that q is not in its group. Each process
does not send messages to the processes in this set.
Algorithm LAα runs in log h(L) rounds by assuming that h(L) is given.
However, in order to know the actual height of the input lattice, we need to know how many
distinct values all processes propose, which requires extra effort. For this reason, in the
following sections, we introduce algorithms to solve the lattice agreement problem without
this assumption.
3.2 Lattice Agreement with Unknown Height
In this section, we consider the standard lattice agreement problem, in which the height of
the lattice is not known to any process. We propose an algorithm, LAβ (shown in Figure 3),
based on algorithm LAα.
3.2.1 Algorithm LAβ
Algorithm LAβ runs in log f + 1 synchronous rounds. It uses algorithm LAα as a
building block. Instead of directly agreeing on input values which are taken from a lattice
with unknown height, we first perform lattice agreement on the failure set that each process
knows after one round of broadcast. The set of all failure sets forms a boolean lattice with
union as the join operation and with height equal to f (since there are at most f failures).
The algorithm consists of two phases. In Phase A, all processes exchange their values.
Process i includes j in its failure set if it does not receive a value from process j in this phase.
After the first phase, each process has a failure set which contains the failed processes it
knows of. Then, in Phase B, the processes invoke algorithm LAα with f as the height and
their failure sets as input. After that, each process decides on a failure set which satisfies
the lattice agreement properties. The new failure sets of any two processes i and j are
comparable to each other, i.e., F_i′ is comparable with F_j′. Equipped with these comparable
failure sets, each process removes the values it received from processes which are in its failure
set and decides on the join of the remaining values.
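The final step of this description can be sketched as follows; the data and names are illustrative assumptions (values are frozensets with union as join, and the failure sets are taken as already agreed, i.e. comparable).

```python
# Sketch of LA_beta's output step. After Phase B the decided failure
# sets are comparable; each process drops the values of processes in
# its own failure set and joins the rest. All data is illustrative.

def join(vals):
    out = frozenset()
    for v in vals:
        out |= v
    return out

def decide(received, failure_set):
    """received: {sender: value} heard in Phase A;
    failure_set: this process's decided failure set F'."""
    kept = [v for j, v in received.items() if j not in failure_set]
    return join(kept)

# Process 1 heard from everyone and decided F1' = {};
# process 2 missed p3 and decided F2' = {3}. F1' <= F2' (comparable).
received1 = {1: frozenset({'a'}), 2: frozenset({'b'}), 3: frozenset({'c'})}
received2 = {1: frozenset({'a'}), 2: frozenset({'b'})}
F1, F2 = frozenset(), frozenset({3})
y1, y2 = decide(received1, F1), decide(received2, F2)
assert y2 <= y1    # comparable failure sets yield comparable outputs
```

The sketch shows why comparability of the failure sets is enough: with synchronous broadcast, two processes agree on the value of every common non-faulty sender, so the kept value sets, and hence their joins, are nested.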
The following lemma shows that any two processes decide on comparable values. We only
give a proof sketch; the detailed proof is available in the full paper.
▶ Lemma 14 (Comparability). Let processes i and j decide on y_i and y_j, respectively. Then
y_i and y_j are comparable.
Proof sketch. According to the comparability of LAα, all processes have comparable failure
sets. Then, the sets of values they received in Phase A from correct processes must be
comparable, i.e., C_i is comparable with C_j. Therefore, y_i and y_j are comparable. ◀
▶ Theorem 15. LAβ solves the lattice agreement problem in log f + 1 rounds, where f < n
is the maximum number of failures the system can tolerate.
Proof. Downward-Validity. Initially, for a correct process i, v_i = x_i. After Phase A, since i
is correct, i is not in the failure set of any process. In Phase B, process i invokes algorithm
LAα with its failure set as the input value. Thus, according to the Upward-Validity of LAα,
i is not included in F_i′. So, x_i ∈ C_i. Therefore, x_i ≤ y_i. Upward-Validity is immediate from
the fact that each process receives at most all the values of all processes. Comparability
follows from Lemma 14. ◀
3.2.2 Algorithm LAγ
Algorithm LAβ solves lattice agreement in log f + 1 rounds, whereas Algorithm LAα solves
lattice agreement in log h(L) rounds assuming h(L) is given. We now propose an algorithm
to solve lattice agreement whose round complexity is related to h(L) even when h(L) is not
known. This algorithm, called LAγ (shown in Figure 4), solves the standard lattice agreement
problem in min{O(log² h(L)), O(log² f)} rounds. The basic idea is to "guess" the height of L
and apply algorithm LAα using the guessed height as input. The algorithm is composed of
two phases. In Phase A, each process simply broadcasts its value and takes the join of all
received values. Phase B is the guessing phase, which invokes algorithm LAα repeatedly.
Notice that the decided variable is updated at line 6 of LAα.
Let wi denote the value of vi after Phase A. Let Λ denote the sublattice formed by the values
of all correct processes after Phase A, i.e., Λ = {u | (u ∈ L) ∧ (∃i : wi ≤ u)}. Since there are
at most f failures, we have h(Λ) ≤ f. We now show that Phase B terminates in at most
⌈log h(Λ)⌉ executions of LAβ. We call the i-th execution of LAβ iteration i. Notice that
the guessed height in iteration i is 2^i.
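To illustrate the structure of the guessing phase, the following minimal Python sketch (our illustration, not the paper's pseudocode) drives the doubled guesses; `run_la_beta` is a hypothetical stand-in for one execution of LAβ with a given height guess, returning True once this process decides.

```python
# Sketch of LA-gamma's Phase B: repeatedly guess the height of the
# sublattice, doubling the guess each iteration, so iteration i uses 2**i.
# `run_la_beta` stands in for one LA-beta execution with the guessed height.

def phase_b(run_la_beta, max_iterations=32):
    rounds_used = 0
    for i in range(1, max_iterations + 1):
        guess = 2 ** i
        rounds_used += i            # LA-beta with guess 2**i takes i rounds
        if run_la_beta(guess):
            return guess, rounds_used
    raise RuntimeError("max_iterations too small")

# Toy stand-in: the process decides once the guess reaches the true
# height of the sublattice (pretend h(Lambda) = 13, so 4 iterations run).
decided_at, total_rounds = phase_b(lambda g: g >= 13)
```

With a true height of 13, iterations 1 through ⌈log 13⌉ = 4 run, so the guesses end at 16 and the round count is 1 + 2 + 3 + 4 = 10, matching the quadratic round bound analyzed below.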
Lemma 16. After iteration ⌈log h(Λ)⌉ of LAβ in Phase B, all processes decide.
Proof. Since 2^⌈log h(Λ)⌉ ≥ h(Λ), Lemma 9 still holds, which implies Lemma 10. Thus, all
undecided processes have equal values at the last round of iteration ⌈log h(Λ)⌉. Therefore,
all undecided processes decide after iteration ⌈log h(Λ)⌉. ∎
We now show that two processes decide on comparable values irrespective of whether
they decide in the same iteration of LAβ.
Lemma 17 (Comparability). Let i and j be any two processes that decide on values yi and yj, respectively. Then yi and yj are comparable.
Proof. Assume process i decides on yi at round ri of execution ei of LAβ and process j
decides on yj at round rj of execution ej of LAβ. If ei = ej, then yi and yj are comparable
by Lemma 8. Otherwise, ei ≠ ej; without loss of generality, suppose ei < ej. Consider round
ri of execution ei of LAβ. Since i decides on value yi at this round, from Lemma 7 we
have that yi is comparable with v_k^{ri} for any correct process k. Let V = {v_k^{ri} | k is correct}.
Then yj is at most the join of a subset of V. From Lemma 6, it follows that yi is comparable
with yj. ∎
Theorem 18. LAγ solves the lattice agreement problem and can tolerate f < n failures.
Proof. DownwardValidity follows from the fact that the value of each process is
non-decreasing along the execution. UpwardValidity follows since each process can receive at
most all values from all processes. Comparability holds by Lemma 17. ∎
Complexity. From Lemma 16, we know Phase B terminates in at most ⌈log h(Λ)⌉ executions
of LAβ. Since iteration i contributes log 2^i = i rounds, Phase B takes
1 + 2 + ... + ⌈log h(Λ)⌉ = ⌈log h(Λ)⌉ (⌈log h(Λ)⌉ + 1) / 2
rounds in the worst case. Since h(Λ) ≤ f and h(Λ) ≤ h(L), LAγ has round complexity
min{O(log² h(L)), O(log² f)}. Each process sends n messages in each round, so the message
complexity is n² · min{O(log² h(L)), O(log² f)}.
Figure 5 Algorithm LAδ for process pi (code for handling proposals):

acceptVal := xi // accept value
learnedVal := ⊥ // learned value

on receiving prop(vj, r) from pj:
    if vj ≥ acceptVal:
        Send ACK("accept", ⊥, r)
        acceptVal := vj
    else:
        Send ACK("reject", acceptVal, r)

4 Asynchronous Lattice Agreement
In this section, we discuss the lattice agreement problem in asynchronous systems. The
algorithm proposed in [7] requires O(n) units of time, whereas our algorithm (LAδ, shown in
Figure 5) requires only O(f) units of time. We first note the following impossibility result.
Theorem 19. The lattice agreement problem cannot be solved in asynchronous message
passing systems if f ≥ n/2.
Proof. The proof follows from the standard partition argument. If two partitions hold
incomparable values, then they can never decide on comparable values. ∎
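The n/2 threshold can be checked with a small quorum-counting sketch (our illustration, not from the paper): a "quorum" here is any set of n − f processes a roundtrip can hear from. If two disjoint quorums exist, a partition can sustain two independent executions with incomparable outputs; if f < n/2, any two quorums intersect.

```python
# Check whether two disjoint sets of n - f processes can coexist among n.
from itertools import combinations

def disjoint_quorums_exist(n, f):
    q = n - f
    return any(set(a).isdisjoint(b)
               for a in combinations(range(n), q)
               for b in combinations(range(n), q))

assert disjoint_quorums_exist(n=4, f=2)       # f >= n/2: partition possible
assert not disjoint_quorums_exist(n=5, f=2)   # f < n/2: quorums intersect
```

The second case (every pair of (n − f)-quorums shares a process when f < n/2) is exactly the intersection property that the comparability arguments below rely on.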
4.1 Algorithm LAδ
On account of Theorem 19, we assume that f < n/2. The algorithm proceeds in roundtrips. A
single roundtrip is composed of sending messages to all processes and getting n − f acknowledgement
messages back. In each roundtrip, a process sends a prop message to all, with its current
accepted value as the proposal value, and waits for n − f ACK messages. If a majority of
these ACK messages are accept, then it decides on its current proposed value. Otherwise, it
updates its current accept value to the join of all values received and starts the next roundtrip.
Whenever a process receives a proposal, i.e., a prop message, if the proposal carries a value at
least as big as its current value, it sends back an ACK message with accept and updates
its current accept value to the received proposal value. Otherwise, it sends back an ACK
message with reject.
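The proposer side of this loop can be sketched as follows (our illustration, not the paper's pseudocode). Lattice values are modeled as frozensets ordered by inclusion, with union as the join; `send_round` is a hypothetical transport hook that broadcasts the proposal and returns the first n − f ACKs, each a pair (verdict, value); the decision check here counts accepts against a majority of all n processes, the quorum used in the comparability proof.

```python
# Sketch of the LA-delta proposer loop.

def propose(x, send_round, n, f):
    accept_val = x
    while True:
        acks = send_round(accept_val)      # blocks until n - f ACKs arrive
        assert len(acks) == n - f
        accepts = sum(1 for verdict, _ in acks if verdict == "accept")
        if accepts > n // 2:               # accepted by a majority
            return accept_val              # decide on the current proposal
        # otherwise join in every value carried by a reject ACK
        for verdict, value in acks:
            if verdict == "reject":
                accept_val = accept_val | value

# Toy run with n = 3, f = 1: the first roundtrip sees one reject
# carrying {2}; the second roundtrip gets two accepts.
replies = iter([
    [("accept", None), ("reject", frozenset({2}))],
    [("accept", None), ("accept", None)],
])
decided = propose(frozenset({1}), lambda v: next(replies), n=3, f=1)
```

In the toy run the rejected roundtrip forces the proposal up to {1, 2}, which is then accepted, mirroring how each failed roundtrip strictly increases the accept value.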
Let acceptVal_i^r denote the accept value (variable acceptVal) held by pi at the beginning
of roundtrip r. Let L(r) = {u | (u ∈ L) ∧ (∃i : acceptVal_i^r ≤ u)}, i.e., L(r) denotes the
join-closed subset of L that includes the accept values held by all undecided processes at the
beginning of roundtrip r. Notice that L(1) = L.
Lemma 20. For any roundtrip r, h(L(r+1)) < h(L(r)).
Proof. If a process decides at roundtrip r, its value is not in L(r+1). So we only need to show
that h(acceptVal_i^r) < h(acceptVal_i^{r+1}) for any process i that does not decide at roundtrip
r. The fact that process i does not decide at roundtrip r implies that i must have received at
least one reject ACK carrying a greater value. Since acceptVal_i^{r+1} is the join of all values received
at roundtrip r, acceptVal_i^r < acceptVal_i^{r+1}. Hence, h(acceptVal_i^r) < h(acceptVal_i^{r+1}) for
any undecided process i. Therefore, h(L(r+1)) < h(L(r)). ∎
Lemma 21. All processes decide within min{h(L), f + 1} asynchronous roundtrips.
Proof. We first show that h(L(2)) ≤ f. In the first roundtrip, each process receives
n − f ACKs, i.e., values from at least n − f of the n processes, so at most f input values
can be missing from any accept value at the start of roundtrip 2. Therefore, h(L(2)) ≤ f. Let
r_min = min{h(L), f + 1}. Combining the fact that h(L(2)) ≤ f with Lemma 20, we have
h(L(r_min)) ≤ 1. This means that all undecided correct processes have the same value. Thus, all
of them receive n − f ACK messages with accept and decide. Therefore, all processes decide
within min{h(L), f + 1} roundtrips. ∎
We note here that the algorithm in [7] takes O(n) message delays for a value to be learned
in the worst case. A crucial difference between LAδ and the algorithm in [7] is that LAδ
starts with the accepted value equal to the input value. Hence, after the first roundtrip, there is a
significant reduction in the height of the sublattice, from n initially (in the worst case) to f.
In [7], acceptors start with the accepted value as null; hence the height is reduced by
only 1 per roundtrip in the worst case. Since in their algorithm acceptors are different from proposers (in
the style of Paxos), acceptors do not have access to the proposed values.
Theorem 22. Algorithm LAδ solves the lattice agreement problem in min{h(L), f + 1}
roundtrips.
Proof. DownwardValidity holds since the accept value of any process i is non-decreasing.
UpwardValidity follows because each learned value must be the join of a subset of all initial
values, which is at most ⊔{x1, ..., xn}. For Comparability, suppose processes i and j decide on
values yi and yj. Since deciding requires accept ACKs from a majority, and any two majorities
intersect, there must be at least one process that has accepted both yi and yj. Because a
process only accepts values at least as big as its current accept value, the values accepted by a
single process form a chain. Thus, we have either yi ≤ yj or yj ≤ yi. ∎
Complexity. From Lemma 21, we know that LAδ takes at most min{h(L), f + 1}
roundtrips, which results in 2 · min{h(L), f + 1} message delays, since one roundtrip takes two
message delays. In each roundtrip, each process sends out at most 2n messages. Thus, the
total number of messages over all processes is at most 2 · n² · min{h(L), f + 1}.
5 Generalized Lattice Agreement
In this section, we discuss the generalized lattice agreement problem as defined in Section
2.3. Since it is easy to adapt algorithms for lattice agreement in synchronous systems to
solve the generalized lattice agreement problem, we only consider asynchronous systems. We
show how to adapt LAδ to solve the generalized lattice agreement problem (algorithm GLAδ,
shown in Figure 6) in min{O(h(L)), O(f)} units of time.
5.1 Algorithm GLAδ
GLAδ invokes the Agree() procedure multiple times to learn new values. The Agree()
procedure is an execution of LAδ with some modifications (given later). A sequence
number is associated with each execution of the Agree() procedure; thus each correct process
has a learned value for each sequence number. The basic idea of GLAδ is to let all processes
sequentially execute LAδ to learn values while ensuring that: 1) any two learned values for the
same sequence number are comparable, and 2) any learned value for a bigger sequence number is
at least as big as any learned value for a smaller sequence number. The first goal can be
achieved simply by tagging each LAδ invocation with the sequence number. For the second
goal, the key idea is to make any proposal for sequence number s + 1 at least as big
as the largest learned value for sequence number s. Notice that in each roundtrip of an LAδ
execution, a process waits for n − f ACKs, and any two sets of n − f processes have at least
one process in common. Thus, the second goal can be achieved by ensuring that at least n − f
processes know the largest learned value after the execution of LAδ for a sequence number.
Upon receiving a value v from a client in a message tagged ClientValue, a process adds
v to its buffer and sends a ServerValue message with v to all other processes. A process
can start to learn new values only when it succeeds with its current proposal; otherwise, LAδ
may not terminate, as shown by an example in [7]. Upon receiving a ServerValue message
with value v, a process simply adds v to its buffer.
The Agree() procedure is automatically executed when the guard condition is satisfied,
that is, when the process is not currently proposing a value and it either has some value in its
buffer or has seen a sequence number bigger than its current sequence number. Inside the
Agree() procedure, a process first updates its acceptVal to the join of the current acceptVal
and buffVal. Then it starts an adapted LAδ execution. The original LAδ and the adapted
LAδ differ in the following ways: 1) Each message in the adapted LAδ is associated with a
sequence number. 2) A process can also decide on a value for a sequence number if it receives
any decide ACK message for that sequence number. 3) On receiving a prop message associated
with a sequence number s′: if s′ is smaller than the process's current sequence number, which
means it has already learned a value for s′, it simply sends back an ACK message with its
learned value for s′. If s′ is greater than its current sequence number, it updates its maxSeq
and waits until its current sequence number matches s′; after that, it sends back an ACK
message with accept or reject depending on whether the proposal value is at least its current
accept value. The reason a process keeps track of the maximum sequence number it has seen
is to make sure each process obtains a learned value for every sequence number: when the
maximum sequence number is bigger than its current sequence number, it has to invoke the
Agree() procedure even if it has no new value to propose. After the execution of the adapted
LAδ, a process increments its current sequence number.
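The guard and sequence-number bookkeeping can be sketched as follows (our illustration, not the paper's pseudocode). `agree_once` is a hypothetical stand-in for one adapted LAδ execution that returns the value learned for the current sequence number; values are frozensets under union join.

```python
# Sketch of GLA-delta's outer loop for one process: the Agree() guard
# fires when the process is not already proposing and either has
# buffered values or has seen a larger sequence number.

def run(p, agree_once):
    while p["buff_val"] or p["max_seq"] > p["s"]:
        p["active"] = True
        p["accept_val"] |= p["buff_val"]   # fold buffered values into proposal
        p["buff_val"] = frozenset()
        learned = agree_once(p["accept_val"], p["s"])
        p["LV"][p["s"]] = learned          # learned value for this seq number
        p["accept_val"] |= learned         # acceptVal never shrinks
        p["s"] += 1                        # advance to the next seq number
        p["active"] = False

p = {"s": 0, "max_seq": 0, "buff_val": frozenset({1, 2}),
     "accept_val": frozenset(), "LV": {}, "active": False}
run(p, lambda v, s: v)   # stand-in agreement: learn exactly the proposal
```

Because acceptVal only grows and folds in every learned value before the next invocation, each proposal for sequence number s + 1 is at least the value this process learned for s, which is the per-process half of the Stability argument below.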
We next show the correctness of GLAδ. Let acceptVal_p^s denote the acceptVal of process p
at the end of the Agree() procedure for sequence number s. Let LVp denote the map from sequence
number to learned value (variable LV) for process p, and let ms = ⊔{LVp[s] : p ∈ [1..n]}, i.e.,
ms denotes the join of all learned values for sequence number s. Let LPs = {p | (p ∈
[1..n]) ∧ (ms ≤ acceptVal_p^s)}, i.e., LPs is the set of processes whose acceptVal is at least
the join of all learned values for sequence number s. Notice that a process has
two ways to learn a value for its current sequence number in the Agree() procedure: 1) by
receiving a majority of accept ACKs, or 2) by receiving some decide ACK.
The following lemma proves that the adapted LAδ satisfies the first goal.
Lemma 23. For any sequence number s, LVp[s] is comparable with LVq[s] for any two
processes p and q.
Proof. We only need to show that any two processes that learn in the first way learn
comparable values, since processes that learn in the second way simply adopt values from
processes that learn in the first way. This holds by the same reasoning as Comparability in
Theorem 22. ∎
From Lemma 23, we know that ms is the largest learned value for sequence number s.
Lemma 24. For any sequence number s, |LPs| > n/2.
Figure 6 Algorithm GLAδ for process pi (state and message handlers):

s := 0 // sequence number
maxSeq := 1 // max sequence number seen
buffVal := ⊥ // received values
LV := ∅ // map from sequence number to learned value
acceptVal := ⊥
active := false

on receiving ClientValue(v):
    buffVal := buffVal ⊔ v
    Send ServerValue(v) to all

on receiving ServerValue(v):
    buffVal := buffVal ⊔ v

on receiving prop(vj, r, s′) from pj:
    if s′ < s:
        Send ACK("decide", LV[s′], r, s′)
        break
    if s′ > s:
        maxSeq := max{s′, maxSeq}
        wait until s = s′
    if vj ≥ acceptVal:
        Send ACK("accept", ⊥, r, s′)
        acceptVal := vj
    else:
        Send ACK("reject", acceptVal, r, s′)
Proof. Consider the Agree() procedure for s. Since ms is the largest learned value for sequence
number s, there must exist a process p that learns ms in the first way. Thus, p must
have received a majority of accept ACKs, which means at least a majority of processes have
acceptVal at least ms after the Agree() procedure for s. Therefore, |LPs| > n/2. ∎
The lemma below shows that GLAδ achieves the second goal.
Lemma 25. ms ≤ LVp[s + 1] for any process p and any sequence number s.
Proof. From Lemma 24, we know that for sequence number s at least a majority of processes
have acceptVal at least ms. To decide on LVp[s + 1], process p must get a majority of accepts.
Since any two majorities have at least one process in common, ms ≤ LVp[s + 1]. ∎
Theorem 26. Algorithm GLAδ solves generalized lattice agreement when a majority of the
processes are correct.
Proof. Validity holds since any learned value is the join of a subset of the values received.
Stability: from Lemma 25 and the fact that LVp[s] ≤ ms, we have LVp[s] ≤ LVp[s + 1]
for any process p and any sequence number s, which implies Stability.
Comparability: we need to show that LVp[s] and LVq[s′] are comparable for any two
processes p and q and any two sequence numbers s and s′. If s = s′, this is immediate
from Lemma 23. Now consider the case s ≠ s′; without loss of generality, assume
s < s′. From Lemma 25 and Stability, LVp[s] ≤ ms ≤ LVq[s + 1] ≤ LVq[s′]. Thus, Comparability holds.
Liveness: any received value v is eventually included in some proposal, i.e., a prop message.
From Theorem 22, within at most 2 · min{h(L), f + 1} message delays that proposal
value will be included in some learned value. Thus, v is eventually learned. ∎
Complexity. For time complexity, from the liveness analysis in Theorem 26, a received
value is learned within at most 2 · min{h(L), f + 1} message delays. For message
complexity, since each process sends out n messages per roundtrip, the total number of
messages needed to learn a value is 2 · n² · min{h(L), f + 1}.
6 Conclusions
We have presented algorithms for the lattice agreement problem and the generalized lattice
agreement problem. These algorithms achieve significantly better time complexity than
previous algorithms. For future work, we would like to know the answers to the following
two questions: 1) Is log f rounds a lower bound for lattice agreement in synchronous message
passing systems? 2) Are O(f) message delays optimal for the lattice agreement and generalized
lattice agreement problems in asynchronous message passing systems?
References
1. Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt, and Nir Shavit. Atomic snapshots of shared memory. Journal of the ACM (JACM), 40(4):873-890, 1993.
2. Hagit Attiya, Maurice Herlihy, and Ophir Rachman. Atomic snapshots using lattice agreement. Distributed Computing, 8(3):121-132, 1995.
3. Hagit Attiya and Ophir Rachman. Atomic snapshots in O(n log n) operations. SIAM Journal on Computing, 27(2):319-340, 1998.
4. Hagit Attiya and Jennifer Welch. Distributed computing: fundamentals, simulations, and advanced topics, volume 19. John Wiley & Sons, 2004.
5. Carole Delporte-Gallet, Hugues Fauconnier, Sergio Rajsbaum, and Michel Raynal. Implementing snapshot objects on top of crash-prone asynchronous message-passing systems. In International Conference on Algorithms and Architectures for Parallel Processing, pages 341-355. Springer, 2016.
6. Danny Dolev and H. Raymond Strong. Authenticated algorithms for Byzantine agreement. SIAM Journal on Computing, 12(4):656-666, 1983.
7. Jose M. Faleiro, Sriram Rajamani, Kaushik Rajan, G. Ramalingam, and Kapil Vaswani. Generalized lattice agreement. In Proceedings of the 2012 ACM Symposium on Principles of Distributed Computing, pages 125-134. ACM, 2012.
8. Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM (JACM), 32(2):374-382, 1985.
9. Maurice P. Herlihy and Jeannette M. Wing. Linearizability: a correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems (TOPLAS), 12(3):463-492, 1990.
10. Leslie Lamport. The part-time parliament. ACM Transactions on Computer Systems (TOCS), 16(2):133-169, 1998.
11. Leslie Lamport et al. Paxos made simple. ACM SIGACT News, 32(4):18-25, 2001.
12. Marios Mavronicolas. A bound on the rounds to reach lattice agreement, 2000. URL: http://www.cs.ucy.ac.cy/~mavronic/pdf/lattice.pdf.
13. Michel Raynal. Concurrent programming: algorithms, principles, and foundations. Springer Science & Business Media, 2012.
14. Fred B. Schneider. Implementing fault-tolerant services using the state machine approach: a tutorial. ACM Computing Surveys (CSUR), 22(4):299-319, 1990.
15. Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. Conflict-free replicated data types. In Symposium on Self-Stabilizing Systems, pages 386-400. Springer, 2011.
16. Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. Convergent and commutative replicated data types. Bulletin of the European Association for Theoretical Computer Science, 104:67-88, 2011.
17. Andrew S. Tanenbaum and Maarten Van Steen. Distributed systems: principles and paradigms. Prentice-Hall, 2007.
18. Gadi Taubenfeld. Synchronization algorithms and concurrent programming. Pearson Education, 2006.