Loosely-Stabilizing Leader Election on Arbitrary Graphs in Population Protocols Without Identifiers nor Random Numbers
OPODIS
Yuichi Sudo
NTT Secure Platform Laboratories, Tokyo, Japan; and Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
Fukuhito Ooshita
Graduate School of Information Science, Nara Institute of Science and Technology, Nara, Japan
Hirotsugu Kakugawa
Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
Toshimitsu Masuzawa
Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
In the population protocol model Angluin et al. proposed in 2004, there exists no self-stabilizing leader election protocol for complete graphs, arbitrary graphs, trees, lines, degree-bounded graphs and so on unless the protocol knows the exact number of nodes. To circumvent the impossibility, we introduced the concept of loose-stabilization in 2009, which relaxes the closure requirement of self-stabilization. A loosely-stabilizing protocol guarantees that, starting from any initial configuration, a system reaches a safe configuration, and after that, the system keeps its specification (e.g. the unique leader) not forever, but for a sufficiently long time (e.g. exponentially large time with respect to the number of nodes). Our previous works presented two loosely-stabilizing leader election protocols for arbitrary graphs; one uses agent identifiers and the other uses random numbers to elect a unique leader. In this paper, we present a loosely-stabilizing protocol that solves leader election on arbitrary graphs without agent identifiers or random numbers. By combining virus-propagation and token-circulation, the proposed protocol achieves polynomial convergence time and exponential holding time without such external entities. Specifically, given upper bounds $N$ and $\Delta$ of the number of nodes $n$ and the maximum degree of nodes $\delta$ respectively, it reaches a safe configuration within $O(mn^3 d + mN\Delta^2 \log N)$ expected steps, and keeps the unique leader for $\Omega(N e^N)$ expected steps, where $m$ is the number of edges and $d$ is the diameter of the graph. To measure the time complexity of the protocol, we assume the uniformly random scheduler, which is widely used in the field of population protocols.
1998 ACM Subject Classification G.2.2 Graph Theory
Keywords and phrases Loose-stabilization, Population protocols, Leader election

1 Introduction
This paper focuses on self-stabilizing leader election in the population protocol model. The
population protocol (PP) model, which was presented by Angluin et al. [1], represents wireless
sensor networks of mobile sensing devices that cannot control their movement. Two devices
(called agents) communicate with each other and change their states only when they come
sufficiently close to each other (we call this event an interaction). Self-stabilizing leader
election (SSLE) requires that, starting from any configuration, a system (called a population)
reaches a safe configuration in which a unique leader is elected, and after that, the population
keeps the unique leader forever. Self-stabilizing leader election is important in the PP model
because (i) many population protocols in the literature work on the assumption of a unique
leader [1, 2, 3], and (ii) self-stabilization tolerates any finite number of transient faults, and
this property suits systems consisting of numerous cheap and unreliable nodes. (Such systems
are the original motivation of the PP model.) However, SSLE is strictly impossible in the
PP model: no protocol solves SSLE for complete graphs, arbitrary graphs,
trees, lines, degree-bounded graphs and so on unless the number of agents $n$ is available to
agents in advance [3].
Therefore, many studies of SSLE took one of the following two approaches. One
approach is to accept the assumption that the exact $n$ is available and focus on the space
complexity of the protocol. Cai et al. [6] proved that $n$ states per agent are necessary and
sufficient to solve SSLE for a complete graph of $n$ agents. Mizoguchi et al. [12] and Xu et al.
[15] improved the space complexity by adopting the mediated population protocol model [10]
and the PPk model [5] respectively. The other approach is to use oracles, a kind of failure
detector. Fischer and Jiang [8] took this approach for the first time. They introduced oracle
$\Omega?$ that informs all agents whether a leader exists or not and proposed two protocols that
solve SSLE for rings and complete graphs by using $\Omega?$. Beauquier et al. [4] presented an
SSLE protocol for arbitrary graphs that uses two copies of $\Omega?$. Canepa et al. [7] proposed
two SSLE protocols that use $\Omega?$ and consume only 1 bit per agent: one is a deterministic
protocol for trees and the other is a probabilistic protocol for arbitrary graphs, although in the
latter the position of the leader is not static and moves among the agents.
Our previous works [13, 14] took another approach to solve SSLE. We introduced the
concept of loose-stabilization, which relaxes the closure requirement of self-stabilization.
Specifically, starting from any initial configuration, the population must reach a safe
configuration within a relatively short time; after that, the specification of the problem (the
unique leader) must be kept for a sufficiently long time, though not forever. We proposed
three loosely-stabilizing protocols $P_{LE}$, $P_{ID}$, and $P_{RD}$. Protocol $P_{LE}$ solves leader election
for complete graphs whose size is no more than a given upper bound $N$ of $n$. Protocols $P_{ID}$ and
$P_{RD}$ solve leader election for arbitrary graphs using agent identifiers and random numbers
respectively, given $N$ and an upper bound $\Delta$ of the maximum degree of nodes $\delta$. All three
protocols are practically equivalent to an SSLE protocol since they keep the specification for
an exponentially long time after reaching a safe configuration (and reach a safe configuration
within polynomial time).
Some works on population protocols assume a probability distribution on
the interactions of agents: any interaction occurs uniformly at random [1, 2, 9, 13, 14].
This assumption has been used mainly for evaluating the time complexity of protocols.
We also adopt this assumption because the measure of time is crucial in the concept of
loose-stabilization. The impossibility result for SSLE [1] still holds even with this assumption.
Our Contribution
This paper proposes a loosely-stabilizing protocol $P_{AR}$ for leader election in arbitrary graphs
without agent identifiers or random numbers (that is, in a model with a weaker assumption than
that of $P_{ID}$ or $P_{RD}$). Thus, we succeed in removing the assumptions of unique identifiers and random
number generators for loosely-stabilizing leader election on arbitrary graphs in the PP
model. These assumptions may be difficult to realize in weak computation models such as the PP model,
which consists of a huge number of tiny devices with restricted capability.
The expected convergence time and the expected holding time of $P_{ID}$, $P_{RD}$, and $P_{AR}$
are shown in Table 1, where $d$ is the diameter of the graph. All the protocols including
$P_{AR}$ keep the unique leader for an exponentially long time ($\Omega(N e^N)$ interactions) after reaching a
safe configuration. Protocol $P_{AR}$ consumes $O(\log N)$ bits of each agent's memory while
any self-stabilizing protocol (which uses knowledge of the exact $n$) consumes $\Omega(\log n)$ bits of memory
[6]. Furthermore, Izumi [9] proves that loosely-stabilizing leader election with polynomial
convergence time and exponentially long holding time needs $\Omega(\log n)$ bits of agent memory. Thus,
$P_{AR}$ is asymptotically space-optimal when $N$ is polynomial in $n$. One may think that the
model of anonymous agents and $O(\log N)$ agent memory is not well-motivated because
$O(\log n)$ memory is sufficient to store an identifier. However, we believe that anonymity is
still an important assumption: assigning distinct identifiers to a huge number of agents is
not an easy task, and memory corruption may cause conflicts between identifiers of different agents.
Indeed, many works assume anonymity and agent memory space of $O(\log n)$ or more (e.g.
[3, 6, 12, 13, 14, 15]). In this paper, we analyze time complexities for undirected graphs for
simplicity; however, the protocol works on any directed graph without modification.
While protocol $P_{AR}$ is based on the virus war mechanism developed for $P_{RD}$ [14], the key
idea of $P_{AR}$ is novel and is a contribution in its own right: a token with a countdown
timer circulates in the graph, and a leader creates and spreads a black or white virus when
encountering the token with zero timer value. The idea of circulating tokens and the colors
of viruses are newly introduced to remove the assumption of random number generators.
This technique may also be useful for other problems and/or other models.
The formal analysis of the convergence time and the holding time is another main
contribution of this paper, since analyzing such complexities of loosely-stabilizing protocols is
a challenging task. In particular, we analyze the expected time until two tokens performing
random walks meet in the PP model. The analysis can be applied with slight modification
to estimate the expected time until a token performing a random walk visits all nodes. We
believe that the analysis techniques are of significant importance because existing analyses for
usual random walks cannot be applied to the population protocol model: the token always
moves through an edge at each step in usual random walks while, in the population protocol
model, the token moves at each step with a probability depending on the degree of the node
the token currently resides on. Thus, the techniques we developed open up a new path to the
analysis of loosely-stabilizing protocols in the PP model.
Angluin et al. [1] prove that for any population protocol $P$ working on complete graphs,
there exists a protocol that simulates $P$ on any arbitrary graph. One may think that this
simulator can translate our previous loosely-stabilizing algorithm for complete graphs [13]
to a loosely-stabilizing algorithm that works for arbitrary graphs. However, this does not work
since, in this simulation, two agents swap their states when they interact. This
swap is needed to simulate interactions between distant agents in an arbitrary graph, but it
results in an execution where an elected leader moves among the population, which does
not satisfy the specification of leader election.
2 Preliminaries
This section defines the model we consider for this paper.
A population is a simple and weakly-connected directed graph $G(V, E)$ where $V$ ($|V| \ge 2$)
is a set of agents and $E \subseteq V \times V$ is a set of directed edges. Each edge represents a possible
interaction (or communication between two agents): if $(u, v) \in E$, agents $u$ and $v$ can
interact with each other where $u$ serves as an initiator and $v$ serves as a responder. We say
that $G$ is undirected if it satisfies $(u, v) \in E \Rightarrow (v, u) \in E$. We define $n = |V|$ and $m = |E|$.
A protocol $P(Q, Y, T, O)$ consists of a finite set $Q$ of states, a finite set $Y$ of output
symbols, transition function $T : Q \times Q \to Q \times Q$, and output function $O : Q \to Y$. When an
interaction between two agents occurs, $T$ determines the next states of the two agents based
on their current states. The output of an agent is determined by $O$: the output of agent $v$
with state $q \in Q$ is $O(q)$.
A configuration is a mapping $C : V \to Q$ that specifies the states of all the agents. We
denote the set of all configurations of protocol $P$ by $\mathcal{C}_{all}(P)$. We say that configuration $C$
changes to $C'$ by interaction $e = (u, v)$, denoted by $C \xrightarrow{e} C'$, if we have $(C'(u), C'(v)) =
T(C(u), C(v))$ and $C'(w) = C(w)$ for all $w \in V \setminus \{u, v\}$. A scheduler determines which
interaction occurs at each time. In this paper, we consider a uniformly random scheduler
$\Gamma = \Gamma_0, \Gamma_1, \dots$: each $\Gamma_t \in E$ is a random variable such that $\Pr(\Gamma_t = (u, v)) = 1/m$ for
any $t \ge 0$ and any $(u, v) \in E$. Given an initial configuration $C_0$ and $\Gamma$, the execution of
protocol $P$ is defined as $\Xi_P(C_0, \Gamma) = C_0, C_1, \dots$ such that $C_t \xrightarrow{\Gamma_t} C_{t+1}$ for all $t \ge 0$. We
denote $\Xi_P(C_0, \Gamma)$ simply by $\Xi_P(C_0)$ when no confusion occurs.
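The execution model above can be made concrete with a short simulation. The following is only a minimal sketch, assuming a configuration is stored as a Python dict from agents to states; the function `run` and the toy transition `toy_transition` are hypothetical names introduced here for illustration and are not part of the paper.

```python
import random


def run(edges, config, transition, steps, seed=0):
    """Run `steps` interactions under a uniformly random scheduler.

    edges      -- list of directed edges (initiator, responder)
    config     -- dict mapping each agent to its state (the configuration C_0)
    transition -- function T(p, q) -> (p', q')
    Returns the list of configurations C_0, C_1, ..., C_steps.
    """
    rng = random.Random(seed)
    execution = [dict(config)]
    for _ in range(steps):
        u, v = rng.choice(edges)            # each edge is chosen with probability 1/m
        nxt = dict(execution[-1])
        nxt[u], nxt[v] = transition(nxt[u], nxt[v])
        execution.append(nxt)
    return execution


def toy_transition(p, q):
    """Toy rule (illustration only): when two leaders meet, the responder is demoted."""
    return (p, "F") if p == "L" and q == "L" else (p, q)


# An undirected ring of four agents: both directions of every edge are present.
ring = [(i, (i + 1) % 4) for i in range(4)]
ring += [(v, u) for (u, v) in ring]
execution = run(ring, {i: "L" for i in range(4)}, toy_transition, steps=200)
```

Fixing the seed makes a run reproducible, which is convenient when comparing executions that start from different initial configurations.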
The leader election problem requires that every agent should output $L$ or $F$, which
means "leader" or "follower" respectively. We say that a finite or infinite sequence of
configurations $\xi = C_0, C_1, \dots$ preserves a unique leader, denoted by $\xi \in LE$, if there exists
$v \in V$ such that $O(C_t(v)) = L$ and $O(C_t(u)) = F$ for any $t \ge 0$ and $u \in V \setminus \{v\}$. For
$\xi = C_0, C_1, \dots$, the holding time of the leader $HT(\xi, LE)$ is defined as the maximum $t \in \mathbb{N}$
that satisfies $(C_0, C_1, \dots, C_{t-1}) \in LE$. We define $HT(\xi, LE) = 0$ if $C_0 \notin LE$. We denote
$E[HT(\Xi_P(C), LE)]$ by $EHT_P(C, LE)$. Intuitively, $EHT_P(C, LE)$ is the expected number
of interactions for which the population keeps the unique leader after protocol $P$ starts
from configuration $C$. For configuration sequence $\xi = C_0, C_1, \dots$ and a set of configurations
$\mathcal{C}$, we define convergence time $CT(\xi, \mathcal{C})$ as the minimum $t \in \mathbb{N}$ that satisfies $C_t \in \mathcal{C}$. We
define $CT(\xi, \mathcal{C}) = |\xi|$ if $C_t \notin \mathcal{C}$ for any $t \ge 0$, where $|\xi|$ is the length of $\xi$ (i.e. the number
of configurations). We denote $E[CT(\Xi_P(C), \mathcal{C})]$ by $ECT_P(C, \mathcal{C})$. Intuitively, $ECT_P(C, \mathcal{C})$ is
the expected number of interactions by which the population reaches a configuration in $\mathcal{C}$
when starting from $C$.
Definition 1 (Loose-stabilizing leader election [13]). Protocol $P(Q, Y, T, O)$ is an
$(\alpha, \beta)$-loosely-stabilizing leader election protocol if there exists a set $S$ of configurations satisfying
$\max_{C \in \mathcal{C}_{all}(P)} ECT_P(C, S) \le \alpha$ and $\min_{C \in S} EHT_P(C, LE) \ge \beta$.
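For a finite recorded execution, the holding time and the convergence time defined above can be computed directly. The sketch below assumes each configuration is given as a dict from agents to output symbols ("L" or "F"); the names `holding_time` and `convergence_time` are hypothetical and chosen only for this illustration.

```python
def holding_time(outputs_seq):
    """Largest t such that C_0, ..., C_{t-1} preserves one fixed unique leader."""
    leader = None
    for t, outputs in enumerate(outputs_seq):
        leaders = [v for v, out in outputs.items() if out == "L"]
        # The prefix leaves LE as soon as the leader is not unique or changes.
        if len(leaders) != 1 or (leader is not None and leaders[0] != leader):
            return t
        leader = leaders[0]
    return len(outputs_seq)


def convergence_time(seq, in_target):
    """Smallest t with C_t in the target set; the sequence length if it is never reached."""
    for t, config in enumerate(seq):
        if in_target(config):
            return t
    return len(seq)


def unique_leader(config):
    return sum(out == "L" for out in config.values()) == 1


seq = [
    {"a": "L", "b": "L"},   # two leaders
    {"a": "L", "b": "F"},
    {"a": "L", "b": "F"},
    {"a": "F", "b": "L"},   # the leader has moved to another agent
]
```

Note that a change of the leader's position ends the holding time even if exactly one leader exists at every step, in line with the definition of $LE$.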
Chernoff Bounds
Two variants of Chernoff bounds [11] used in several proofs of this paper are quoted below.
Lemma 2 (Eq. (4.2) in [11]). The following inequality holds for any binomial random
variable $X$ and any $\delta$, $0 < \delta \le 1$:
$$\Pr(X \ge (1 + \delta)E[X]) \le e^{-\delta^2 E[X]/3}.$$
Lemma 3 (Eq. (4.5) in [11]). The following inequality holds for any binomial random
variable $X$ and any $\delta$, $0 < \delta \le 1$:
$$\Pr(X \le (1 - \delta)E[X]) \le e^{-\delta^2 E[X]/2}.$$
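As a sanity check, the two bounds can be compared numerically with exact binomial tails. The parameters below ($n = 100$ trials, success probability $0.5$, $\delta = 0.2$) are arbitrary values assumed for this illustration, and the helper names are hypothetical.

```python
from math import comb, exp


def binom_pmf(n, p, k):
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(n, p, delta):
    """Exact Pr(X >= (1 + delta) E[X]) for X ~ Binomial(n, p)."""
    mu = n * p
    return sum(binom_pmf(n, p, k) for k in range(n + 1) if k >= (1 + delta) * mu)


def lower_tail(n, p, delta):
    """Exact Pr(X <= (1 - delta) E[X]) for X ~ Binomial(n, p)."""
    mu = n * p
    return sum(binom_pmf(n, p, k) for k in range(n + 1) if k <= (1 - delta) * mu)


n, p, delta = 100, 0.5, 0.2
mu = n * p
upper_bound = exp(-delta**2 * mu / 3)   # right-hand side of Lemma 2
lower_bound = exp(-delta**2 * mu / 2)   # right-hand side of Lemma 3
```

For these values both exact tails lie well below the corresponding bounds, which illustrates that the bounds are convenient rather than tight.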
3 Loosely-stabilizing Leader Election Protocol
This section presents loosely-stabilizing leader election protocol $P_{AR}$ for arbitrary undirected
anonymous graphs without identifiers or random numbers. Symmetry breaking is not a key
issue in electing a leader in the population protocol model since the random scheduler breaks the
symmetry of the population. (Global fairness breaks the symmetry in the case of a deterministic
scheduler.) The challenging issue is to reduce the number of leaders to one while avoiding the
removal of all leaders from the population. Protocol $P_{AR}$ solves this issue without identifiers
or random numbers by virus-propagation and token-circulation. A leader tries to kill other
leaders by creating and propagating a virus while a circulating token controls the frequency
of creating a virus so that eventually exactly one agent remains a leader (i.e. survives a
virus war).
Protocol $P_{AR}$ is described in Algorithm 1. A state of an agent is described by a collection
of variables, and a transition function is described by pseudocode that updates the variables
of initiator $x$ and responder $y$. We denote the value of variable var of agent $v \in V$ by v.var.
We also denote the value of var in state $q \in Q$ by q.var. In $P_{AR}$, each agent has three
binary variables leader $\in \{\top, \bot\}$, token $\in \{\top, \bot\}$ and color $\in$ {BLACK, WHITE}, and
four timers timerL, timerT, timerV and timerE. The output function defines leaders based
on variable leader: agent $v$ is a leader if v.leader $= \top$, and a follower otherwise. We say
that agent $v$ has a token if v.token $= \top$ and $v$ has a virus if v.timerV $> 0$. We also say that
$v$ is black if v.color = BLACK, and $v$ is white otherwise.
Protocol $P_{AR}$ consists of five parts: leader-creation (Lines 1-7), token-creation (Lines
8-14), token-circulation (Lines 15-20), virus-propagation (Lines 21-28), and virus-creation
(Lines 29-37). Our goal is to elect a unique leader in the population from an arbitrary
initial configuration. The leader-creation part creates a leader when no leader exists in the
population. The other four parts work together to reduce the number of leaders to one when
two or more leaders exist.
The leader-creation part aims to create a leader when no leader exists in the population.
Each agent uses timerL as the barometer for suspecting that there exists no leader. Specifically,
when initiator $x$ and responder $y$ interact, they take the larger value of x.timerL and y.timerL,
decrease it by one, and substitute the decreased value into x.timerL and y.timerL (Line 1).
We call this event larger value propagation. If $x$ or $y$ is a leader, both timers are reset to $t_{\max}$
(Lines 2-3). We call this event timer reset. When a timer becomes zero (i.e. timeout), agents
$x$ and $y$ suspect that there exists no leader in the population. Then, $x$ becomes a new leader
with the full timer value $t_{\max}$ (Lines 5-6). When no leader exists, the population never
experiences timer reset; thus, the timers keep on decreasing. Hence, the timeout eventually
occurs and a leader is created. When a leader exists, the timeout rarely happens since all
agents keep high timer values thanks to the timer reset and the larger value propagation.
Therefore, this mechanism rarely ruins the stability of the unique leader.

Algorithm 1 Leader Election $P_{AR}$
Variables of each agent:
    leader ∈ {⊤, ⊥}, token ∈ {⊤, ⊥}, color ∈ {BLACK, WHITE}
    timerL ∈ [0, t_max], timerT ∈ [0, t_max], timerV ∈ [0, t_virus], timerE ∈ [0, t_epi]
Output function O:
    if v.leader = ⊤ holds, then the output of agent v is L, otherwise F.
Interaction between initiator x and responder y:
 1: x.timerL ← y.timerL ← max(x.timerL - 1, y.timerL - 1, 0)
 2: if x.leader = ⊤ or y.leader = ⊤ then
 3:     x.timerL ← y.timerL ← t_max                // a leader resets leader timer
 4: else if x.timerL = 0 then
 5:     x.leader ← ⊤                               // a new leader is created at timeout
 6:     x.timerL ← y.timerL ← t_max
 7: end if
 8: x.timerT ← y.timerT ← max(x.timerT - 1, y.timerT - 1, 0)
 9: if x.token = ⊤ or y.token = ⊤ then
10:     x.timerT ← y.timerT ← t_max                // a token resets token timer
11: else if x.timerT = 0 then
12:     x.token ← ⊤                                // a new token is created at timeout
13:     x.timerT ← y.timerT ← t_max
14: end if
15: x.token ↔ y.token                              // a token moves between agents
16: x.timerE ← max(0, y.timerE - 1)                // decrement and swap epidemic timers
17: y.timerE ← max(0, x.timerE - 1)                // (Lines 16-17 use the values held before Line 16)
18: if x.token = ⊤ and y.token = ⊤ then
19:     y.token ← ⊥
20: end if
21: if x.timerV > 0 and y.timerV = 0 and x.color ≠ y.color then
22:     y.leader ← ⊥
23:     y.color ← x.color
24: else if x.timerV = 0 and y.timerV > 0 and x.color ≠ y.color then
25:     x.leader ← ⊥
26:     x.color ← y.color
27: end if
28: x.timerV ← y.timerV ← max(x.timerV - 1, y.timerV - 1, 0)
29: if x.leader = ⊤ and x.token = ⊤ and x.timerE = 0 then
30:     if x.color = BLACK then x.color ← WHITE else x.color ← BLACK endif
31:     x.timerV ← t_virus
32:     x.timerE ← t_epi
33: else if y.leader = ⊤ and y.token = ⊤ and y.timerE = 0 then
34:     if y.color = BLACK then y.color ← WHITE else y.color ← BLACK endif
35:     y.timerV ← t_virus
36:     y.timerE ← t_epi
37: end if
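The transition function of Algorithm 1 can be transcribed almost line by line into executable form. The following is a sketch under stated assumptions rather than a reference implementation: agent states are plain dicts, the helper `make_agent` and its default values are hypothetical, and the epidemic timers in Lines 16-17 are assumed to be swapped simultaneously, as the comment "decrement and swap" in the algorithm suggests.

```python
def make_agent(**overrides):
    """Hypothetical helper: an agent state as a dict, with arbitrary default values."""
    state = {"leader": False, "token": False, "color": "WHITE",
             "timerL": 5, "timerT": 5, "timerV": 0, "timerE": 5}
    state.update(overrides)
    return state


def interact(x, y, t_max, t_virus, t_epi):
    """Apply one interaction of Algorithm 1 to initiator x and responder y (in place)."""
    # Leader creation (Lines 1-7).
    x["timerL"] = y["timerL"] = max(x["timerL"] - 1, y["timerL"] - 1, 0)
    if x["leader"] or y["leader"]:
        x["timerL"] = y["timerL"] = t_max        # timer reset
    elif x["timerL"] == 0:
        x["leader"] = True                        # timeout: x becomes a leader
        x["timerL"] = y["timerL"] = t_max

    # Token creation (Lines 8-14), same mechanism with timerT.
    x["timerT"] = y["timerT"] = max(x["timerT"] - 1, y["timerT"] - 1, 0)
    if x["token"] or y["token"]:
        x["timerT"] = y["timerT"] = t_max
    elif x["timerT"] == 0:
        x["token"] = True
        x["timerT"] = y["timerT"] = t_max

    # Token circulation (Lines 15-20); timers are swapped simultaneously (assumption).
    x["token"], y["token"] = y["token"], x["token"]
    x["timerE"], y["timerE"] = max(0, y["timerE"] - 1), max(0, x["timerE"] - 1)
    if x["token"] and y["token"]:
        y["token"] = False

    # Virus propagation (Lines 21-28).
    if x["timerV"] > 0 and y["timerV"] == 0 and x["color"] != y["color"]:
        y["leader"] = False
        y["color"] = x["color"]
    elif x["timerV"] == 0 and y["timerV"] > 0 and x["color"] != y["color"]:
        x["leader"] = False
        x["color"] = y["color"]
    x["timerV"] = y["timerV"] = max(x["timerV"] - 1, y["timerV"] - 1, 0)

    # Virus creation (Lines 29-37): the initiator is checked first, as in the if/else-if.
    for agent in (x, y):
        if agent["leader"] and agent["token"] and agent["timerE"] == 0:
            agent["color"] = "WHITE" if agent["color"] == "BLACK" else "BLACK"
            agent["timerV"] = t_virus
            agent["timerE"] = t_epi
            break


# A leader meets a token whose epidemic timer is about to expire.
x, y = make_agent(leader=True), make_agent(token=True, timerE=1)
interact(x, y, t_max=8, t_virus=4, t_epi=20)
```

In this example the token moves to the leader with epidemic timer zero, so the leader flips its color, creates a virus with full TTL, and resets the epidemic timer of the token. The parameter values 8, 4 and 20 are toy values and do not satisfy the assumptions used in the analysis.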
Protocol $P_{AR}$ reduces the number of leaders to one as follows. The token-creation part
creates a token when no token exists in the population; the token-circulation part reduces the
number of tokens to one, circulates the unique token among the population, and decrements
the epidemic timer (timerE) of the unique token every time it moves; the virus-creation
part creates a new virus when a leader meets a token with epidemic timer of value zero; the
virus-propagation part propagates the virus to the whole population, which changes leader
agents to follower agents.
The token-creation part (Lines 8-14) creates a token in the same way as the
leader-creation part when no token exists in the population. There is no difference between the two
parts except that the former uses variable timerT while the latter uses timerL.
The token-circulation part (Lines 15-20) aims to reduce the number of tokens to one
and to circulate the unique token. A token moves between agents by interaction (Line 15).
We can say that a token makes a random walk among the population since the scheduler
randomly chooses two agents to interact at each time. Hence, two tokens eventually meet if
two or more tokens exist in the population. When two agents interact and both agents have
tokens, one of the two loses its token (Lines 18-20). Hence, the number of tokens
eventually becomes one. Each token has an epidemic timer (timerE). The epidemic timer is
decremented by one every time the token moves, and thus, it becomes zero eventually (Lines
16-17). Note that the number of tokens never becomes zero once a token exists since the
number of tokens decreases only when two tokens meet at an interaction.
The virus-creation part (Lines 29-37) creates a new virus when a leader meets a token with
an epidemic timer of value zero. We call this event virus creation. Specifically, if a token
with timerE = 0 moves to a leader agent, the leader changes its color from black to white or
from white to black (Lines 30 and 34) and creates a new virus with full TTL (Time To
Live), i.e. timerV $= t_{virus}$ (Lines 31 and 35). The leader also resets the epidemic timer of the
token (Lines 32 and 36), which enables periodic occurrence of epidemics.
The virus-propagation part (Lines 21-28) propagates a virus from agent to agent and
reduces the number of leaders. When an agent $v$ has a virus (i.e. v.timerV $> 0$), we regard
v.timerV as the TTL of the virus. A virus vanishes from the agent when its TTL becomes
zero. In the same way as timerL and timerT, a virus propagates at interaction in the larger
value propagation fashion (Line 28). Moreover, a virus has the power to change the colors
of agents and kill leaders. Specifically, if an agent with a virus interacts with an agent without
a virus, the virus changes the color of the newly infected agent (Lines 23 and 26). At this
time, if the newly infected agent is a leader, the virus kills the leader (i.e. changes the newly
infected agent from a leader to a follower). Once a new virus is created at the virus-creation
part, the virus propagates to the whole population within a short time. However, the value
of timerV is reset only when a new virus is created. Hence, viruses eventually vanish from
the population if the frequency of epidemics, controlled by the value $t_{epi}$, is sufficiently low.
The concept of colors helps to avoid the suicide of leaders, i.e. a leader is rarely killed by a
virus that it creates. Suppose that a white leader creates a virus. After that, the leader and
any agent infected with the virus are black; thus the leader is never killed by the virus until
another virus is created and the leader becomes white.
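The role of colors can be illustrated in isolation. The sketch below implements only the virus-propagation part (Lines 21-28) on dict-based agent states; the function name `propagate` and the concrete timer values are assumptions made for this illustration.

```python
def propagate(x, y):
    """Lines 21-28 only: infect, recolor and demote, then larger value propagation of the TTL."""
    for src, dst in ((x, y), (y, x)):
        if src["timerV"] > 0 and dst["timerV"] == 0 and src["color"] != dst["color"]:
            dst["leader"] = False          # a newly infected leader is killed
            dst["color"] = src["color"]
            break
    x["timerV"] = y["timerV"] = max(x["timerV"] - 1, y["timerV"] - 1, 0)


# A formerly white leader has just created a virus: it is now black with full TTL.
creator = {"leader": True, "color": "BLACK", "timerV": 4}
rival = {"leader": True, "color": "WHITE", "timerV": 0}

propagate(creator, rival)      # the rival is infected, recolored, and demoted
after_first = (rival["leader"], rival["color"], rival["timerV"])

creator["timerV"] = 0          # suppose the virus has already vanished at the creator
propagate(rival, creator)      # same color, so the creator is not killed by its own virus
```

The second interaction is the case the colors are designed for: the virus returns to its creator, but because both agents are black the demotion rule does not fire.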
Protocol $P_{AR}$ works correctly if $t_{\max}$ and $t_{virus}$ are sufficiently large and $t_{epi}$ is sufficiently
greater than $t_{virus}$. When there exists no leader, the leader-creation part eventually creates a
leader by timeout. In the following, let us consider the case that multiple leaders exist in the
population, and see how $P_{AR}$ reduces these leaders to one. The token-creation and the
token-circulation parts eventually create the unique token and circulate it in the population. Since
$t_{epi}$ is sufficiently greater than $t_{virus}$, the population eventually reaches a configuration where
no agent has a virus. After that, the epidemic timer of the token keeps on decreasing and
eventually becomes zero, and the token eventually moves to a leader in the population, which
creates a new virus. This virus soon propagates among the whole population and turns all
the agents into agents with the same color (black or white). Let the color be black without
loss of generality. Again, the virus vanishes, the epidemic timer of the token becomes zero,
and the token moves to a leader in the same way. Then, the black leader becomes white and
creates a new virus. It soon propagates to the whole population and changes all agents from
black to white, which kills all other leaders. Then, we have exactly one leader in the
population.
Even after we have exactly one leader and one token, the population sometimes enters
a wrong configuration where no leader exists, multiple leaders exist, or multiple tokens
exist. These deviations are caused by the following events: (i) leader timeout happens, (ii)
token timeout happens, or (iii) a new virus is created when viruses remain in the population.
Cases (i) and (ii) rarely happen thanks to the timer reset, the larger value propagation, and
the sufficiently large $t_{\max}$, which is the reset value of leader timers and token timers. Case
(iii) also rarely happens because $t_{epi}$, the reset value of the epidemic timer, is sufficiently
larger than the reset value of a virus timer $t_{virus}$. As we shall see later, the expected time
from a safe configuration to such a wrong configuration is exponential.
4 Complexity Analysis
This section analyzes the expected holding time and the expected convergence time of PAR.
Due to the lack of space, we present only proof sketches for the analyses of the expected
convergence time. Complete proofs are left to the full paper. Notations and assumptions
used in this paper are summarized in Table 2.
We have three parameters in $P_{AR}$: the reset values of timers $t_{\max}$, $t_{virus}$, and $t_{epi}$.
We mentioned that $P_{AR}$ works correctly if $t_{\max}$ and $t_{virus}$ are sufficiently large and $t_{epi}$ is
sufficiently greater than $t_{virus}$. Specifically, we assume $t_{\max} \ge 8\delta \max(d, \lceil 2 \log (mn^3 d) \rceil)$,
$t_{virus} = t_{\max}/2$, and $t_{epi} \ge 4\delta t_{\max} \lceil \log n \rceil$ where $\delta$ is the maximum degree of the agents and $d$
is the diameter of population $G$. (Note that $\delta$ is an even number because $G$ is undirected, i.e.
$(u, v) \in E \Rightarrow (v, u) \in E$.) We also assume that $t_{epi}$ is not extremely large: $t_{epi} \le \tau e^{\tau}/(9n)$
where $\tau = \lfloor t_{\max}/(8\delta) \rfloor$. Otherwise, even if a leader exists, the leader timeout happens with
non-negligible probability within an exponentially long epidemic interval. This means that
the protocol may not reduce the number of leaders to one at the convergence step. We also
assume $n \ge 3$ because $P_{AR}$ is obviously a self-stabilizing leader election protocol when $n = 2$.
In the rest of this section, we prove the following equations under these assumptions:
$$\max_{C \in \mathcal{C}_{all}} ECT_{P_{AR}}(C, S_{AR}) = O(mn^3 d + m\,t_{epi}), \tag{1}$$
$$\min_{C \in S_{AR}} EHT_{P_{AR}}(C, LE) = \Omega(\tau e^{\tau}), \tag{2}$$
where $S_{AR}$ is the set of configurations we define later. When upper bounds $N$ and $\Delta$ of $n$
and $\delta$ are available and we assign $t_{\max} = 8\Delta \max(N, \lceil 12 \log N \rceil)$ and $t_{epi} = 4\Delta t_{\max} \lceil \log N \rceil$, then
$P_{AR}$ is an $(O(mn^3 d + mN\Delta^2 \log N), \Omega(N e^N))$-loosely-stabilizing leader election protocol.
(Note that this assignment satisfies the above assumptions.)
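The parameter assignment above is easy to compute from the two upper bounds. The following sketch assumes that $\log$ denotes the natural logarithm, which the text does not state explicitly, and uses the hypothetical function name `par_parameters`.

```python
import math


def par_parameters(N, Delta):
    """Return (t_max, t_virus, t_epi) from upper bounds N (on n) and Delta (on delta).

    Assumption: log is the natural logarithm.
    """
    t_max = 8 * Delta * max(N, math.ceil(12 * math.log(N)))
    t_virus = t_max // 2                       # t_max is a multiple of 8, so this is exact
    t_epi = 4 * Delta * t_max * math.ceil(math.log(N))
    return t_max, t_virus, t_epi


t_max, t_virus, t_epi = par_parameters(N=10, Delta=4)
```

For small $N$ the term $\lceil 12 \log N \rceil$ dominates $N$ inside the maximum; for larger $N$ the linear term takes over and $t_{\max}$ grows as $8\Delta N$.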
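Table 2 Notations and assumptions.
Notations
    $\tau$ : $\lfloor t_{\max}/(8\delta) \rfloor$
    $\#L(C)$ : the number of leaders in configuration $C$
    $\#T(C)$ : the number of tokens in configuration $C$
    $L_{one}$, $T_{one}$, $L_{exist}$, $T_{exist}$, $L_{half}$, $T_{half}$, $V_{same}$, $V_{zero}$, $E_{half}$, $S_{AR}$ : sets of configurations (defined in the text)
    $PROP_L(i)$, $PROP_T(i)$, $HALF(i)$, $\#TI(v, t_1, t_2)$ : predicates and a counter (defined in the text)
Assumptions
    $n \ge 3$
    $t_{\max} \ge 8\delta \max(d, \lceil 2 \log (mn^3 d) \rceil)$
    $t_{virus} = t_{\max}/2$
    $4\delta t_{\max} \lceil \log n \rceil \le t_{epi} \le \tau e^{\tau}/(9n)$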
Before proving equations (1) and (2), we define ten sets of configurations:
$$L_{one} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \#L(C) = 1\},$$
$$T_{one} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \#T(C) = 1\},$$
$$L_{exist} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \#L(C) \ge 1\},$$
$$T_{exist} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \#T(C) \ge 1\},$$
$$L_{half} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \forall v \in V,\ C(v).timerL > t_{\max}/2\},$$
$$T_{half} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \forall v \in V,\ C(v).timerT > t_{\max}/2\},$$
$$V_{same} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \exists u \in V,\ \forall v \in V,\ C(u).leader = \top \wedge (C(v).timerV > 0 \Rightarrow C(u).color = C(v).color)\},$$
$$V_{zero} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \forall v \in V,\ C(v).timerV = 0\},$$
$$E_{half} = \{C \in \mathcal{C}_{all}(P_{AR}) \mid \forall v \in V,\ C(v).token = \top \Rightarrow C(v).timerE > t_{epi}/2\},$$
$$S_{AR} = L_{one} \cap T_{one} \cap L_{half} \cap T_{half} \cap V_{same} \cap (E_{half} \cup V_{zero}),$$
where $\#L(C)$ and $\#T(C)$ denote the number of leaders and tokens in configuration $C$,
respectively. Note that $V_{same}$ is the set of configurations where there exists a leader agent
such that every agent with a virus has the same color as the leader, and $E_{half}$ is the set of
configurations where every token has an epidemic timer whose value is greater than $t_{epi}/2$.
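Membership in the set $S_{AR}$ can be checked mechanically from these definitions. The following sketch assumes a configuration is a dict from agents to dict-based states with boolean leader and token fields; the names `in_safe_set` and `agent_state` are hypothetical, and the timer values in the example are toy values.

```python
def in_safe_set(config, t_max, t_epi):
    """Return True if the configuration belongs to S_AR as defined above."""
    agents = list(config.values())
    leaders = [a for a in agents if a["leader"]]
    tokens = [a for a in agents if a["token"]]
    l_one = len(leaders) == 1
    t_one = len(tokens) == 1
    l_half = all(a["timerL"] > t_max / 2 for a in agents)
    t_half = all(a["timerT"] > t_max / 2 for a in agents)
    # V_same: some leader has the same color as every agent that carries a virus.
    v_same = any(
        all(a["timerV"] == 0 or a["color"] == u["color"] for a in agents)
        for u in leaders
    )
    v_zero = all(a["timerV"] == 0 for a in agents)
    e_half = all(a["timerE"] > t_epi / 2 for a in tokens)
    return l_one and t_one and l_half and t_half and v_same and (e_half or v_zero)


def agent_state(leader, token, color, timerL, timerT, timerV, timerE):
    """Hypothetical helper that builds one agent state."""
    return {"leader": leader, "token": token, "color": color,
            "timerL": timerL, "timerT": timerT, "timerV": timerV, "timerE": timerE}


config = {
    0: agent_state(True, True, "BLACK", 8, 8, 0, 3),
    1: agent_state(False, False, "WHITE", 7, 7, 0, 0),
    2: agent_state(False, False, "BLACK", 6, 6, 0, 0),
}
safe_without_virus = in_safe_set(config, t_max=8, t_epi=20)

config[1]["timerV"] = 2                  # a virus of the wrong color appears
safe_with_foreign_virus = in_safe_set(config, t_max=8, t_epi=20)
```

The first configuration is safe through $V_{zero}$ even though the epidemic timer of the token is low; the second fails $V_{same}$ because a virus-carrying agent differs in color from the unique leader.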
First, we analyze the expected holding time. Let $C_0 \in S_{AR}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \dots$.
To prove (2), it is sufficient to show that both (i) $C_0, \dots, C_{8m\delta\tau\lceil \log n \rceil} \in LE$ and (ii) $C_{8m\delta\tau\lceil \log n \rceil}
\in S_{AR}$ hold with probability no less than $p_{suc} = 1 - O(n\delta \log n \cdot e^{-\tau})$. Then, letting
$A = \min_{C_0 \in S_{AR}} EHT_{P_{AR}}(C_0, LE)$, we have $A \ge 8m\delta\tau \lceil \log n \rceil p_{suc}/(1 - p_{suc}) = \Omega(\tau e^{\tau})$, since
$A \ge p_{suc}(8m\delta\tau \lceil \log n \rceil + A)$. We give five conditions such that satisfying all of them
leads to the above conditions (i) and (ii) (Lemma 10). After that, we analyze the probability that
all the five conditions hold and prove that the probability is no less than $1 - O(n\delta \log n \cdot e^{-\tau})$.
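The step from the recurrence to the bound on $A$ can be spelled out as follows. This is a sketch of the arithmetic only; the abbreviations $T$ and $p$ are introduced here for readability, and the last two equalities additionally assume that $p_{suc}$ is bounded below by a positive constant and use $m \ge n - 1$, which holds for any weakly-connected graph.

```latex
% Abbreviations (introduced for this sketch):
%   T := 8 m \delta \tau \lceil \log n \rceil,   p := p_{suc} < 1.
\begin{align*}
A &\ge p\,(T + A) \\
(1 - p)\,A &\ge p\,T \\
A &\ge \frac{p\,T}{1 - p}
   = \frac{p \cdot 8 m \delta \tau \lceil \log n \rceil}{O(n \delta \log n \cdot e^{-\tau})}
   = \Omega\!\left(\frac{m}{n}\,\tau e^{\tau}\right)
   = \Omega(\tau e^{\tau}).
\end{align*}
```

The first inequality reflects that, with probability at least $p_{suc}$, the unique leader is kept for $T$ interactions and the population is again in $S_{AR}$, from which the expected holding time is at least $A$ by the choice of $A$ as a minimum.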
We define three predicates $PROP_L(i)$, $PROP_T(i)$ and $HALF(i)$ for any $i \ge 0$: $PROP_L(i) =
1$ if $C_{2m\tau(i+1)} \in L_{half}$ or $C_{2m\tau i} \notin L_{exist}$, otherwise $PROP_L(i) = 0$; $PROP_T(i) = 1$ if
$C_{2m\tau(i+1)} \in T_{half}$ or $C_{2m\tau i} \notin T_{exist}$, otherwise $PROP_T(i) = 0$; $HALF(i) = 1$ if every agent
joins fewer than $t_{\max}/2$ interactions among $\Gamma_{2m\tau i}, \dots, \Gamma_{2m\tau(i+1)-1}$, otherwise $HALF(i) = 0$.
Intuitively, $PROP_L(i) = 1$ ($PROP_T(i) = 1$) means that a high value of timerL (timerT)
propagates from a leader (a token, respectively) to all the agents during $2m\tau$ interactions, and
$HALF(i) = 1$ means that no agent interacts too often during $2m\tau$ interactions. Note that
$PROP_L(i) = 1$ ($PROP_T(i) = 1$) unconditionally holds when there exists no leader (token,
respectively) in $C_{2m\tau i}$. In addition, we define binary random variables $TO_L(C_0, t_1, t_2)$ and
$TO_T(C_0, t_1, t_2)$ for integers $t_1$ and $t_2$ ($0 \le t_1 \le t_2$) as follows: $TO_L(C_0, t_1, t_2) = 1$ if there
exists integer $i$ ($t_1 \le i < t_2$) satisfying $\#L(C_i) < \#L(C_{i+1})$, otherwise $TO_L(C_0, t_1, t_2) = 0$;
$TO_T(C_0, t_1, t_2) = 1$ if there exists integer $i$ ($t_1 \le i < t_2$) satisfying $\#T(C_i) < \#T(C_{i+1})$,
otherwise $TO_T(C_0, t_1, t_2) = 0$. Intuitively, variable $TO_L(C_0, t_1, t_2)$ (variable $TO_T(C_0, t_1, t_2)$)
represents whether an interaction among $\Gamma_{t_1}, \dots, \Gamma_{t_2 - 1}$ triggers the leader timeout (the token
timeout, respectively) or not.
Lemma 4. Let $C_0 \in L_{half} \cap L_{exist}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \dots$. We have $C_{2m\tau} \in L_{half}$
and $TO_L(C_0, 0, 2m\tau) = 0$ if $PROP_L(0) = HALF(0) = 1$.
Proof. Since there exists a leader in $C_0$, $PROP_L(0) = 1$ assures $C_{2m\tau} \in L_{half}$.
Assumptions $C_0 \in L_{half}$ and $HALF(0) = 1$ assure that the leader timeout does not happen during
$\Gamma_0, \dots, \Gamma_{2m\tau - 1}$. ∎
Corollary 5. Let $C_0 \in L_{half}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \dots$. Let $k \ge 1$ be any integer.
We have $C_{2m\tau k} \in L_{half}$ and $TO_L(C_0, 0, 2m\tau k) = 0$ if $PROP_L(i) = HALF(i) = 1$ and
$C_{2m\tau i} \in L_{exist}$ hold for all $i = 0, 1, \dots, k - 1$.
Once a token exists in the population, the number of tokens never becomes zero after that.
Hence, we have a simpler lemma for the token timeout.
Lemma 6. Let $C_0 \in T_{half} \cap T_{exist}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \dots$. Let $k \ge 1$ be any integer.
We have $C_{2m\tau k} \in T_{half} \cap T_{exist}$ and $TO_T(C_0, 0, 2m\tau k) = 0$ if $PROP_T(i) = HALF(i) = 1$
holds for all $i = 0, 1, \dots, k - 1$.
For agent $v \in V$ and integers $t_1$ and $t_2$ ($0 \le t_1 < t_2$), we define $\#TI(v, t_1, t_2) = |\{t \in
[t_1 + 1, t_2] \mid v_t \ne v_{t-1}\}|$ where $v_{t_1} = v$, and
$$v_t = \begin{cases} u & \text{if } \Gamma_{t-1} = (u, v_{t-1}) \\ w & \text{if } \Gamma_{t-1} = (v_{t-1}, w) \\ v_{t-1} & \text{otherwise} \end{cases}$$
for $t > t_1$. Random variable $\#TI(v, t_1, t_2)$ has an intuitive meaning if $v$ has a token when
interaction $\Gamma_{t_1}$ occurs: $\#TI(v, t_1, t_2)$ represents the number of interactions that
the token is involved in during $\Gamma_{t_1}, \dots, \Gamma_{t_2 - 1}$ (or the number of times the token moves during the
period).
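The counter $\#TI$ can be evaluated on a finite schedule by following the position $v_t$ step by step. The sketch below assumes the schedule is a list of directed edges indexed from 0 and uses the hypothetical function name `token_interactions`.

```python
def token_interactions(schedule, v, t1, t2):
    """Count |{t in [t1 + 1, t2] : v_t != v_{t-1}}| with v_{t1} = v.

    schedule[t] is the interaction Gamma_t = (initiator, responder).
    """
    current, moves = v, 0
    for t in range(t1 + 1, t2 + 1):
        initiator, responder = schedule[t - 1]
        if responder == current:
            nxt = initiator            # Gamma_{t-1} = (u, v_{t-1})
        elif initiator == current:
            nxt = responder            # Gamma_{t-1} = (v_{t-1}, w)
        else:
            nxt = current              # the tracked agent is not involved
        if nxt != current:
            moves += 1
        current = nxt
    return moves


schedule = [(0, 1), (2, 3), (1, 2), (0, 1)]
moves_from_0 = token_interactions(schedule, v=0, t1=0, t2=4)
```

In this schedule the tracked position moves from agent 0 to agent 1 and then to agent 2, and the last interaction does not involve agent 2, so two moves are counted.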
Lemma 7. Let $C_0 \in S_{AR}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \dots$. Let $v_T$ be the agent that has
the unique token in configuration $C_0$, and $t \ge 0$ be a nonnegative integer. Then, we have
$C_i \in V_{same}$ for all $i = 0, 1, \dots, t$ if we have $\#TI(v_T, 0, t) < t_{epi}/2$ and $C_i \in T_{one}$ for all
$i = 0, 1, \dots, t$.
Proof. Let $v_L$ be the unique leader in configuration $C_0$, and we assume that $v_L$
and all agents with viruses are black without loss of generality (note that $C_0 \in V_{same}$).
Since $C_0 \in E_{half} \cup V_{zero}$, we prove the lemma for two cases $C_0 \in E_{half}$ and $C_0 \in V_{zero}$. In
case $C_0 \in E_{half}$, the epidemic timer of the unique token never becomes zero in $C_0, \dots, C_t$
because $\#TI(v_T, 0, t) < t_{epi}/2$. Therefore, a new virus is not created during $C_0, \dots, C_t$,
which assures that $v_L$ and all agents with viruses are still black in $C_0, \dots, C_t$. Thus, we have
$C_i \in V_{same}$ for all $i = 1, 2, \dots, t$. In case $C_0 \in V_{zero}$, the virus creation happens at most once
during $C_0, \dots, C_t$ because $\#TI(v_T, 0, t) < t_{epi}/2$ and $C_i \in T_{one}$ for all $i = 0, 1, \dots, t$. If the
virus creation does not happen, $C_i \in V_{zero} \cap L_{exist} \subseteq V_{same}$ holds for all $i = 0, 1, \dots, t$. If a
leader meets a token with an epidemic timer of value zero and creates a new virus, the virus
propagates from agent to agent. However, the virus makes all infected agents the same color
as the leader that creates the virus, which assures $C_i \in V_{same}$ for all $i = 0, 1, \dots, t$. ∎
The following lemma is directly obtained from Corollary 5 and Lemma 7.
Lemma 8. Let $C_0 \in \mathcal{S}_{AR}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \ldots$. Let $v_T$ be the agent that has
the unique token in configuration $C_0$, and $k \ge 0$ be any integer. Then, we have $C_{2m\tau k} \in
\mathcal{L}_{half} \cap \mathcal{V}_{same}$ and $C_i \in \mathcal{L}_{one}$ for all $i = 0, 1, \ldots, 2m\tau k$ if we have $\mathit{PROP}_L(j) = \mathit{HALF}(j) = 1$
for all $j = 0, 1, \ldots, k-1$, $\#_{TI}(v_T, 0, 2m\tau k) < t_{epi}/2$, and $C_i \in \mathcal{T}_{one}$ for all $i = 0, 1, \ldots, 2m\tau k$.
We define the first round time $RT_\Gamma(1)$ as the minimum $t$ satisfying $\forall e \in E,\ \exists t'\ (0 \le t' \le t),\ \Gamma_{t'} = e$.
For any $i \ge 2$, we define the $i$th round time $RT_\Gamma(i)$ as the minimum $t$ satisfying
$\forall e \in E,\ \exists t'\ (RT_\Gamma(i-1) < t' \le t),\ \Gamma_{t'} = e$.
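As an illustration of this definition, the round times of a finite scheduler prefix can be computed in one pass. The sketch below is not from the paper; the name `round_times` and the list-of-pairs representation of the interactions are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Interaction = Tuple[int, int]


def round_times(gamma: Sequence[Interaction],
                edges: Sequence[Interaction],
                rounds: int) -> List[int]:
    """Return the round times that complete within gamma (at most `rounds` of them).

    A round ends at the first index t by which every element of `edges` has
    occurred since the end of the previous round.
    """
    target = set(edges)
    seen = set()
    times: List[int] = []
    for t, e in enumerate(gamma):
        if len(times) == rounds:
            break
        if e in target:
            seen.add(e)
        if seen == target:
            times.append(t)   # the current round ends at index t
            seen = set()      # the next round starts at index t + 1
    return times


# Hypothetical two-agent population with both orientations of its single edge.
example_edges = [(0, 1), (1, 0)]
example_gamma = [(0, 1), (0, 1), (1, 0), (1, 0), (0, 1)]
example_rounds = round_times(example_gamma, example_edges, 2)
```

Here the first round ends at index 2 and the second at index 4, because each round needs both orientations of the edge to occur after the previous round has ended.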
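Lemma 9. Let $C_0 \in \mathcal{S}_{AR}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \ldots$. Let $v_T$ be the agent that has
the unique token in configuration $C_0$, and $t \ge 0$ be any integer. We have
$C_t \in \mathcal{E}_{half} \cup \mathcal{V}_{zero}$ if we have $RT_\Gamma(t_{virus}) < t$, $\#_{TI}(v_T, 0, t) < t_{epi}/2$, and $\#_T(C_i) = 1$ for all
$i = 0, 1, \ldots, t$.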
Proof. If no new virus is created among $\Gamma_0, \ldots, \Gamma_t$, then all viruses in the initial
configuration vanish during the period, since each round decreases the maximum value of $\mathtt{timer}_V$
by at least one. Thus, $C_t \in \mathcal{V}_{zero}$ holds. If some agent $v$ creates a new virus at $\Gamma_{t'}$, then
the epidemic timer of the unique token is reset at the same time. (Note that the unique
token always exists in the population by the assumption of the lemma.) Thus, we have
$C_{t'}(v).\mathtt{timer}_E = t_{epi}$. Since $\#_{TI}(v, t', t) \le \#_{TI}(v_T, 0, t) < t_{epi}/2$, the epidemic timer of the
unique token is no less than $t_{epi} - t_{epi}/2 = t_{epi}/2$, which means $C_t \in \mathcal{E}_{half}$. $\square$
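Lemma 10. Let $C_0 \in \mathcal{S}_{AR}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \ldots$. Let $v_T$ be the agent that has
the unique token in configuration $C_0$. Then, we have both $C_0, \ldots, C_{8m\delta\tau\lceil\log n\rceil} \in \mathit{LE}$ and
$C_{8m\delta\tau\lceil\log n\rceil} \in \mathcal{S}_{AR}$ if the following conditions hold:
(A) $\#_{TI}(v_T, 0, 8m\delta\tau\lceil\log n\rceil) < t_{epi}/2$,
(B) $\mathit{PROP}_L(i) = 1$ for all $i = 0, 1, \ldots, 4\delta\lceil\log n\rceil - 1$,
(C) $\mathit{PROP}_T(i) = 1$ for all $i = 0, 1, \ldots, 4\delta\lceil\log n\rceil - 1$,
(D) $\mathit{HALF}(i) = 1$ for all $i = 0, 1, \ldots, 4\delta\lceil\log n\rceil - 1$, and
(E) $RT_\Gamma(t_{virus}) < 8m\delta\tau\lceil\log n\rceil$.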
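Proof. Assigning $k = 4\delta\lceil\log n\rceil$, we obtain $C_{8m\delta\tau\lceil\log n\rceil} \in \mathcal{T}_{half}$ and $C_j \in \mathcal{T}_{one}$ for all $j =
0, 1, \ldots, 8m\delta\tau\lceil\log n\rceil$ by Lemma 6 and Conditions (C) and (D). From Lemma 8 and Conditions
(A), (B), and (D), the unique token assures that $C_{8m\delta\tau\lceil\log n\rceil} \in \mathcal{L}_{half} \cap \mathcal{V}_{same}$ and $C_j \in
\mathcal{L}_{one}$ hold for $j = 0, 1, \ldots, 8m\delta\tau\lceil\log n\rceil$. Note that $C_j \in \mathcal{L}_{one}$ ($j = 0, 1, \ldots, 8m\delta\tau\lceil\log n\rceil$)
guarantees not only that the number of leaders is one, but also that the unique leader is stable
(i.e. $\exists v \in V, \forall i \in [0, 8m\delta\tau\lceil\log n\rceil],\ C_i(v).\mathtt{leader} = \top$), because $P_{AR}$ never moves the leader
role from one agent to another in an interaction. Hence, we have $C_0, \ldots, C_{8m\delta\tau\lceil\log n\rceil} \in \mathit{LE}$.
We have $C_{8m\delta\tau\lceil\log n\rceil} \in \mathcal{E}_{half} \cup \mathcal{V}_{zero}$ from Lemma 9, Condition (A), Condition (E), and
$C_j \in \mathcal{T}_{one}$ for all $j = 0, 1, \ldots, 8m\delta\tau\lceil\log n\rceil$. Thus, we have shown that $C_{8m\delta\tau\lceil\log n\rceil} \in
\mathcal{L}_{one} \cap \mathcal{T}_{one} \cap \mathcal{L}_{half} \cap \mathcal{T}_{half} \cap \mathcal{V}_{same} \cap (\mathcal{E}_{half} \cup \mathcal{V}_{zero}) \subseteq \mathcal{S}_{AR}$. $\square$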
Lemma 11. Let $C_0 \in \mathcal{T}_{one}$ and $\Xi_{P_{AR}}(C_0) = C_0, C_1, \ldots$. Let $v_T$ be the agent that has the
unique token in configuration $C_0$. Then, we have $\Pr(\#_{TI}(v_T, 0, 8m\delta\tau\lceil\log n\rceil) < t_{epi}/2) \ge
1 - e^{-\delta\tau}$.
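Proof. For every $i \ge 0$, the token joins interaction $\Gamma_i$ with probability at most $\delta/m$
regardless of the location of the token in $C_i$, because any agent has at most $\delta$ edges. Thus,
$\#_{TI}(v_T, 0, 8m\delta\tau\lceil\log n\rceil)$ is bounded by a binomial random variable $X \sim B(8m\delta\tau\lceil\log n\rceil, \delta/m)$.
We have
\begin{align*}
\Pr(X \ge t_{epi}/2) &\le \Pr(X \ge 16\delta^2\tau\lceil\log n\rceil) && (\text{since } t_{epi} \ge 32\delta^2\tau\lceil\log n\rceil) \\
&= \Pr(X \ge 2E[X]) \\
&\le e^{-\delta\tau},
\end{align*}
which gives the lemma. $\square$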
Lemma 12. $\Pr(\mathit{PROP}_L(i) = 1) \ge 1 - 2ne^{-\tau}$ for any $i \ge 0$.
Proof. We assume $i = 0$ without loss of generality, and prove $\Pr(\mathit{PROP}_L(0) = 1) \ge 1 - 2ne^{-\tau}$.
We have $\mathit{PROP}_L(0) = 1$ by the definition of $\mathit{PROP}_L$ if no leader exists in $C_0$. Thus, it
suffices to show $\Pr(C_{2m\tau}(v).\mathtt{timer}_L > t_{max}/2) \ge 1 - 2e^{-\tau}$ for any agent $v \in V$ in case
$C_0 \in \mathcal{L}_{exist}$. Let $v_L$ be a leader agent in $C_0$. We denote the shortest path from $v_L$ to $v$ by
$(v_0, v_1, \ldots, v_s)$ where $v_0 = v_L$, $v_s = v$, $0 \le s \le d$ and $(v_{j-1}, v_j) \in E$ for all $j = 1, 2, \ldots, s$.
For any $t = 0, 1, \ldots, 2m\tau$, we define $v_{head}(t)$ as $v_h$ with the maximum $h \in [1, s]$ such that there
exist $t_1, t_2, \ldots, t_h$ satisfying $0 \le t_1 < t_2 < \cdots < t_h < t$ and $\Gamma_{t_j} \in \{(v_{j-1}, v_j), (v_j, v_{j-1})\}$ for
$j = 1, 2, \ldots, h$. We define $v_{head}(t) = v_0$ if such $h$ does not exist. Intuitively, $v_{head}(t)$ is the
head of the agents in path $(v_0, v_1, \ldots, v_s)$ to which a large value of $\mathtt{timer}_L$ has been propagated
from $v_L$ towards $v$. (Recall that $v_L$ resets $\mathtt{timer}_L$ to $t_{max}$.) We define $J(t)$ as the number of
integers $j \in [0, 2m\tau - 1]$ such that $v_{head}(j)$ joins interaction $\Gamma_j$. Intuitively, $J(t)$ is the
number of interactions that the head agent joins among $\Gamma_0, \ldots, \Gamma_{2m\tau-1}$. Obviously, we have
$C_t(v_{head}(t)).\mathtt{timer}_L \ge t_{max} - J(t)$ for any $t = 0, 1, \ldots, 2m\tau$.
In what follows, we prove $\Pr(v_{head}(2m\tau) = v) \ge 1 - e^{-\tau}$ and $\Pr(J(2m\tau) < t_{max}/2) \ge
1 - e^{-\tau}$, which give $\Pr(C_{2m\tau}(v).\mathtt{timer}_L > t_{max}/2) \ge 1 - 2e^{-\tau}$. For any $j = 1, \ldots, s$, the pair
$v_{j-1}$ and $v_j$ interacts with probability $2/m$ at each interaction. Hence, each
interaction moves $v_{head}$ forward with probability $2/m$. Therefore, by letting $Z$ be a binomial
random variable such that $Z \sim B(2m\tau, 2/m)$, we have
\[
\Pr(v_{head}(2m\tau) = v) = 1 - \Pr(Z < s) \ge 1 - e^{-\tau}.
\]
The probability that $v_{head}(t)$ joins interaction $\Gamma_t$ is at most $\delta/m$ regardless of $t$. Hence, by
letting $Z'$ be a binomial random variable such that $Z' \sim B(2m\tau, \delta/m)$, we have
\[
\Pr(J(2m\tau) < t_{max}/2) > 1 - \Pr(Z' \ge t_{max}/2) \ge 1 - e^{-\tau}.
\]
Thus, we have shown $\Pr(C_{2m\tau}(v).\mathtt{timer}_L > t_{max}/2) \ge 1 - 2e^{-\tau}$. $\square$
Lemma 13. $\Pr(\mathit{PROP}_T(i) = 1) \ge 1 - 2ne^{-\tau}$ for any $i \ge 0$.
Proof. The same argument as in the proof of Lemma 12 gives the lemma. $\square$
Lemma 14 (in [14]). The probability that every $v \in V$ takes part in fewer than $t_{max}/2$
interactions during $2m\tau$ interactions is at least $1 - ne^{-\tau}$.
Proof. For any $v \in V$ and $i \ge 0$, $v$ joins interaction $\Gamma_i$ with probability at most $\delta/m$. Thus,
the number of interactions $v$ joins during the $2m\tau$ interactions is bounded by a binomial
random variable $X \sim B(2m\tau, \delta/m)$. Applying the Chernoff bound of Lemma 2 with its parameter set to 1, we
have
\begin{align*}
\Pr(X \ge t_{max}/2) &\le \Pr(X \ge 2E[X]) && (\text{since } t_{max} \ge 8\delta\tau) \\
&\le e^{-\tau}.
\end{align*}
Summing up the failure probabilities over all $v \in V$ gives the lemma. $\square$
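The per-agent bound can be sanity-checked numerically on a small instance. The sketch below computes the exact upper tail of $B(2m\tau, \delta/m)$ and compares it with $e^{-\tau}$ and with the standard multiplicative Chernoff bound $e^{-E[X]/3}$ from Mitzenmacher and Upfal. Whether Lemma 2 is stated in exactly that form is an assumption, and the values of `m`, `delta` and `tau` are hypothetical, chosen only to keep the sums small.

```python
import math


def binomial_upper_tail(trials: int, p: float, threshold: int) -> float:
    """Exact Pr(X >= threshold) for X ~ B(trials, p)."""
    return sum(
        math.comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(threshold, trials + 1)
    )


# Hypothetical small instance (not taken from the paper).
m, delta, tau = 20, 3, 10
trials = 2 * m * tau            # number of interactions considered, 2*m*tau
p = delta / m                   # upper bound on the participation probability
expectation = 2 * delta * tau   # E[X] = trials * p
threshold = 4 * delta * tau     # 2 * E[X], i.e. t_max / 2 when t_max = 8*delta*tau

tail = binomial_upper_tail(trials, p, threshold)
chernoff = math.exp(-expectation / 3)   # assumed form of the Chernoff bound
target = math.exp(-tau)                 # per-agent bound used in Lemma 14
```

For these values the exact tail lies below the Chernoff estimate, which in turn lies below $e^{-\tau}$. Multiplying the per-agent bound by $n$ gives the $ne^{-\tau}$ term of the lemma.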
Lemma 15 (in [14]). $\Pr(\mathit{HALF}(i) = 1) \ge 1 - ne^{-\tau}$ for any $i \ge 0$.
Proof. Each interaction is independent. Thus, Lemma 14 gives the lemma. $\square$
Lemma 16 (in [14]). $\Pr(RT_\Gamma(i) < im(1 + \lceil\log n\rceil)) \ge 1 - ne^{-i/4}$ holds for any $i \ge 1$.
Proof. The proof in [14] can be used with slight modification. $\square$
Lemma 17. $\Pr(RT_\Gamma(t_{virus}) < 8m\delta\tau\lceil\log n\rceil) \ge 1 - ne^{-\delta(\tau+1)}$ holds.
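Proof. By Lemma 16, we have
\begin{align*}
\Pr(RT_\Gamma(t_{virus}) < 8m\delta\tau\lceil\log n\rceil) &\ge \Pr(RT_\Gamma(4\delta(1+\tau)) < 8m\delta\tau\lceil\log n\rceil) \\
&\ge \Pr(RT_\Gamma(4\delta(1+\tau)) < 4m\delta(1+\tau)(1+\lceil\log n\rceil)) \\
&\ge 1 - ne^{-\delta(\tau+1)},
\end{align*}
where we use $t_{virus} \le 4\delta(1+\tau)$ for the first inequality, and $(1+\tau)(1+\lceil\log n\rceil) \le 2\tau\lceil\log n\rceil$
when $\tau \ge 3$ and $n \ge 3$ for the second inequality. (Note that $\tau \ge \lceil 2\log mn^3d\rceil \ge 10$.) $\square$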
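The arithmetic behind the second inequality is short enough to spell out. In the derivation below, $L$ is a shorthand for $\lceil\log n\rceil$ introduced here only for readability. The one assumption is $L \ge 2$, which holds for $n \ge 3$ with either the binary or the natural logarithm.

```latex
\begin{align*}
(1+\tau)(1+L) \le 2\tau L
  &\iff 1 + \tau + L + \tau L \le 2\tau L \\
  &\iff \tau L - \tau - L + 1 \ge 2 \\
  &\iff (\tau - 1)(L - 1) \ge 2 .
\end{align*}
% For \tau \ge 3 and L \ge 2 we get (\tau - 1)(L - 1) \ge 2 \cdot 1 = 2, so the inequality holds.
```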
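Lemma 18. $\min_{C \in \mathcal{S}_{AR}} \mathit{EHT}_{P_{AR}}(C, \mathit{LE}) = \Omega(\tau e^{\tau})$.
Proof. Probability $p_{suc}$, discussed in the beginning of this section, is at least $1 - e^{-\delta\tau} -
4\delta\lceil\log n\rceil(2ne^{-\tau} + 2ne^{-\tau} + ne^{-\tau}) - ne^{-\delta(\tau+1)} \ge 1 - 22n\delta\lceil\log n\rceil e^{-\tau}$ by Lemmas 10, 11, 12,
13, 15, and 17, which leads to the lemma. $\square$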
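One way to see where the constant 22 comes from is to bound each failure term by a multiple of $n\delta\lceil\log n\rceil e^{-\tau}$. The steps below are a reading of the bound rather than the authors' own derivation, and they assume $n \ge 1$, $\delta \ge 1$ and $\lceil\log n\rceil \ge 1$.

```latex
\begin{align*}
4\delta\lceil\log n\rceil\,(2ne^{-\tau} + 2ne^{-\tau} + ne^{-\tau})
  &= 20\,n\delta\lceil\log n\rceil e^{-\tau}, \\
e^{-\delta\tau} \le e^{-\tau}
  &\le n\delta\lceil\log n\rceil e^{-\tau}, \\
ne^{-\delta(\tau+1)} \le ne^{-\tau}
  &\le n\delta\lceil\log n\rceil e^{-\tau}, \\
\text{so}\quad 1 - p_{suc}
  &\le (20 + 1 + 1)\,n\delta\lceil\log n\rceil e^{-\tau}
   = 22\,n\delta\lceil\log n\rceil e^{-\tau}.
\end{align*}
```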
Next, we analyze the expected convergence time.
Lemma 19. $\max_{C \in \mathcal{C}_{all}} \mathit{ECT}_{P_{AR}}(C, \mathcal{S}_{AR}) = O(mt_{epi} + mn^3d)$.
Proof Sketch. In an execution of $P_{AR}$, the population converges to $\mathcal{S}_{AR}$ starting from any
configuration through the following convergence steps: (i) a token is created if no
token exists, (ii) the number of tokens becomes one, i.e. the unique token is elected, (iii) all
viruses vanish from the population, (iv) the epidemic timer of the unique token becomes zero,
(v) the unique token meets a leader and a new virus is created, (vi) the newly created virus
propagates to the whole population and gives all agents the same color
(let the color be black without loss of generality), (vii) the epidemic timer of the unique
token becomes zero, (viii) the unique token meets a leader and a new virus is created, (ix) the
newly created virus propagates to the whole population and makes all agents white, which
kills all leaders other than the leader that creates the virus, and the population enters $\mathcal{S}_{AR}$.
Steps (ii), (iv) and (vii) require the dominant number of interactions. We will prove in Lemma 20 that
the expected number of interactions until two tokens meet is $O(mn^2d)$. The
number of tokens is at most $n$, and the token timeout, which is the only event that increases
the number of tokens, rarely happens once a token exists. Hence, the expected number of
interactions that Step (ii) requires is $O(mn^3d)$. The expected number of interactions that Steps (iv) and
(vii) require is $O(mt_{epi})$ because the epidemic timer decreases by one each time the token joins an
interaction, and the unique token joins each interaction $\Gamma_t$ with probability at least $2/m$. $\square$
Lemma 20. Let $C_0$ be a configuration where two or more tokens exist. In execution
$\Xi_{P_{AR}}(C_0)$, the expected number of interactions until two tokens meet is at most $mn^2d/2$.
Proof. Let $u, v \in V$ be two distinct agents both of which have tokens in $C_0$. We analyze
the expected number of interactions until the two tokens meet. (One of the two tokens
may vanish by meeting another token; however, this only reduces the expected number of
interactions until any two tokens meet.) Consider the pair of random walks by the two tokens
on population $G$, i.e. a Markov chain $(u_t, v_t)$ whose states are pairs of
agents in $G$. We write $(a, b) \sim (c, d)$ for agents $a, b, c, d \in V$ if $(a, c) \in E \land b = d$, or $(b, d) \in
E \land a = c$, or $(a, b) \in E \land a = d \land b = c$. For any two states $x$ and $y$, the transition probability
$P_{x,y}$ of the chain is given by $P_{x,y} = 2/m$ if $x \sim y$, $P_{x,y} = 1 - (2/m)|\{z \mid x \sim z\}|$ if $x = y$, and
$P_{x,y} = 0$ otherwise. The symmetric structure of the chain ($P_{x,y} = P_{y,x}$) gives $\sum_x P_{x,y} = 1$
for every state $y$. Thus, $\pi = (\pi(x_1), \pi(x_2), \ldots, \pi(x_{n(n-1)})) = \{n(n-1)\}^{-1}(1, 1, \ldots, 1)$ is the
stationary distribution of the chain ($\pi P = \pi$), where $x_1, x_2, \ldots, x_{n(n-1)}$ are all the states of
the chain (i.e. all pairs of token locations). We denote the expected number of transition steps
from state $x$ to state $y$ by $h_{x,y}$. We have $h_{y,y} = 1/\pi(y) = n(n-1)$ for any state $y$. We also
have $h_{y,y} = 1 + \sum_{y \sim z}(2/m) \cdot h_{z,y}$. Hence, we obtain $\sum_{y \sim z} h_{z,y} = n(n-1)m/2 - m/2$. Thus,
we have $h_{x,y} \le mn^2/2$ for any states $x$ and $y$ satisfying $x \sim y$. Let $w_0, w_1, \ldots, w_l$ ($w_0 =
u$, $w_l = v$, $l \le d$) be the shortest path from $u$ to $v$. The expected time until the two tokens
meet is bounded by
\[
\sum_{i=0}^{l-2} h_{(w_i, w_l),(w_{i+1}, w_l)} + h_{(w_{l-1}, w_l),(w_l, w_{l-1})} \le mn^2d/2. \qquad \square
\]
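The bound of Lemma 20 can be compared against a quick simulation of two tokens under the uniformly random scheduler. This is a minimal sketch under stated assumptions: $E$ is taken to contain both orientations of every undirected edge with $m = |E|$, a token moves to the partner whenever its holder interacts, and the tokens meet when their two holders interact with each other. The 6-cycle and all function names are hypothetical choices for the example.

```python
import random
from statistics import mean
from typing import List, Tuple

Interaction = Tuple[int, int]


def cycle_edges(n: int) -> List[Interaction]:
    """Ordered interaction pairs of an undirected n-cycle (both orientations)."""
    edges: List[Interaction] = []
    for i in range(n):
        j = (i + 1) % n
        edges.append((i, j))
        edges.append((j, i))
    return edges


def meeting_time(edges: List[Interaction], u: int, v: int, rng: random.Random) -> int:
    """Number of interactions until the holders of the two tokens interact."""
    steps = 0
    while True:
        a, b = rng.choice(edges)      # uniformly random scheduler
        steps += 1
        if {a, b} == {u, v}:          # the two tokens meet
            return steps
        if a == u:                    # at most one token holder is involved here
            u = b
        elif b == u:
            u = a
        elif a == v:
            v = b
        elif b == v:
            v = a


rng = random.Random(0)
n = 6
edges = cycle_edges(n)
m, d = len(edges), n // 2             # the diameter of an even cycle is n/2
samples = [meeting_time(edges, 0, 3, rng) for _ in range(2000)]
average = mean(samples)
bound = m * n * n * d / 2             # m * n^2 * d / 2 from Lemma 20
```

On such a small cycle the empirical average is expected to sit far below the bound, which is consistent with the lemma being a worst-case estimate over all graphs and starting positions.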
Lemmas 18 and 19 give the following theorem.
Theorem 21. Protocol $P_{AR}$ is an $(O(mt_{epi} + mn^3d), \Omega(\tau e^{\tau}))$ loosely-stabilizing leader
election protocol for arbitrary graphs $G$ when $t_{max} \ge 8\delta \max(d, \lceil 2\log mn^3d\rceil)$, $t_{virus} = t_{max}/2$,
and $4\delta t_{max}\lceil\log n\rceil \le t_{epi} \le \tau e^{\tau}/(9n)$.
Therefore, given an upper bound $N$ of $n$ and an upper bound $\Delta$ of $\delta$, we have an $(O(mn^3d +
mN\Delta^2 \log N), \Omega(Ne^N))$ loosely-stabilizing leader election protocol for arbitrary graphs if we
assign $t_{max} = 8\Delta \max(N, \lceil 12\log N\rceil)$, $t_{virus} = t_{max}/2$, and $t_{epi} = 4\Delta t_{max}\lceil\log N\rceil$.
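The parameter assignment above is easy to evaluate for concrete bounds. The helper below is a sketch, not part of the protocol: the name `assign_parameters` is hypothetical, and the base of the logarithm is not recoverable from this section, so base 2 is used as an explicit assumption and can be swapped through the `log` argument.

```python
import math
from typing import Callable, Tuple


def assign_parameters(
    N: int,
    Delta: int,
    log: Callable[[float], float] = math.log2,  # assumption: binary logarithm
) -> Tuple[int, int, int]:
    """Return (t_max, t_virus, t_epi) from the upper bounds N and Delta."""
    t_max = 8 * Delta * max(N, math.ceil(12 * log(N)))
    t_virus = t_max // 2          # t_max is a multiple of 8, so this is exact
    t_epi = 4 * Delta * t_max * math.ceil(log(N))
    return t_max, t_virus, t_epi


# Hypothetical bounds: at most 16 agents, maximum degree at most 4.
t_max, t_virus, t_epi = assign_parameters(16, 4)
```

With these bounds, $\lceil 12 \log N\rceil = 48$ exceeds $N = 16$, so the logarithmic term determines $t_{max}$. Once $N$ is large enough that $N \ge \lceil 12\log N\rceil$, the term $N$ takes over, which is the regime behind the $\Omega(Ne^N)$ holding time.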
5 Conclusion
We have presented a loosely-stabilizing leader election protocol for arbitrary undirected graphs
in the population protocol model. Unlike our previous protocols, it uses neither agent identifiers nor
random numbers. Given upper bounds $N$ of $n$ and $\Delta$ of $\delta$, the population reaches
a safe configuration within $O(mn^3d + mN\Delta^2 \log N)$ expected interactions, and after that,
keeps a unique leader for $\Omega(Ne^N)$ expected interactions. The restriction to undirected graphs
is only for simplicity of the complexity analysis, and $P_{AR}$ works on arbitrary directed graphs
without modification.
References
[1] D. Angluin, J. Aspnes, Z. Diamadi, M. J. Fischer, and R. Peralta. Computation in networks of passively mobile finite-state sensors. Distributed Computing, 18(4):235-253, 2006. doi:10.1007/s00446-005-0138-3.
[2] D. Angluin, J. Aspnes, and D. Eisenstat. Fast computation by population protocols with a leader. In DISC, pages 61-75, 2006.
[3] ACM Transactions on Autonomous and Adaptive Systems, 3(4):13, 2008.
[4] J. Beauquier, P. Blanchard, and J. Burman. Self-stabilizing leader election in population protocols over arbitrary communication graphs. In OPODIS, pages 38-52, 2013.
[5] In OPODIS, pages 61-75, 2012.
[6] S. Cai, T. Izumi, and K. Wada. How to prove impossibility under global fairness: On space complexity of self-stabilizing leader election on a population protocol model. Theory of Computing Systems, 50(3):433-445, 2012.
[7] D. Canepa and M. G. Potop-Butucaru. Stabilizing leader election in population protocols, 2007. URL: http://hal.inria.fr/inria00166632.
[8] M. J. Fischer and H. Jiang. Self-stabilizing leader election in networks of finite-state anonymous agents. In OPODIS, pages 395-409, 2006. doi:10.1007/11945529_28.
[9] T. Izumi. On space and time complexity of loosely-stabilizing leader election. In SIROCCO, 2015.
[10] O. Michail, I. Chatzigiannakis, and P. G. Spirakis. Mediated population protocols. Theoretical Computer Science, 412(22):2434-2450, 2011.
[11] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[12] R. Mizoguchi, H. Ono, S. Kijima, and M. Yamashita. On space complexity of self-stabilizing leader election in mediated population protocol. Distributed Computing, 25(6):451-460, 2012.
[13] Y. Sudo, J. Nakamura, Y. Yamauchi, F. Ooshita, H. Kakugawa, and T. Masuzawa. Loosely-stabilizing leader election in a population protocol model. Theoretical Computer Science, 444:100-112, 2012.
[14] Y. Sudo, F. Ooshita, H. Kakugawa, and T. Masuzawa. Loosely-stabilizing leader election on arbitrary graphs in population protocols. In OPODIS, pages 339-354. Springer, 2014.
[15] X. Xu, Y. Yamauchi, S. Kijima, and M. Yamashita. Space complexity of self-stabilizing leader election in population protocol based on k-interaction. In SSS, pages 86-97, 2013.