#### Quantifier Alternation in Two-Variable First-Order Logic with Successor Is Decidable

STACS
Manfred Kufleitner and Alexander Lauser
University of Stuttgart, FMI, Germany
We consider the quantifier alternation hierarchy within two-variable first-order logic $\mathrm{FO}^2[<, \mathrm{suc}]$ over finite words with linear order and binary successor predicate. We give a single identity of omega-terms for each level of this hierarchy. This shows that for a given regular language and a non-negative integer $m$ it is decidable whether the language is definable by a formula in $\mathrm{FO}^2[<, \mathrm{suc}]$ which has at most $m$ quantifier alternations. We also consider the alternation hierarchy of unary temporal logic $\mathrm{TL}[X, F, Y, P]$ defined by the maximal number of nested negations. This hierarchy coincides with the $\mathrm{FO}^2[<, \mathrm{suc}]$ quantifier alternation hierarchy.

1998 ACM Subject Classification F.4.1 Mathematical Logic, F.4.3 Formal Languages

Keywords and phrases automata theory; semigroups; regular languages; first-order logic

The authors were supported by the German Research Foundation (DFG) under grant DI 435/5-1.

1 Introduction

Around 1960, Büchi, Elgot and Trakhtenbrot independently showed that monadic second-order logic (MSO) over finite words defines the class of regular languages [2, 6, 33]. Since then numerous fragments of MSO have been considered. A theoretical motivation for fragments is the study of the rich structure within the regular languages. For this purpose, fragments form the basis of a descriptive complexity theory: The simpler the formula for defining a language is, the simpler this language is. From a practical point of view, simpler fragments often lead to more efficient algorithms for decision problems such as satisfiability.

The most prominent fragment of MSO is first-order logic FO. The atomic predicates of FO are the unary predicate $\lambda(x) = a$ stating that position $x$ is labeled by the letter $a$, and the binary predicates $x = y$ and $x < y$ with the natural interpretation. The successor predicate $\mathrm{suc}(x, y)$ is easily definable in FO by saying that $x < y$ and that there is no position between $x$ and $y$. McNaughton and Papert showed that a language is FO-definable if and only if it is star-free [18]. Combined with Schützenberger's characterization of star-free languages in terms of finite aperiodic monoids [21], it follows that a language is FO-definable if and only if its syntactic monoid is aperiodic. The latter property is decidable and one can thus effectively check whether a regular language (given e.g. by a nondeterministic automaton or an MSO formula) is definable in FO.

The two most famous hierarchies within FO are the Straubing-Thérien hierarchy and Brzozowski's dot-depth hierarchy. The Straubing-Thérien hierarchy coincides with the quantifier alternation inside FO without the successor predicate [25, 29], and Brzozowski's dot-depth hierarchy is captured by quantifier alternation including the successor predicate [3]; see also [20, 31]. Here, quantifier alternation is defined in terms of blocks of quantifiers for formulae in prenex normal form. Note that by introducing new variables, every formula is equivalent to a formula in prenex normal form. Deciding membership of level $m$ for these hierarchies is one of the most challenging open problems in automata theory. To date only the very first levels (i.e., $m = 1$) of both hierarchies are known to be decidable [9, 24].
By Kamp's Theorem, first-order logic $\mathrm{FO}^3$ with only three different names for the variables and full first-order logic FO have the same expressive power [8]. However, two variables are not sufficient for defining all first-order definable languages. The fragment $\mathrm{FO}^2[<]$ without successor predicate has a huge number of different characterizations; see e.g. [5, 28]. One of them is the variety $\mathbf{DA}$ of finite monoids [22]; cf. [30]. For quantifier alternation inside $\mathrm{FO}^2$ one cannot readily rely on prenex normal forms. However, in $\mathrm{FO}^2$ negations can be moved towards the atomic formulae, and hence every formula is equivalent to a negation-free counterpart. The fragment $\mathrm{FO}^2_m$ consists of all $\mathrm{FO}^2$-formulae whose negation-free counterpart has at most $m$ blocks of quantifiers on each path of the parse tree. Kufleitner and Weil have shown that for every $m \ge 1$ it is decidable whether a given regular language is definable in $\mathrm{FO}^2_m[<]$ without successor predicate [16]. They have given an effective algebraic characterization in terms of levels of the Trotter-Weil hierarchy of finite monoids [34]; see also [15]. In addition, restrictions of many other characterizations of the $\mathrm{FO}^2[<]$-definable languages admit algebraic counterparts within this hierarchy [12, 17].

The proof of Kufleitner and Weil's characterization of $\mathrm{FO}^2_m[<]$ relies on a combinatorial tool known under the terms ranker [35] and turtle program [23]. A connection between $\mathrm{FO}^2_m[<]$ and rankers was established by Weis and Immerman [35] and further exploited by Kufleitner and Weil [17]. Straubing has given another algebraic characterization of $\mathrm{FO}^2_m[<]$ in terms of weakly iterated block products of $\mathcal{J}$-trivial monoids [27]. Recently, Krebs and Straubing [10] were able to use this characterization for giving identities of omega-terms for $\mathrm{FO}^2_m[<]$, thereby obtaining another effective characterization of $\mathrm{FO}^2_m[<]$.
In this paper, we consider the quantifier alternation hierarchy inside $\mathrm{FO}^2[<, \mathrm{suc}]$ with successor predicate. The logic $\mathrm{FO}^2[<, \mathrm{suc}]$ is strictly more expressive than $\mathrm{FO}^2[<]$ without successor. Thérien and Wilke [30] have given an algebraic characterization of $\mathrm{FO}^2[<, \mathrm{suc}]$ which, by a previous result of Almeida, is known to coincide with the decidable variety $\mathbf{LDA}$ of finite semigroups [1]; see also [4]. For every $m \ge 2$ we give a single identity of omega-terms such that a language is definable in $\mathrm{FO}^2_m[<, \mathrm{suc}]$ if and only if its syntactic semigroup satisfies this identity. It is thus decidable whether a given regular language is $\mathrm{FO}^2_m[<, \mathrm{suc}]$-definable. Our proof is by induction on $m$ with Knast's Theorem on dot-depth one [9] as base case. For $m = 1$, there is a small difference between the availability and the absence of min- and max-predicates; this is identical to the situation for dot-depth one [11]. The main ingredients of our proof are (i) string rewriting techniques, (ii) combinatorial properties of $\mathbf{LDA}$, and (iii) relativization techniques for $\mathrm{FO}^2_m$. As a byproduct, we show that quantifier alternation in $\mathrm{FO}^2[<, \mathrm{suc}]$ coincides with alternation in unary temporal logic $\mathrm{TL}[X, F, Y, P]$ where the latter is based on the nesting depth of negations. This last property can also be seen using a translation from $\mathrm{FO}^2$ to unary temporal logic by Etessami, Vardi, and Wilke [7].

Missing proofs can be found in the technical report [14].
2 Preliminaries

Throughout, $A$ denotes a finite alphabet. The set of all finite words is $A^*$ and the set of all finite, nonempty words is $A^+$. Let $u = a_1 \cdots a_n$ with $a_i \in A$. The set of positions of $u$ is $\mathrm{pos}(u) = \{1, \ldots, n\}$ and its length is $|u| = n$. If $I$ is an interval, then $u[I]$ denotes the factor of $u$ covered by the interval of positions $\mathrm{pos}(u) \cap I$. If $I = [i; j]$, then $u[i; j]$ is an abbreviation for $u[I]$. In particular, if $1 \le i \le j \le n$, then $u[i; j] = a_i \cdots a_j$. The $k$-factor alphabet is
$$\mathrm{alph}_k(u) = \{\, a_i \cdots a_{i+k-1} \in A^k \mid 1 \le i \le n - k + 1 \,\}.$$
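The k-factor alphabet can be computed directly from its definition. The following is a minimal sketch; the function name `factor_alphabet` and the representation of words as Python strings are assumptions made for illustration only.

```python
def factor_alphabet(u, k):
    """Return alph_k(u): the set of all factors of length k of the word u."""
    if k < 1:
        raise ValueError("k must be positive")
    # Start positions i range over 0 .. len(u) - k (0-based indexing);
    # the range is empty when u is shorter than k.
    return {u[i:i + k] for i in range(len(u) - k + 1)}


if __name__ == "__main__":
    print(sorted(factor_alphabet("abab", 2)))
```

For a word shorter than k the result is the empty set, in line with the index condition in the definition.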
First-Order Logic. We consider first-order logic over finite words with order and successor predicates. Atomic first-order formulae are $\top$ for true, $\bot$ for false, label predicates $\lambda(x) = a$ with $a \in A$, comparisons $x = y$, $x < y$ and successor $\mathrm{suc}(x, y)$ as well as minimum $\min(x)$ and maximum $\max(x)$. Here $x$ and $y$ are variables ranging over positions of a word which forms a model as a labeled, linearly ordered set of positions. Formulae can be composed by the usual Boolean connectives, i.e., if $\varphi$ and $\psi$ are first-order formulae, then so are the disjunction $\varphi \lor \psi$, the conjunction $\varphi \land \psi$, and the negation $\lnot\varphi$. Moreover, formulae can be composed by existential quantification $\exists x\, \varphi$ and universal quantification $\forall x\, \varphi$. The semantics is as usual; see e.g. [13, 32]. We use the notation $\varphi(x_1, \ldots, x_n)$ to indicate that at most the variables $x_1, \ldots, x_n$ occur freely in $\varphi$. We write $u \models \varphi(i_1, \ldots, i_n)$ for $u \in A^*$ and positions $i_j \in \mathrm{pos}(u)$ if $\varphi$ is true over $u$ with $x_j$ being interpreted by $i_j$. A formula without free variables is a sentence and in this case we simply write $u \models \varphi$. For any class $\mathcal{F}$ of first-order formulae, $\mathcal{F}[\mathcal{C}]$ is the restriction to formulae in $\mathcal{F}$ which, apart from $\top$, $\bot$, label predicates, and equality, only use predicates in $\mathcal{C} \subseteq \{<, \mathrm{suc}, \min, \max\}$.
The fragment $\mathrm{FO}^2 = \mathrm{FO}^2[<, \mathrm{suc}, \min, \max]$ of first-order logic contains all formulae which use at most two different names for variables, say $x$ and $y$. For $\mathrm{FO}^2$-formulae $\varphi(x)$ with free variable $x$ we stipulate the convention that $\varphi(y)$ is the $\mathrm{FO}^2$-formula obtained by interchanging $x$ and $y$. Using De Morgan's laws and the usual dualities between existential and universal quantifiers, one can see that every formula in $\mathrm{FO}^2$ is equivalent to a formula with negations only applied to atomic formulae. We call such formulae negation-free (since negations could be eliminated by adding negative predicates to an extended signature). The fragment $\mathrm{FO}^2_m$ consists of all formulae in $\mathrm{FO}^2$ with quantifier alternation depth at most $m$, i.e., formulae such that the negation-free counterpart has at most $m$ blocks of quantifiers on every path of the parse tree. Therefore, if we drop the two-variable restriction, every $\mathrm{FO}^2_m$-formula admits a prenex normal form with $m$ blocks of quantifiers. In other words, negation-free formulae in $\mathrm{FO}^2_m$ have at most $m - 1$ alternations of nested existential and universal quantifiers. Note that $\mathrm{FO}^2_m$ is closed under negation. The fragment $\mathrm{FO}^2_{m,n}$ contains all formulae in $\mathrm{FO}^2_m$ with quantifier depth at most $n$.
Unary Temporal Logic. Unary temporal logic $\mathrm{TL}[X, F, Y, P]$ consists of all formulae built from $\top$ for true, $\bot$ for false, labels $a$ with $a \in A$, compositions using Boolean connectives as in first-order logic, and temporal modalities $X\varphi$, $F\varphi$, $Y\varphi$, and $P\varphi$ for $\varphi \in \mathrm{TL}[X, F, Y, P]$. Formulae of unary temporal logic are interpreted over a word relative to a current position. The semantics is declared by the following $\mathrm{FO}^2$-formulae in one free variable: We let $a(x) \equiv \lambda(x) = a$ and
$$(X\varphi)(x) \equiv \exists y\; \mathrm{suc}(x, y) \land \varphi(y), \qquad (F\varphi)(x) \equiv \exists y\; x \le y \land \varphi(y),$$
$$(Y\varphi)(x) \equiv \exists y\; \mathrm{suc}(y, x) \land \varphi(y), \qquad (P\varphi)(x) \equiv \exists y\; y \le x \land \varphi(y).$$
Here and in the sequel, $\equiv$ means syntactic equality. We often use this symbol instead of equality in order to avoid confusion with the symbol $=$ occurring in atomic predicates. The formulae for the remaining constructs are as usual. The modalities $X$ (neXt) and $F$ (Future) are called future modalities whereas the modalities $Y$ (Yesterday) and $P$ (Past) are called past modalities. In order to define $u \models \varphi$ without a distinguished position in $u$, we start evaluation in front (position $0$) for future modalities and after (position $|u| + 1$) the word $u$ for past modalities. More formally, for a word $u \in A^*$ we define $u \not\models a$ and
$$u \models X\varphi \text{ if and only if } u \models \varphi(1), \qquad u \models F\varphi \text{ if and only if } u \models (F\varphi)(1),$$
$$u \models Y\varphi \text{ if and only if } u \models \varphi(|u|), \qquad u \models P\varphi \text{ if and only if } u \models (P\varphi)(|u|).$$
Boolean connectives and atomic formulae $\top$ and $\bot$ are defined as usual. For example, the formula $X a \land Y b$ defines the language $aA^*b$. Let $\mathrm{TL}_m[X, F, Y, P]$ be the fragment of unary temporal logic consisting of the Boolean combinations of formulae with at most $m - 1$ nested negations. Let $\mathrm{TL}_{m,n}[X, F, Y, P]$ consist of all formulae in $\mathrm{TL}_m[X, F, Y, P]$ with operator depth at most $n$, i.e., there are at most $n$ nested temporal modalities. For a formula $\varphi$ in first-order logic or in unary temporal logic, let $L(\varphi) = \{u \in A^+ \mid u \models \varphi\}$ be the language defined by $\varphi$.
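The semantics above can be turned into a small evaluator. The following is a sketch under stated assumptions: formulae are encoded as nested tuples (an encoding chosen here for illustration, not taken from the paper), F and P are read non-strictly as in the defining formulae, and a modality evaluated on the empty word is taken to be false.

```python
def holds_at(u, phi, x):
    """Evaluate phi at position x (1-based) of the nonempty word u."""
    n, op = len(u), phi[0]
    if op == "true":
        return True
    if op == "false":
        return False
    if op == "letter":
        return u[x - 1] == phi[1]
    if op == "not":
        return not holds_at(u, phi[1], x)
    if op == "and":
        return holds_at(u, phi[1], x) and holds_at(u, phi[2], x)
    if op == "or":
        return holds_at(u, phi[1], x) or holds_at(u, phi[2], x)
    if op == "X":  # successor position exists and satisfies the argument
        return x + 1 <= n and holds_at(u, phi[1], x + 1)
    if op == "Y":  # predecessor position exists and satisfies the argument
        return x - 1 >= 1 and holds_at(u, phi[1], x - 1)
    if op == "F":  # some position y with x <= y
        return any(holds_at(u, phi[1], y) for y in range(x, n + 1))
    if op == "P":  # some position y with y <= x
        return any(holds_at(u, phi[1], y) for y in range(1, x + 1))
    raise ValueError("unknown operator: %r" % (op,))


def models(u, phi):
    """Evaluate phi on u without a distinguished position."""
    n, op = len(u), phi[0]
    if op == "true":
        return True
    if op in ("false", "letter"):  # u does not model a bare letter
        return False
    if op == "not":
        return not models(u, phi[1])
    if op == "and":
        return models(u, phi[1]) and models(u, phi[2])
    if op == "or":
        return models(u, phi[1]) or models(u, phi[2])
    if op == "X":
        return n >= 1 and holds_at(u, phi[1], 1)
    if op == "Y":
        return n >= 1 and holds_at(u, phi[1], n)
    if op in ("F", "P"):  # (F phi)(1) and (P phi)(|u|) range over all positions
        return any(holds_at(u, phi[1], y) for y in range(1, n + 1))
    raise ValueError("unknown operator: %r" % (op,))


# The example formula X a AND Y b from the text.
XA_AND_YB = ("and", ("X", ("letter", "a")), ("Y", ("letter", "b")))
```

With this encoding, `models("acb", XA_AND_YB)` holds while `models("bca", XA_AND_YB)` does not, matching the language $aA^*b$ for distinct letters a and b.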
Algebra. Let $S$ be a finite semigroup. An element $x \in S$ is idempotent if $x^2 = x$. The set of all idempotents of $S$ is denoted $E(S)$. For every finite semigroup $S$ there exists an integer $\omega \ge 1$ such that each $\omega$-power is idempotent in $S$. Green's relations are an important concept in the structure theory of finite semigroups: For $x, y \in S$ let $x \le_{\mathcal{R}} y$ if $x = y$ or $x \in yS$ and symmetrically let $x \le_{\mathcal{L}} y$ if $x = y$ or $x \in Sy$. For $\mathcal{G} \in \{\mathcal{R}, \mathcal{L}\}$ let $x \mathrel{\mathcal{G}} y$ if both $x \le_{\mathcal{G}} y$ and $y \le_{\mathcal{G}} x$; and let $x <_{\mathcal{G}} y$ if $x \le_{\mathcal{G}} y$ but not $y \le_{\mathcal{G}} x$. We also view $S$ as an alphabet and write $u \in S^+$ for a word with letters from $S$. For words $u, v \in S^+$ we say that a relation $u \mathrel{\mathcal{G}} v$ "holds in $S$", if the relation is satisfied after evaluating $u$ and $v$ in $S$. We use this frequently for equality and Green's relations. All semigroups in this paper are nonempty.
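Green's preorders can be checked mechanically on a finite semigroup. The sketch below assumes (for illustration only) that the semigroup has elements 0, ..., n-1 and is given by a multiplication table `mul` with `mul[x][y]` the product xy; the function names are hypothetical.

```python
def leq_R(mul, x, y):
    """x <=_R y: x equals y, or x lies in the right ideal yS."""
    return x == y or any(mul[y][s] == x for s in range(len(mul)))


def leq_L(mul, x, y):
    """x <=_L y: x equals y, or x lies in the left ideal Sy."""
    return x == y or any(mul[s][y] == x for s in range(len(mul)))


def R_equivalent(mul, x, y):
    return leq_R(mul, x, y) and leq_R(mul, y, x)


def L_equivalent(mul, x, y):
    return leq_L(mul, x, y) and leq_L(mul, y, x)


# Two-element left-zero semigroup: xy = x for all x, y.
LEFT_ZERO = [[0, 0],
             [1, 1]]
```

In the left-zero semigroup, yS = {y} and Sy = S, so distinct elements are L-equivalent but not R-equivalent.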
Classes of finite semigroups are often defined by identities of omega-terms. An omega-term over a set of variables $\Sigma$ is defined inductively. Every $x \in \Sigma$ is an omega-term, and if $u$ and $v$ are omega-terms, then so are $uv$ and $u^\omega$. A finite semigroup $S$ satisfies the identity $u = v$ if for each homomorphism $h : \Sigma^+ \to S$ we have $h(u) = h(v)$. Here, $h$ is extended to omega-terms by letting $h(u^\omega)$ be the idempotent generated by $h(u)$.

For every $e \in E(S)$ the set $eSe$ forms the so-called local monoid at $e$. A semigroup $S$ belongs to $\mathbf{LDA}$ if every local monoid $eSe$ satisfies $(xy)^\omega x (xy)^\omega = (xy)^\omega$. This is equivalent to saying that we have $(exeye)^\omega exe (exeye)^\omega = (exeye)^\omega$ in $S$ for all $x, y \in S$ and all $e \in E(S)$. Note that if $S$ is in $\mathbf{LDA}$ and if $e \in E(S)$ and $x, y \in eSe$, then
$$(xy)^\omega = (xy)^{\omega-1} x (yx)^\omega y = (xy)^{\omega-1} x (yx)^\omega y (yx)^\omega y = (xy)^{2\omega} y (xy)^\omega = (xy)^\omega y (xy)^\omega.$$
Thus despite its asymmetric definition, $\mathbf{LDA}$ is left-right-symmetric.
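Membership in LDA is decidable by checking the identity on all elements. The following brute-force sketch assumes a semigroup with elements 0, ..., n-1 given by a multiplication table `mul` (an encoding chosen for illustration); it tests the element-wise form of the identity stated above.

```python
from itertools import product


def omega_power(mul, x):
    """Return the idempotent generated by x, i.e. the idempotent power of x."""
    p = x
    while mul[p][p] != p:  # a finite semigroup always has such a power
        p = mul[p][x]
    return p


def times(mul, *xs):
    """Multiply the given elements from left to right."""
    r = xs[0]
    for x in xs[1:]:
        r = mul[r][x]
    return r


def is_in_LDA(mul):
    """Check (exeye)^w exe (exeye)^w = (exeye)^w for all x, y and idempotents e."""
    n = len(mul)
    idempotents = [e for e in range(n) if mul[e][e] == e]
    for e in idempotents:
        for x, y in product(range(n), repeat=2):
            exe = times(mul, e, x, e)
            w = omega_power(mul, times(mul, e, x, e, y, e))
            if times(mul, w, exe, w) != w:
                return False
    return True


SEMILATTICE = [[0, 0], [0, 1]]  # {0, 1} under minimum
Z2 = [[0, 1], [1, 0]]           # cyclic group of order two, 0 is the identity
```

The two-element semilattice passes the check, whereas the group of order two fails it. The running time is cubic in the size of the table, which is fine for a sketch but not tuned for large semigroups.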
A homomorphism $h : A^+ \to S$ to a finite semigroup $S$ recognizes a language $L \subseteq A^+$ if $h^{-1}(h(L)) = L$. A semigroup $S$ recognizes $L \subseteq A^+$ if there exists a homomorphism $h : A^+ \to S$ which recognizes $L$. For $u, v \in A^+$ let $u \equiv_L v$ if $puq \in L$ is equivalent to $pvq \in L$ for all $p, q \in A^*$. The relation $\equiv_L$ over $A^+$ is a congruence and the semigroup $A^+/{\equiv_L}$, also denoted by $\mathrm{Synt}(L)$ and called the syntactic semigroup of $L$, is the unique minimal semigroup recognizing $L$. Moreover, it is effectively computable (e.g. from an automaton for $L$); cf. [19].
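One standard way to compute the syntactic semigroup is as the transition semigroup of the minimal complete deterministic automaton, generated by the transformations of the letters. The sketch below assumes that the given automaton is already minimal and complete; the encoding of transformations as tuples and the example automaton for $aA^*b$ are illustrative choices, not taken from the paper.

```python
def transition_semigroup(n_states, delta):
    """Return the set of state transformations induced by nonempty words.

    delta maps each letter to a tuple t with t[q] the successor of state q.
    """
    generators = [tuple(t) for t in delta.values()]
    elements = set(generators)
    frontier = list(elements)
    while frontier:
        f = frontier.pop()
        for g in generators:
            # Transformation of the word (word of f) followed by one letter.
            h = tuple(g[f[q]] for q in range(n_states))
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


# Minimal complete DFA for a A* b over {a, b}:
# 0 = initial, 1 = started with a and last letter is a,
# 2 = started with a and last letter is b (accepting), 3 = sink.
DELTA = {
    "a": (1, 1, 1, 3),
    "b": (3, 2, 2, 3),
}
```

For this automaton the closure yields four transformations, one for each combination of first and last letter of a nonempty word.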
3 Alternation within Two-Variable First-Order Logic with Successor

We define classes $\mathbf{W}_m$ of finite semigroups which will yield an algebraic characterization of $\mathrm{FO}^2_m[<, \mathrm{suc}]$. To this end, we inductively define sequences of omega-terms $U_m$, $V_m$ with variables $e$, $f$, $x_i$, $y_i$, $s$, $t$, $p_i$, $q_i$. For $m = 1$ we define
$$U_1 = (e^\omega s f^\omega x_1 e^\omega)^\omega\, s\, (f^\omega y_1 e^\omega t f^\omega)^\omega, \qquad V_1 = (e^\omega s f^\omega x_1 e^\omega)^\omega\, t\, (f^\omega y_1 e^\omega t f^\omega)^\omega$$
and for $m \ge 2$
$$U_m = (p_m U_{m-1} q_m x_m)^\omega\, p_m U_{m-1} q_m\, (y_m p_m U_{m-1} q_m)^\omega,$$
$$V_m = (p_m U_{m-1} q_m x_m)^\omega\, p_m V_{m-1} q_m\, (y_m p_m U_{m-1} q_m)^\omega.$$
By definition, a semigroup is in $\mathbf{W}_m$ if it satisfies the identity $U_m = V_m$. The class $\mathbf{W}_1$ is Knast's algebraic characterization of dot-depth one [9]. The only difference between $U_1$ and $V_1$ is the central variable in $U_1$ being $s$ and in $V_1$ being $t$. Intuitively, this difference is hidden more and more in $U_m$ and $V_m$ with increasing $m$.
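Since each level is given by a single identity, membership in the class can be tested by evaluating both omega-terms under every assignment of the variables. The following brute-force sketch builds the two terms as nested tuples and checks the identity on a semigroup given by a multiplication table; the term encoding, the function names, and the table format are assumptions made for illustration, and the enumeration is exponential in the number of variables.

```python
from itertools import product


def var(name):
    return ("var", name)


def cat(*terms):
    return ("cat",) + terms


def om(term):
    return ("omega", term)


def build_UV(m):
    """Return the omega-terms (U_m, V_m) following the inductive definition."""
    e, f, s, t = om(var("e")), om(var("f")), var("s"), var("t")
    left = om(cat(e, s, f, var("x1"), e))
    right = om(cat(f, var("y1"), e, t, f))
    U, V = cat(left, s, right), cat(left, t, right)
    for i in range(2, m + 1):
        p, q, x, y = (var(c + str(i)) for c in "pqxy")
        l, r = om(cat(p, U, q, x)), om(cat(y, p, U, q))
        U, V = cat(l, p, U, q, r), cat(l, p, V, q, r)
    return U, V


def evaluate(term, mul, env):
    kind = term[0]
    if kind == "var":
        return env[term[1]]
    if kind == "omega":
        x = evaluate(term[1], mul, env)
        p = x
        while mul[p][p] != p:  # idempotent power of x
            p = mul[p][x]
        return p
    values = [evaluate(t, mul, env) for t in term[1:]]
    r = values[0]
    for v in values[1:]:
        r = mul[r][v]
    return r


def variables(term):
    if term[0] == "var":
        return {term[1]}
    return set().union(*(variables(t) for t in term[1:]))


def satisfies(mul, U, V):
    """Check the identity U = V under all assignments of the variables."""
    names = sorted(variables(U) | variables(V))
    for values in product(range(len(mul)), repeat=len(names)):
        env = dict(zip(names, values))
        if evaluate(U, mul, env) != evaluate(V, mul, env):
            return False
    return True


SEMILATTICE = [[0, 0], [0, 1]]  # {0, 1} under minimum
Z2 = [[0, 1], [1, 0]]           # cyclic group of order two
```

The two-element semilattice satisfies the identities for m = 1 and m = 2, while the group of order two already violates the identity for m = 1, since there every omega-power is the identity element and the two sides evaluate to s and t.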
The following result is the main contribution of this paper. The remainder of this section
is dedicated to its proof.
Theorem 1. Let $m \ge 2$ and let $L \subseteq A^+$. The following assertions are equivalent:
1. $L$ is definable in $\mathrm{FO}^2_m[<, \mathrm{suc}]$.
2. $L$ is definable in $\mathrm{TL}_m[X, F, Y, P]$.
3. $\mathrm{Synt}(L) \in \mathbf{W}_m$.

Before turning to the proof of Theorem 1 we record the following decidability corollary. For $m = 1$ it relies on a characterization of two-sided ideals inside dot-depth one [11].

Corollary 2. For every positive integer $m$ one can decide whether a given regular language $L \subseteq A^+$ is definable in $\mathrm{FO}^2_m[<, \mathrm{suc}]$.
We start with the hard part of the proof of Theorem 1, i.e., with the implication from (3) to (1). This is essentially Proposition 13 whose proof requires some preparatory work: We first show that every $\mathbf{W}_m$ is contained in $\mathbf{LDA}$ (Lemma 3) which allows us to use a combinatorial property of $\mathbf{LDA}$ (given in Lemma 6). Then a relativization technique for $\mathrm{FO}^2_m$ (Lemma 7) is used for defining a congruence $\approx_{m,n}$ (Definition 8) as a tool for $\mathrm{FO}^2_m$. The connection between this congruence and $\mathrm{FO}^2_m$ is established by Lemma 10. Using a string rewriting system, a special factorization (given in Lemma 12) finally leads to an inductive scheme to prove Proposition 13.
In the proof of Theorem 1 at the very end of this section we sketch how to show the
reverse implication as well as how to incorporate unary temporal logic.
Lemma 3. For all $m \ge 1$ we have $\mathbf{W}_m \subseteq \mathbf{LDA}$.

Proof. Let $S$ be a finite semigroup and let $\omega \ge 1$ be an integer such that $x^\omega$ is idempotent for all $x \in S$. Let $x, y \in S$ and let $e \in S$ be idempotent. Setting $e_1 = f_1 = s = e$, $x_1 = xey$, $y_1 = x$, $t = y$ we get $U_1 = (exeye)^\omega$ in $S$ and $V_1 = (exeye)^\omega eye (exeye)^\omega$ in $S$. Setting all other variables occurring in $U_m$ or in $V_m$ to be $e$, we see $U_m = (exeye)^\omega$ in $S$ and $V_m = (exeye)^\omega eye (exeye)^\omega$ in $S$. Thus if $S \in \mathbf{W}_m$ and $e \in E(S)$, then $eSe$ satisfies the identity $(xy)^\omega = (xy)^\omega y (xy)^\omega$, i.e., $S \in \mathbf{LDA}$. □
The next lemma is an intermediate result for Lemma 5 and Lemma 6 both of which yield important combinatorial properties of semigroups in $\mathbf{LDA}$.

Lemma 4. Let $S \in \mathbf{LDA}$, let $x, y, z \in S$, and let $e \in E(S)$.
1. If $xe \mathrel{\mathcal{R}} ye$ in $S$, then $xe \mathrel{\mathcal{R}} xez$ if and only if $ye \mathrel{\mathcal{R}} yez$.
2. If $ex \mathrel{\mathcal{L}} ey$ in $S$, then $ex \mathrel{\mathcal{L}} zex$ if and only if $ey \mathrel{\mathcal{L}} zey$.

Proof. Since $\mathbf{LDA}$ is left-right symmetric, it suffices to show (1). Suppose $xe \mathrel{\mathcal{R}} xez$. Since $ye \mathrel{\mathcal{R}} xe \mathrel{\mathcal{R}} xez$ there exist $s, t$ such that $xe = yes$ and $ye = xezt$. We get $ye = ye(esezte)$. Pumping the factor in the parentheses and using $\mathbf{LDA}$ yields $ye = ye(esezte)^\omega = ye(esezte)^\omega ezte (esezte)^\omega \in yezS$. □
Lemma 5. Let $S \in \mathbf{LDA}$, let $u, v \in S^+$, let $s, t \in S^*$ with $\mathrm{alph}_{|S|+1}(vs) = \mathrm{alph}_{|S|+1}(vt)$ and $|v| \ge |S|$.
1. If $u \mathrel{\mathcal{R}} uv$ in $S$, then $u \mathrel{\mathcal{R}} uvs$ in $S$ if and only if $u \mathrel{\mathcal{R}} uvt$ in $S$.
2. If $u \mathrel{\mathcal{L}} vu$ in $S$, then $u \mathrel{\mathcal{L}} svu$ in $S$ if and only if $u \mathrel{\mathcal{L}} tvu$ in $S$.

Proof. Since $\mathbf{LDA}$ is left-right symmetric, it suffices to show (1). Assume $u \mathrel{\mathcal{R}} uv \mathrel{\mathcal{R}} uvs$ in $S$. We want to show $u \mathrel{\mathcal{R}} uvt$ in $S$. This is trivial if $t$ is the empty word. Otherwise we factorize $vt = pwz$ such that $|w| < |wz| = |S| + 1$ with $w = we$ in $S$ for some idempotent $e$ of $S$. Note that every sequence $x_1, \ldots, x_{|S|} \in S$ has a prefix which admits an idempotent stabilizer, i.e., there exist $i \in \{1, \ldots, |S|\}$ and $e \in E(S)$ such that $x_1 \cdots x_i = x_1 \cdots x_i e$ in $S$; see e.g. [11, Lemma 1] for a proof of this claim. Since $vs$ and $vt$ have the same factors of length $|S| + 1$, we find a factorization $vs = s_1 w z s_2$. Let $x = u s_1 w$ and $y = upw$. By induction $u \mathrel{\mathcal{R}} y$ and thus $xe = x \mathrel{\mathcal{R}} y = ye$ in $S$. Moreover, $xe \mathrel{\mathcal{R}} xez$ and by Lemma 4 we see $ye \mathrel{\mathcal{R}} yez$ in $S$. This implies the claim. □
Choosing $s$ to be the empty word and $t = a$ immediately yields the following consequence.

Lemma 6. Let $S \in \mathbf{LDA}$, let $u, v \in S^+$, let $a \in S$ and let $|v| \ge |S|$.
1. If $u \mathrel{\mathcal{R}} uv >_{\mathcal{R}} uva$ in $S$, then $\mathrm{alph}_{|S|+1}(v) \ne \mathrm{alph}_{|S|+1}(va)$.
2. If $u \mathrel{\mathcal{L}} vu >_{\mathcal{L}} avu$ in $S$, then $\mathrm{alph}_{|S|+1}(v) \ne \mathrm{alph}_{|S|+1}(av)$.

The next lemma gives the main combinatorial properties of $\mathrm{FO}^2_m[<, \mathrm{suc}]$ for our purpose, namely relativizations of formulae to certain factors of deterministic factorizations.

Lemma 7. Let $\varphi \in \mathrm{FO}^2[<, \mathrm{suc}]$ and let $v, w \in A^+$.
1. There exist formulae $\langle\varphi\rangle_{<Xw}$ and $\langle\varphi\rangle_{>Xw}$ such that for all $u = u_1 w u_2$ with a unique occurrence of the factor $w$ in the prefix $u_1 w$:
$u \models \langle\varphi\rangle_{<Xw}(i, j)$ iff $u_1 \models \varphi(i, j)$ for all $1 \le i, j \le |u_1|$,
$u \models \langle\varphi\rangle_{>Xw}(i, j)$ iff $u_2 \models \varphi(i - |u_1 w|, j - |u_1 w|)$ for all $|u_1 w| < i, j \le |u|$.
2. There exist formulae $\langle\varphi\rangle_{<Yv}$ and $\langle\varphi\rangle_{>Yv}$ such that for all $u = u_1 v u_2$ with a unique occurrence of the factor $v$ in the suffix $v u_2$:
$u \models \langle\varphi\rangle_{<Yv}(i, j)$ iff $u_1 \models \varphi(i, j)$ for all $1 \le i, j \le |u_1|$,
$u \models \langle\varphi\rangle_{>Yv}(i, j)$ iff $u_2 \models \varphi(i - |u_1 v|, j - |u_1 v|)$ for all $|u_1 v| < i, j \le |u|$.
3. There exists a formula $\langle\varphi\rangle_{[v;w]}$ such that for all $u = u_1 v u_2 w u_3$ with a unique occurrence of the factor $v$ in $v u_2 w u_3$ and a unique occurrence of the factor $w$ in $u_1 v u_2 w$:
$u \models \langle\varphi\rangle_{[v;w]}(i, j)$ iff $u_2 \models \varphi(i - |u_1 v|, j - |u_1 v|)$ for all $|u_1 v| < i, j \le |u_1 v u_2|$.
Moreover, if $\varphi \in \mathrm{FO}^2_{m,n}[<, \mathrm{suc}]$, then
1. $\langle\varphi\rangle_{<Xw} \in \mathrm{FO}^2_{m+1,n+|w|}[<, \mathrm{suc}]$ and $\langle\varphi\rangle_{>Xw} \in \mathrm{FO}^2_{m,n+|w|}[<, \mathrm{suc}]$,
2. $\langle\varphi\rangle_{<Yv} \in \mathrm{FO}^2_{m,n+|v|}[<, \mathrm{suc}]$ and $\langle\varphi\rangle_{>Yv} \in \mathrm{FO}^2_{m+1,n+|v|}[<, \mathrm{suc}]$, and
3. $\langle\varphi\rangle_{[v;w]} \in \mathrm{FO}^2_{m+1,n+N}[<, \mathrm{suc}]$ for $N = \max\{|v|, |w|\}$.

The relativization of the previous lemma leads to the congruence in the following definition. This congruence is our tool for the combinatorics of $\mathrm{FO}^2_m$ in the subsequent proofs.

Definition 8. Let $u, v \in A^*$. For $m, n \ge 0$ we let $u \approx_{m,0} v$ and $u \approx_{0,n} v$. For $n \ge 1$ let $u \approx_{1,n} v$ if $u$ and $v$ are contained in the same monomials $w_1 A^+ w_2 \cdots A^+ w_\ell$ with $w_i \in A^+$ and $|w_1 \cdots w_\ell| \le n$. For $m \ge 2$ and $n \ge 1$ let $u \approx_{m,n} v$ if $\mathrm{alph}_k(u) = \mathrm{alph}_k(v)$ and $\mathrm{pref}_k(u) = \mathrm{pref}_k(v)$ and $\mathrm{suff}_k(u) = \mathrm{suff}_k(v)$ for all $k \le n$, and all of the following hold:
1. if $u = u_1 w u_2$ and $v = v_1 w v_2$ with $1 \le |w| \le n$ such that the factor $w$ has a unique occurrence in the prefixes $u_1 w$ and $v_1 w$, then $u_1 \approx_{m-1,n-|w|} v_1$ and $u_2 \approx_{m,n-|w|} v_2$,
2. if $u = u_1 w u_2$ and $v = v_1 w v_2$ with $1 \le |w| \le n$ such that the factor $w$ has a unique occurrence in the suffixes $w u_2$ and $w v_2$, then $u_1 \approx_{m,n-|w|} v_1$ and $u_2 \approx_{m-1,n-|w|} v_2$,
3. if $u = u_1 w u_2 w' u_3$ and $v = v_1 w v_2 w' v_3$ with $|ww'| \le n$ such that the factor $w$ has a unique occurrence in the suffixes $w u_2 w' u_3$ and $w v_2 w' v_3$ and such that the factor $w'$ has a unique occurrence in the prefixes $u_1 w u_2 w'$ and $v_1 w v_2 w'$, then $u_2 \approx_{m-1,n-|ww'|} v_2$.
An elementary verification shows that $\approx_{m,n}$ is a congruence. Since this fact is not used in this paper, we do not record it as a lemma. The following is also straightforward.

Lemma 9. If $m, n \ge 1$ and $u, v \in A^*$ with $u \approx_{m,n} v$, then $u \approx_{m-1,n} v$ and $u \approx_{m,n-1} v$.

The next lemma connects $\mathrm{FO}^2_{m,n}$ with the combinatorial properties captured by $\approx_{m,n}$. For $u, v \in A^*$ let $u \equiv_{1,n} v$ if $u$ and $v$ model the same formulae in $\mathrm{FO}^2_{1,n}[<, \mathrm{suc}, \min, \max]$. For $m \ge 2$ and $u, v \in A^*$ let $u \equiv_{m,n} v$ if $u$ and $v$ model the same formulae in $\mathrm{FO}^2_{m,n}[<, \mathrm{suc}]$. We have to include min and max predicates at level 1 for technical reasons.

Lemma 10. If $m, n \ge 0$ and $u, v \in A^*$ with $u \equiv_{m,n+1} v$, then $u \approx_{m,n} v$.

In other words the previous lemma shows that $\equiv_{m,n+1}$ is a refinement of $\approx_{m,n}$. In particular, $\approx_{m,n}$ has finite index. The next lemma is an auxiliary statement used in the proof of Lemma 12. It says that $\approx_{1,n}$ equivalence of $u$ and $v$ allows order comparison for certain factors in the words $u$ and $v$.
Lemma 11. Let $u, v \in A^+$ and consider factorizations $u = x_1 u_1 \cdots x_k u_k = u'_1 y_1 \cdots u'_\ell y_\ell$ and $v = x_1 v_1 \cdots x_k v_k = v'_1 y_1 \cdots v'_\ell y_\ell$ with $k, \ell \ge 1$ and $u'_1, v'_1, u_k, v_k \in A^*$ and $x_i, y_i \in A^+$ such that
$x_1 u_1 \cdots x_k$ is the shortest prefix of $u$ contained in $x_1 A^+ x_2 \cdots A^+ x_k$ and
$x_1 v_1 \cdots x_k$ is the shortest prefix of $v$ contained in $x_1 A^+ x_2 \cdots A^+ x_k$,
$y_1 \cdots u'_\ell y_\ell$ is the shortest suffix of $u$ contained in $y_1 A^+ y_2 \cdots A^+ y_\ell$ and
$y_1 \cdots v'_\ell y_\ell$ is the shortest suffix of $v$ contained in $y_1 A^+ y_2 \cdots A^+ y_\ell$.
Let $\delta_u = |u| - |x_1 u_1 \cdots u_{k-1}| - |u'_2 \cdots u'_\ell y_\ell|$ and let $\delta_v = |v| - |x_1 v_1 \cdots v_{k-1}| - |v'_2 \cdots v'_\ell y_\ell|$. If $u \approx_{1,n} v$ for $n = |x_1 \cdots x_k| + |y_1 \cdots y_\ell|$, then the relative order of the occurrences of $x_k$ and $y_1$ is the same in $u$ and $v$, i.e., one of the following conditions applies:
1. $\delta_u > |x_k y_1|$ and $\delta_v > |x_k y_1|$.
2. $\delta_u < 0$ and $\delta_v < 0$.
3. $\delta_u = \delta_v$.
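The shortest prefix of a word contained in a monomial of the form used in Lemma 11 can be found by matching the blocks greedily from left to right, always taking the leftmost admissible occurrence. The sketch below assumes words are Python strings and all blocks are nonempty; the function name is hypothetical.

```python
def shortest_prefix_in_monomial(u, blocks):
    """Length of the shortest prefix of u in x1 A+ x2 ... A+ xk, or None.

    blocks is the list [x1, ..., xk] of nonempty words.
    """
    if not blocks or any(len(x) == 0 for x in blocks):
        raise ValueError("blocks must be a nonempty list of nonempty words")
    if not u.startswith(blocks[0]):
        return None  # the monomial forces x1 to be a prefix
    end = len(blocks[0])
    for x in blocks[1:]:
        # A+ requires at least one letter between consecutive blocks,
        # so the next block may start at 0-based index end + 1 at the earliest.
        start = u.find(x, end + 1)
        if start < 0:
            return None
        end = start + len(x)
    return end


if __name__ == "__main__":
    print(shortest_prefix_in_monomial("abacab", ["a", "a"]))
```

Taking the leftmost occurrence at each step is safe because an earlier end position for one block never rules out a match for the following blocks. The symmetric computation for shortest suffixes can be obtained by reversing the word and the blocks.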
The main combinatorial ingredient for the implication from $\mathbf{W}_m$ to $\mathrm{FO}^2_m$ is the factorization in the following lemma. It combines properties of $\mathbf{LDA}$ and $\approx_{m,n}$.

Lemma 12. Let $S \in \mathbf{LDA}$, let $m \ge 2$, let $N = 2|S|^2$ and let $u, v \in S^+$ such that $u \approx_{m,n+N} v$. Then there exist factorizations $u = w_0 s_1 w_1 \cdots s_\ell w_\ell$ and $v = w_0 t_1 w_1 \cdots t_\ell w_\ell$ with $w_i, s_i, t_i \in S^+$ and $|w_0 \cdots w_\ell| \le N$ such that for all $1 \le i \le \ell$ the following hold:
1. $s_i \approx_{m-1,n} t_i$,
2. $w_0 s_1 \cdots w_{i-1} \mathrel{\mathcal{R}} w_0 s_1 \cdots w_{i-1} s_i$ in $S$,
3. $w_i \cdots t_\ell w_\ell \mathrel{\mathcal{L}} t_i w_i \cdots t_\ell w_\ell$ in $S$.
Proof. Let $X' = \{1\} \cup \{i \in \mathrm{pos}(u) \mid 1 < i \le |u|,\ u[1; i-1] >_{\mathcal{R}} u[1; i] \text{ in } S\}$ be the set of positions of $u$ which cause an $\mathcal{R}$-descent when reading $u$ from left to right. Let $X$ be the set of positions $j$ such that there exists $i \in X'$ with $0 \le i - j \le |S|$, i.e., we include all $|S|$ positions to the left of each $i \in X'$. Let $Y'$ and $Y$ be defined left-right symmetrically on $v$, i.e., $Y' = \{|v|\} \cup \{i \in \mathrm{pos}(v) \mid 1 < i \le |v|,\ v[i-1; |v|] >_{\mathcal{L}} v[i; |v|] \text{ in } S\}$ and $Y$ is the set of positions $j$ such that $0 \le j - i \le |S|$ for some $i \in Y'$. Let $X = X_1 \cup \cdots \cup X_k$ with $X_i \ne \emptyset$ being maximal subsets of consecutive positions of $X$ such that all positions of $X_i$ are smaller than all positions of $X_{i+1}$. Symmetrically, let $Y = Y_1 \cup \cdots \cup Y_{k'}$ with $Y_i \ne \emptyset$ being maximal subsets of consecutive positions of $Y$ such that all positions of $Y_i$ are smaller than all positions of $Y_{i+1}$.

Let $x_i = u[X_i]$ and $y_i = v[Y_i]$ be the factors of $u$ and $v$ covered by the positions of $X_i$ and $Y_i$, respectively. By construction and Lemma 6 (1), we see that $u[1; \max(X_i)]$ is the shortest prefix of $u$ which is contained in $x_1 S^+ x_2 \cdots S^+ x_i$. Symmetrically, $v[\min(Y_i); |v|]$ is the shortest suffix of $v$ which is contained in $y_i S^+ y_{i+1} \cdots S^+ y_{k'}$ by Lemma 6 (2). We use these properties to transfer the positions of $X$ to $v$ and the positions of $Y$ to $u$. Specifically we let $Y'' = Y''_1 \cup \cdots \cup Y''_{k'}$ be such that each $Y''_i$ is an interval of positions of $u$ with $u[Y''_i] = y_i$ and $u[\min(Y''_i); |u|]$ is the shortest suffix of $u$ which is contained in $y_i S^+ y_{i+1} \cdots S^+ y_{k'}$. And we let $X'' = X''_1 \cup \cdots \cup X''_k$ be such that each $X''_i$ is an interval of positions of $v$ with $v[X''_i] = x_i$ and $v[1; \max(X''_i)]$ is the shortest prefix of $v$ which is contained in $x_1 S^+ x_2 \cdots S^+ x_i$. Note that $u \in S^* y_1 S^+ y_2 \cdots S^+ y_{k'}$ and $v \in x_1 S^+ x_2 \cdots S^+ x_k S^*$ because $u \approx_{m,n+N} v$.

Now, consider the factorization $u = w_0 s_1 w_1 \cdots s_\ell w_\ell$ with $s_i \in S^+$ such that the $w_i$ are the factors covered by maximal subsets of consecutive positions in $X \cup Y''$. Intuitively, this means that we merge overlapping and adjacent factors $x_i$ and $y_j$ in $u$. Lemma 11 shows that the relative order of those concrete occurrences of $x_i$ and $y_j$ is the same in $v$ as in $u$. Therefore, if we consider the factorization of $v$ which is covered by maximal subsets of consecutive positions in $X'' \cup Y$, then we end up with the same factors in the same order, i.e., we have $v = w_0 t_1 w_1 \cdots t_\ell w_\ell$ for some $t_i \in S^+$. Since the $\mathcal{R}$-class and the $\mathcal{L}$-class can descend at most $|S| - 1$ times, we have $|X' \cup Y'| \le 2|S|$ and thus $|w_0 \cdots w_\ell| \le |X \cup Y''| \le 2|S|^2$. Moreover, by construction every $\mathcal{R}$-descent when reading prefixes of $u$ as well as every $\mathcal{L}$-descent when reading suffixes of $v$ is covered by some factor $w_i$, showing (2) and (3).
It remains to show $s_i \approx_{m-1,n} t_i$ for all $i$. An intermediate step is the following claim.

Claim. If $s_k w_k \cdots s_\ell w_\ell \approx_{m,n+N} t_k w_k \cdots t_\ell w_\ell$ for some $N \ge |w_k \cdots w_\ell|$, then $s_i \approx_{m-1,n} t_i$ for all $i \in \{k, \ldots, \ell\}$.

The proof of this claim is by induction on $\ell - k$. Every $w_i$ either arises from some $x_j$ or some $y_j$ or both. Therefore, the $w_i$'s inherit the properties of the corresponding $x_j$'s and $y_j$'s of being the first occurrence (respectively being the last occurrence). If there is no $w_i$ arising from an $x_j$, then every $w_i$ has a unique occurrence in $w_i s_{i+1}$ as well as in $w_i t_{i+1}$. Thus $s_i \approx_{m-1,n} t_i$ for all $i$ by an $(\ell - k)$-fold application of condition (2) in the definition of $\approx_{m,n}$ (from right to left). For $i = k$ this uses Lemma 9.

Fix the first $w_i$ which arises from an $x_j$. We have $s_j \approx_{m-1,n} t_j$ for all $j > i$ by condition (1) in the definition of $\approx_{m,n}$ and induction. If $i = k$, then $s_k \approx_{m-1,n} t_k$ again by condition (1) in the definition of $\approx_{m,n}$. Assume therefore $i > k$ in the sequel. Let $h \ge i$ be minimal such that $w_h$ arises from some $y_j$; note that $w_\ell$ arises from $y_{k'}$. By a repeated application of condition (2) in the definition of $\approx_{m,n}$ we get that $s_k w_k \cdots s_h \approx_{m,n+N'} t_k w_k \cdots t_h$ for $N' = |w_k \cdots w_{h-1}|$. Now $w_{i-1}$ has a unique occurrence in each of the words $w_{i-1} s_i \cdots s_h$ and $w_{i-1} t_i \cdots t_h$. Therefore, by repeatedly applying condition (2) in the definition of $\approx_{m,n}$ we see that $s_j \approx_{m-1,n} t_j$ for all $k \le j < i$. If $h > i$, then by condition (3) in the definition of $\approx_{m,n}$ we see that $s_i \approx_{m-1,n} t_i$; and if $h = i$, then this follows from condition (2) in the definition of $\approx_{m,n}$. This concludes the proof of the claim.

Now by condition (1) in the definition of $\approx_{m,n}$, we see $s_1 w_1 \cdots s_\ell w_\ell \approx_{m,n+N'} t_1 w_1 \cdots t_\ell w_\ell$ for $N' = N - |w_0|$ and the above claim yields $s_j \approx_{m-1,n} t_j$ for all $1 \le j \le \ell$. □
The following proposition essentially shows how to pass from $\mathbf{W}_m$ to $\mathrm{FO}^2_m[<, \mathrm{suc}]$. The key to its proof is a string rewriting system which enables induction on the parameter $m$. Intuitively we consider the maximal quotient of a semigroup in $\mathbf{W}_m$ contained in $\mathbf{W}_{m-1}$. Since the latter is given by an omega-identity, this quotient can be described by a string rewriting system. A single rewriting step of this system corresponds to one application of the omega-identity for $\mathbf{W}_{m-1}$ and can be lifted to $\mathbf{W}_m$ relatively easily.

Proposition 13. For every $S \in \mathbf{W}_m$ with $m \ge 1$ there exists $n \ge 1$ such that $u \approx_{m,n} v$ implies $u = v$ in $S$ for all $u, v \in S^+$.

Proof. We perform an induction on $m$. By Knast's Theorem [9], if $L$ is recognized by a semigroup $S \in \mathbf{W}_1$, then the language $L$ is a Boolean combination of monomials $w_1 A^+ w_2 \cdots A^+ w_\ell$. Choosing $n \ge 1$ such that for all these monomials we have $|w_1 \cdots w_\ell| \le n$ yields the claim for $m = 1$.

Let $\omega > |S|$ be an integer such that $x^\omega$ is idempotent in $S$ for all $x \in S$. Consider the relation $\to$ on $S^+$ given by $s \to t$ if $s = t$ in $S$ or if $s = p u_{m-1} q$ and $t = p v_{m-1} q$ for some $p, q \in S^*$ and some $x_i, e, y_i, f, p_i, q_i, z, z' \in S^+$ such that $u_1 = (e^\omega z f^\omega x_1 e^\omega)^\omega z (f^\omega y_1 e^\omega z' f^\omega)^\omega$ and $v_1 = (e^\omega z f^\omega x_1 e^\omega)^\omega z' (f^\omega y_1 e^\omega z' f^\omega)^\omega$ and for $i \ge 2$ we have
$$u_i = (p_i u_{i-1} q_i x_i)^\omega p_i u_{i-1} q_i (y_i p_i u_{i-1} q_i)^\omega, \qquad v_i = (p_i u_{i-1} q_i x_i)^\omega p_i v_{i-1} q_i (y_i p_i u_{i-1} q_i)^\omega.$$
Let $\leftrightarrow^*$ be the reflexive, symmetric and transitive closure of $\to$. The relation $\leftrightarrow^*$ is a congruence of finite index (since $S^+/{\leftrightarrow^*}$ is a quotient of $S$). Moreover $x^\omega \leftrightarrow^* x^{2\omega}$ for all $x \in S^+$ and $S^+/{\leftrightarrow^*} \in \mathbf{W}_{m-1}$.
Claim 1. Let $u, s, t \in S^+$. If $s \to t$, then $u \mathrel{\mathcal{R}} us$ in $S$ if and only if $u \mathrel{\mathcal{R}} ut$ in $S$.

Assume without restriction that $s \ne t$ in $S$. We have $\mathrm{alph}_{|S|+1}(s) = \mathrm{alph}_{|S|+1}(t)$ by construction of $u_{m-1}$ and $v_{m-1}$. Note that by choice of $\omega$, in particular both words have the same prefix and the same suffix of length $|S| + 1$. Lemma 5 yields Claim 1.

Claim 2. Let $u, v, s, t \in S^+$ with $s \leftrightarrow^* t$. If $u \mathrel{\mathcal{R}} us$ and $v \mathrel{\mathcal{L}} tv$ in $S$, then $usv = utv$ in $S$.

Since $s \leftrightarrow^* t$, there exist $k \ge 0$ and $w_0, \ldots, w_k \in S^+$ such that $s = w_0$ and $w_k = t$ and such that either $w_{i-1} \to w_i$ or $w_i \to w_{i-1}$ for each $1 \le i \le k$. Claim 1 and its left-right dual yield that $u \mathrel{\mathcal{R}} u w_i$ and $v \mathrel{\mathcal{L}} w_i v$ in $S$ for all $i$. It therefore suffices to show the claim for $s \to t$. The claim is trivial if $s = t$ in $S$. Otherwise suppose $s = p_m u_{m-1} q_m$ and $t = p_m v_{m-1} q_m$. Since $u \mathrel{\mathcal{R}} us$ in $S$, there exists $x_m \in S$ such that $u = u s x_m$ in $S$. Since $v \mathrel{\mathcal{L}} tv$ in $S$, the left-right dual of Claim 1 implies $v \mathrel{\mathcal{L}} sv$ in $S$. Hence, there exists $y_m \in S$ such that $v = y_m s v$ in $S$. Now $u = u(p_m u_{m-1} q_m x_m)^\omega$ in $S$ and $v = (y_m p_m u_{m-1} q_m)^\omega v$ in $S$ and with $S \in \mathbf{W}_m$ we see
$$usv = u(p_m u_{m-1} q_m x_m)^\omega p_m u_{m-1} q_m (y_m p_m u_{m-1} q_m)^\omega v = u(p_m u_{m-1} q_m x_m)^\omega p_m v_{m-1} q_m (y_m p_m u_{m-1} q_m)^\omega v = utv \text{ in } S,$$
thus establishing Claim 2.
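Since $S^+/{\leftrightarrow^*} \in \mathbf{W}_{m-1}$, by induction there exists $n \ge 1$ such that $s \approx_{m-1,n} t$ implies $s \leftrightarrow^* t$ for all $s, t \in S^+$. Let $u, v \in S^+$ and suppose $u \approx_{m,n+N} v$ for $N = 2|S|^2$. Let $u = w_0 s_1 w_1 \cdots s_\ell w_\ell$ and $v = w_0 t_1 w_1 \cdots t_\ell w_\ell$ be the factorizations given by Lemma 12; in particular $s_i \approx_{m-1,n} t_i$ and $w_0 s_1 \cdots w_{i-1} \mathrel{\mathcal{R}} w_0 s_1 \cdots w_{i-1} s_i$ in $S$ and $w_i \cdots t_\ell w_\ell \mathrel{\mathcal{L}} t_i w_i \cdots t_\ell w_\ell$. By choice of $n$ we have $s_i \leftrightarrow^* t_i$ for all $i$ and repeated application of Claim 2 yields the following chain of identities valid in $S$:
$$v = w_0 t_1 w_1 t_2 \cdots t_{\ell-1} w_{\ell-1} t_\ell w_\ell = w_0 s_1 w_1 t_2 \cdots t_{\ell-1} w_{\ell-1} t_\ell w_\ell = \cdots = w_0 s_1 w_1 s_2 \cdots s_{\ell-1} w_{\ell-1} s_\ell w_\ell = u.$$
□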
Proof of Theorem 1. We shall first show "(3) ⇒ (1)". Afterwards we sketch the proof for the implications "(1) ⇒ (3)" and "(1) ⇒ (2)"; note that the implication "(2) ⇒ (1)" is trivial because the semantics of temporal logic formulae is given by two-variable first-order formulae with quantifier alternations originating in negations. We refer to the technical report [14] for full proofs.

"(3) ⇒ (1)": Suppose $S \in \mathbf{W}_m$ and the homomorphism $h : A^+ \to S$ recognizes $L \subseteq A^+$. Combining Proposition 13 and Lemma 10, we see that there exists an integer $n \ge 1$ such that $u \equiv_{m,n} v$ for $u, v \in S^+$ implies $u = v$ in $S$. Now if $u \equiv_{m,n} v$ for $u, v \in A^+$, then $h(u) = h(v)$. Thus, by specifying the $\equiv_{m,n}$-classes of $A^+$ which are contained in $L$, we obtain a formula $\varphi \in \mathrm{FO}^2_{m,n}[<, \mathrm{suc}]$ such that $L(\varphi) = h^{-1}(h(L)) = L$. Note that the syntactic semigroup of $L$ recognizes $L$.
Sketch of "(1) ⇒ (3)": The overall proof scheme is reminiscent of a recent proof of Straubing [27] which shows that $\mathrm{FO}^2_m[<]$-definable languages are recognized by monoids in the so-called weakly iterated two-sided semidirect product $((\mathbf{J} \mathbin{**} \mathbf{J}) \mathbin{**} \mathbf{J}) \mathbin{**} \cdots \mathbin{**} \mathbf{J}$ where $\mathbf{J}$ appears $n$ times. To avoid technical notation our formulation is not in terms of semidirect products, however. More concretely we show that formulae in $\mathrm{FO}^2_m$ up to a certain quantifier depth are unable to disprove the defining identity of $\mathbf{W}_m$; this yields a recognizing semigroup of $L$ and thus the claim since the syntactic semigroup is a divisor of any semigroup recognizing $L$. To this end, an extended alphabet is used to annotate every position by certain information about sequences of factors occurring in the prefix ending and the suffix starting at this position. This allows us to reduce the alternation depth of formulae by replacing so-called innermost quantified blocks by alphabet information. Induction then yields the claim. The most technical part of this step is to enable induction by showing that certain central factors of the annotated identities for $\mathbf{W}_m$ are obtained from the identities for $\mathbf{W}_{m-1}$ over the extended alphabet.
Sketch of “$(1) \Rightarrow (2)$”: This can be seen using the construction in [7, proof of Theorem 1]
by means of which Etessami, Vardi, and Wilke showed that $\mathrm{FO}^2$ coincides with $\mathrm{TL}[\mathsf{X}, \mathsf{F}, \mathsf{Y}, \mathsf{P}]$;
their statement does not involve the alternation depth explicitly, though. Roughly speaking,
for every formula in $\mathrm{FO}^2_m$ with one free variable an equivalent formula in $\mathrm{TL}_m[\mathsf{X}, \mathsf{F}, \mathsf{Y}, \mathsf{P}]$ is
constructed. The idea is to split up quantifiers with respect to the order type. For example,
the quantifier $\exists x\, \varphi$ is equivalent to the disjunction
\[
(\exists x < y - 1 \colon \varphi) \lor (\exists x = y - 1 \colon \varphi) \lor (\exists x = y \colon \varphi) \lor (\exists x = y + 1 \colon \varphi) \lor (\exists x > y + 1 \colon \varphi).
\]
In addition, we make explicit the label of the variable $x$ and use syntactic bookkeeping to keep
track of the label and the order type. Under the assumption that this information is correct,
induction yields a temporal logic formula for the subformula $\varphi$. This assumption can then
be ensured using the modalities $\mathsf{X}$, $\mathsf{F}$, $\mathsf{Y}$, and $\mathsf{P}$. For example, the subformula in the first
parentheses would be $\mathsf{Y}\mathsf{Y}\mathsf{P}\, \varphi'$ where $\varphi'$ is the formula for $\varphi$ with respect to the label and the
order type $x < y - 1$ which is obtained by induction. □
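The five-way case split on the order type can be checked mechanically. The sketch below confirms that for any two positions exactly one of the five order types applies, which is what makes the disjunction equivalent to the unrestricted quantifier. The names `ORDER_TYPES`, `order_type`, `exists_plain`, and `exists_split` are illustrative assumptions, and positions are modelled simply as integers.

```python
# Labels and tests for the five order types of x relative to y.
ORDER_TYPES = [
    ("x < y - 1", lambda x, y: x < y - 1),
    ("x = y - 1", lambda x, y: x == y - 1),
    ("x = y", lambda x, y: x == y),
    ("x = y + 1", lambda x, y: x == y + 1),
    ("x > y + 1", lambda x, y: x > y + 1),
]


def order_type(x, y):
    """Return the label of the unique order type of position x relative to y."""
    matches = [label for label, holds in ORDER_TYPES if holds(x, y)]
    assert len(matches) == 1  # the five cases are exhaustive and disjoint
    return matches[0]


def exists_plain(positions, phi):
    """Unrestricted existential quantifier over the given positions."""
    return any(phi(x) for x in positions)


def exists_split(positions, y, phi):
    """The same quantifier written as a disjunction over the order types."""
    return any(
        any(phi(x) for x in positions if holds(x, y))
        for _, holds in ORDER_TYPES
    )
```

The sketch covers only the case distinction itself. The bookkeeping of labels and the translation of each disjunct into temporal modalities are not modelled.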
Conclusion
We showed that quantifier alternation for the logic $\mathrm{FO}^2[<, \mathrm{suc}]$ is decidable by giving a single
identity of omega-terms for each level $\mathrm{FO}^2_m[<, \mathrm{suc}]$. The key ingredient in our proof is a
rewriting technique which allows us to apply induction on $m$.
There is an algebraic construction $\mathbf{V} \mapsto \mathbf{V} * \mathbf{D}$ in terms of wreath products, see e.g. [26].
For most logical fragments $\mathcal{F}$, whenever $\mathcal{F}$ corresponds to a variety of finite monoids $\mathbf{V}$,
then the fragment $\mathcal{F}'$ obtained from $\mathcal{F}$ by adding successor predicates corresponds to the
semigroup variety $\mathbf{V} * \mathbf{D}$. This is also the case for $\mathrm{FO}^2_m[<]$ and $\mathrm{FO}^2_m[<, \mathrm{suc}]$. Therefore,
if $\mathbf{V}_m$ is the variety of finite monoids corresponding to $\mathrm{FO}^2_m[<]$, then our result implies
$\mathbf{V}_m * \mathbf{D} = \mathbf{W}_m$.
In general, decidability of $\mathbf{V}$ is not preserved by the operation $\mathbf{V} \mapsto \mathbf{V} * \mathbf{D}$, but a
particularly nice situation occurs if $\mathbf{V} * \mathbf{D} = \mathbf{LV}$. Here, a semigroup $S$ is in $\mathbf{LV}$ if all local
monoids of $S$ are in $\mathbf{V}$. For example the variety $\mathbf{DA}$ satisfies $\mathbf{DA} * \mathbf{D} = \mathbf{LDA}$, see [1, 4]. For
$\mathbf{W}_1$ however, Knast has given an example showing $\mathbf{V}_1 * \mathbf{D} \neq \mathbf{LV}_1$. In view of this example,
we conjecture that $\mathbf{V}_m * \mathbf{D} \neq \mathbf{LV}_m$ for all $m \geq 1$.
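For a finite semigroup given by a multiplication table, the local monoids can be enumerated directly. The sketch below assumes the standard definition, in which the local monoids of $S$ are the subsemigroups $eSe$ for the idempotents $e$ of $S$, and it assumes that elements are numbered 0, ..., k-1 with `table[a][b]` holding the product $ab$. The function names and the example semigroups are illustrative and do not come from the paper.

```python
def idempotents(table):
    """Return all elements e with ee = e."""
    return [e for e in range(len(table)) if table[e][e] == e]


def local_monoid(table, e):
    """Return the set eSe for an idempotent e; its identity element is e."""
    return {table[table[e][s]][e] for s in range(len(table))}


def local_monoids(table):
    """Map every idempotent e to its local monoid eSe."""
    return {e: local_monoid(table, e) for e in idempotents(table)}


# Example semigroups (illustrative only).
LEFT_ZERO = [[0, 0], [1, 1]]                              # ab = a
AND = [[0, 0], [0, 1]]                                    # {0, 1} under min
Z3 = [[(i + j) % 3 for j in range(3)] for i in range(3)]  # addition mod 3
```

Deciding membership in $\mathbf{LV}$ additionally requires a membership test for $\mathbf{V}$ on each local monoid, which is not shown here.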
Acknowledgments. We thank the anonymous referees for their suggestions which helped
to improve the presentation of the paper.
References
[1] J. Almeida. A syntactical proof of locality of DA. Int. J. Algebra Comput., 6(2):165-177, 1996.
[2] J. R. Büchi. Weak second-order arithmetic and finite automata. Z. Math. Logik Grundlagen Math., 6:66-92, 1960.
[3] R. S. Cohen and J. A. Brzozowski. Dot-depth of star-free events. J. Comput. Syst. Sci., 5(1):1-16, 1971.
[4] A. Costa and A. P. Escada. Some operators that preserve the locality of a pseudovariety of semigroups. Technical Report 11-37 DMUC, University of Coimbra, 2011.
[5] V. Diekert, P. Gastin, and M. Kufleitner. A survey on small fragments of first-order logic over finite words. Int. J. Found. Comput. Sci., 19(3):513-548, 2008. Special issue DLT 2007.
[6] Amer. Math. Soc., 98:21-51, 1961.
[7] K. Etessami, M. Y. Vardi, and Th. Wilke. First-order logic with two variables and unary temporal logic. Inf. Comput., 179(2):279-295, 2002.
[8] J. A. W. Kamp. Tense Logic and the Theory of Linear Order. PhD thesis, University of California, 1968.
[9] R. Knast. A semigroup characterization of dot-depth one languages. RAIRO, Inf. Théor., 17(4):321-330, 1983.
[10] A. Krebs and H. Straubing. An effective characterization of the alternation hierarchy in two-variable logic. In FSTTCS 2012, Proceedings, volume 18 of LIPIcs, pages 86-98. Dagstuhl Publishing, 2012.
[11] M. Kufleitner and A. Lauser. Around dot-depth one. Int. J. Found. Comput. Sci., 23(6):1323-1339, 2012.
[12] In MFCS 2012, Proceedings, volume 7464 of LNCS, pages 603-614. Springer, 2012.
[13] M. Kufleitner and A. Lauser. Lattices of logical fragments over words. In ICALP 2012, Proceedings Part II, volume 7392 of LNCS, pages 275-286. Springer, 2012.
[14] M. Kufleitner and A. Lauser. Quantifier alternation in two-variable first-order logic with successor is decidable. CoRR, arXiv:1212.6500 [cs.LO], 2012.
[15] M. Kufleitner and P. Weil. On the lattice of sub-pseudovarieties of DA. Semigroup Forum, 81:243-254, 2010.
[16] M. Kufleitner and P. Weil. The FO$^2$ alternation hierarchy is decidable. In CSL 2012, Proceedings, volume 16 of LIPIcs, pages 426-439. Dagstuhl Publishing, 2012.
[17] M. Kufleitner and P. Weil. On logical hierarchies within FO$^2$-definable languages. Log. Methods Comput. Sci., 8:1-30, 2012.
[18] R. McNaughton and S. Papert. Counter-Free Automata. The MIT Press, 1971.
[19] J.-É. Pin. Varieties of Formal Languages. North Oxford Academic, 1986.
[20] Syst., 30(4):383-422, 1997.
[21] M. P. Schützenberger. On finite monoids having only trivial subgroups. Inf. Control, 8:190-194, 1965.
[22] M. P. Schützenberger. Sur le produit de concaténation non ambigu. Semigroup Forum, 13:47-75, 1976.
[23] Th. Schwentick, D. Thérien, and H. Vollmer. Partially-ordered two-way automata: A new characterization of DA. In DLT 2001, Proceedings, volume 2295 of LNCS, pages 239-250. Springer, 2002.
[24] I. Simon. Piecewise testable events. In Autom. Theor. Form. Lang., 2nd GI Conf., volume 33 of LNCS, pages 214-222. Springer, 1975.
[25] Comput. Sci., 13:137-150, 1981.
[26] H. Straubing. Finite semigroup varieties of the form V∗D. J. Pure Appl. Algebra, 36(1):53-94, 1985.
[27] H. Straubing. Algebraic characterization of the alternation hierarchy in FO$^2$[<] on finite words. In CSL 2011, Proceedings, volume 12 of LIPIcs, pages 525-537. Dagstuhl Publishing, 2011.
[28] P. Tesson and D. Thérien. Diamonds are forever: The variety DA. In Semigroups, Algorithms, Automata and Languages 2001, Proceedings, pages 475-500. World Scientific, 2002.
[29] D. Thérien. Classification of finite monoids: The language approach. Theor. Comput. Sci., 14(2):195-208, 1981.
[30] D. Thérien and Th. Wilke. Over words, two variables are as powerful as one quantifier alternation. In STOC 1998, Proceedings, pages 234-240. ACM Press, 1998.
[31] W. Thomas. Classifying regular events in symbolic logic. J. Comput. Syst. Sci., 25:360-376, 1982.
[32] W. Thomas. Languages, automata and logic. In Handbook of Formal Languages, volume 3, pages 389-455. Springer, 1997.
[33] Akad. Nauk SSSR, 140:326-329, 1961.
[34] P. Trotter and P. Weil. The lattice of pseudovarieties of idempotent semigroups and a non-regular analogue. Algebra Univers., 37(4):491-526, 1997.
[35] Ph. Weis and N. Immerman. Structure theorem and strict alternation hierarchy for FO$^2$ on words. Log. Methods Comput. Sci., 5:1-23, 2009.