Separation and the Successor Relation
STACS
Thomas Place, Marc Zeitoun
LaBRI, Bordeaux University, France
We investigate two problems for a class C of regular word languages. The C-membership problem asks for an algorithm to decide whether an input language belongs to C. The C-separation problem asks for an algorithm that, given as input two regular languages, decides whether there exists a third language in C containing the first language, while being disjoint from the second. These problems are considered as means to obtain a deep understanding of the class C. It is usual for such classes to be defined by logical formalisms. Logics are often built on top of each other, by adding new predicates. A natural construction is to enrich a logic with the successor relation. In this paper, we obtain new and simple proofs of two transfer results: we show that for suitable logically defined classes, the membership (resp. separation) problem for a class enriched with the successor relation reduces to the same problem for the original class. Our reductions work both for languages of finite words and infinite words. The proofs are mostly self-contained, and only require a basic background on regular languages. This paper therefore gives simple proofs of results that were considered difficult, such as the decidability of the membership problem for levels 1, 3/2, 2 and 5/2 of the dot-depth hierarchy.

Keywords and phrases separation problem; regular word languages; logics; decidable characterizations; semidirect product

1998 ACM Subject Classification F.4.3 Formal Languages

1 Introduction
A central problem in the theory of formal languages is to characterize and understand the expressive power of high-level specification formalisms. Monadic second-order logic (MSO) is such a formalism, which is both expressive and robust. For several classes of structures, such as words or trees, it has the same expressive power as finite automata and defines the class of regular languages. In this paper, we investigate fragments of MSO over words. In this context, understanding the expressive power of a fragment is associated with two decision problems: the membership problem and the separation problem.
For a fixed logical fragment F, the F-membership problem asks for a decision procedure that tests whether some input regular language can be expressed by a formula from F. To obtain such an algorithm, one has to consider and understand all properties that can be expressed within F, which requires a deep understanding of the fragment F. On the other hand, the F-separation problem is more general. It asks for a decision procedure that tests whether, given two input regular languages, there exists a third one in F containing the first language while being disjoint from the second one. Since regular languages are closed under complement, membership reduces to separation: a language is in F if and only if it can be separated from its complement. Usually, the
separation problem is more difficult than the membership problem but also more rewarding
with respect to the knowledge gained on the investigated fragment F.
These two problems have been considered and solved for many natural fragments of
monadic second-order logic. Among these, the most prominent one is first-order logic, FO(<),
equipped with a predicate < for the linear ordering. The solution to the membership problem,
known as the McNaughton-Papert-Schützenberger Theorem [20, 10], has been revisited until
recently [5]. The theorem states that a regular language is definable in FO(<) if and only if
its syntactic semigroup is aperiodic. The syntactic semigroup is a finite algebraic object that
can be computed from any regular language. Since aperiodicity can be defined as an equation
that needs to be satisfied by all of its elements, this yields decidability of FO(<)-definability.
This result now serves as a template, which is commonly followed in this line of research.
The separation problem has also been successfully solved for first-order logic [7]. Actually,
the problem was first addressed in a purely algebraic framework, and was later identified as
equivalent to our separation problem [2]. As for membership, this problem is still revisited
today and a new self-contained and combinatorial proof was obtained in [18].
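The aperiodicity test mentioned above can be illustrated with a short sketch. It is not taken from the paper: it assumes that the finite semigroup is given as a list of elements together with a Python function `mul` (both hypothetical names), and it checks the equation $s^\omega = s^{\omega+1}$ for every element s.

```python
def idempotent_power(s, mul):
    """Return the unique idempotent among the powers s, s^2, s^3, ... (written s^omega)."""
    p = s
    while mul(p, p) != p:   # stop at the first power that is idempotent
        p = mul(p, s)
    return p


def is_aperiodic(elements, mul):
    """Check the equation s^omega * s == s^omega for every element s."""
    for s in elements:
        e = idempotent_power(s, mul)
        if mul(e, s) != e:
            return False
    return True


# Two illustrative semigroups (chosen for this sketch, not from the text):
# {0, 1} under multiplication is aperiodic, the cyclic group Z/2Z is not.
print(is_aperiodic([0, 1], lambda a, b: a * b))        # True
print(is_aperiodic([0, 1], lambda a, b: (a + b) % 2))  # False
```

In practice the input would be the syntactic semigroup of the language, whose computation is outside the scope of this sketch.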
Motivation. We are interested in natural fragments of FO(<) obtained by restricting either
the number of variables or the number of quantifier alternations allowed in formulas. Such
restrictions in general give rise to several variants of the same fragment. Indeed, in most
cases, the drop in expressive power forbids the use of natural relations that could be defined
from the linear order in FO(<). The main example considered in this paper is +1, the
successor relation, together with predicates min and max for the first and last positions
in a word. This means that one can define two distinct variants of the same fragment
depending on whether we decide to explicitly add these predicates in the signature or not.
An example is the fragment $\Sigma_n$, which consists of first-order formulas whose prenex normal
form has at most $n - 1$ quantifier alternations and starts with an existential block. Since
defining +1 requires an additional quantifier alternation, $\Sigma_n(<, +1, \min, \max)$ has indeed
stronger expressiveness than $\Sigma_n(<)$. The motivation of this paper is to obtain decidability
results for such enriched fragments.
State of the Art. Even when the weak fragment is known to have decidable membership,
proving that the enriched one has the same property can be nontrivial. Examples include the
membership proofs of $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ (Boolean combinations of $\Sigma_1(<, +1, \min, \max)$
formulas) and $\Sigma_2(<, +1)$, which require difficult and intricate combinatorial arguments [8, 6, 9]
or a wealth of algebraic machinery [12, 13]. Another issue is that most proofs directly deal
with the enriched fragment. Given the jungle of such logical fragments, it is desirable to
avoid such an approach, treating each variant of the same fragment independently. Instead,
a satisfying approach is to first obtain a solution of the membership and separation problems
for the less expressive variant and then to lift it to other variants via a generic transfer result.
This approach was first investigated by Straubing for the membership problem [22]
in an algebraic framework, and later adapted to treat classes not closed under
complement [13]. Transferring the logical problem to this algebraic framework requires
preliminary steps, still specific to the investigated class, to prove that:
1. A language is definable in the fragment if and only if its syntactic semigroup belongs to a
specific algebraic variety V (e.g., the variety of aperiodic monoids for FO(<)), and
2. Membership to V is decidable.
Next, though this is not immediate, for most fragments of FO(<), it has been proved that
3. When the weaker variant corresponds to a variety V, the variant with successor corresponds
to the variety $V * D$, built generically from V.
Hence, Straubing's approach was to prove that
4. the operator $V \mapsto V * D$ preserves decidability.
Unfortunately, this is not true in general [3]. Actually, while decidability is preserved for all
known logical fragments, there is no generic result that captures them all. In particular, for
the less expressive fragments, one has to use completely ad hoc proofs. In the separation
setting, things behave well: it has been shown that decidability of separation is preserved by
the operation $V \mapsto V * D$ [21]. While interesting when already starting from algebra, this
approach has several downsides:
- Dealing with algebra hides the logical intuitions, while our primary goal is to understand
  the expressiveness of logics.
- Going from logic to algebra requires becoming acquainted with new notions and vocabulary,
  as well as involved theoretical tools. Proofs are also often nontrivial and require a deep
  understanding of complex objects, which may be scattered in the bibliography.
- Despite step 4, which is generic to some extent, arguments specific to the investigated
  class are pushed to steps 1 to 3, and they are often nontrivial.
Contributions. We give a new proof that decidability of separation can be transferred from
a weak to an enriched fragment. We present the result in two different forms.
The first one is non-algebraic: we work directly with the logical fragments, without using
varieties. The transfer result is generic, and so is most of its proof: the only specific argument
is an Ehrenfeucht-Fraïssé game that can be adapted to all natural fragments with minimal
difficulty (we prove it in the long version of this paper for all considered fragments, see [19]).
The benefits of this new proof are that:
1. It is self-contained and much simpler than previous ones. It only relies on two basic
well-known notions: recognizability by semigroups and Ehrenfeucht-Fraïssé games.
2. It works with classes that are not closed under complement, contrary to [21]. This allows
us to capture the $\Sigma$ and $\Pi$ levels in the quantifier alternation hierarchy of first-order logic.
3. Under an additional hypothesis on the logical fragment, which is easy to check and is met
by most fragments we investigate, the decidability result for the separation problem also
extends to the membership problem.
4. The proof adapts smoothly to infinite words using the notion of $\omega$-semigroups, as shown
in the long version of this paper [19].
The second form of our result is algebraic and generic. We prove that $V \mapsto V * D$ preserves
the decidability of separation for varieties, hence giving an elementary proof of a result of [21].
Even in this algebraic form, we completely bypass involved constructions or notions, such as
pointlike sets for categories developed in [21], thus making the proof accessible.
As corollaries, since $\mathcal{B}\Sigma_1(<)$ and $\Sigma_2(<)$ both enjoy decidable separation [4, 16, 17], we
obtain that this is also the case for the fragments $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$,
known as levels 1 and 3/2 of the dot-depth hierarchy. These new results strengthen the
previous ones [8, 6] that showed decidability of membership and were considered difficult.
We actually obtain that separation for $\Sigma_n(<, +1, \min, \max)$ reduces to separation for $\Sigma_n(<)$.
Since we also transfer decidability of the membership problem, and since the fragments $\mathcal{B}\Sigma_2(<)$
of Boolean combinations of $\Sigma_2(<)$ formulas and $\Sigma_3(<)$ have decidable membership [17], we
deduce that the same holds for $\mathcal{B}\Sigma_2(<, +1)$ and $\Sigma_3(<, +1)$, known as levels 2 and 5/2 of the
dot-depth hierarchy.
Organization of the Paper. In Section 2, we set up the notation and we present the
separation problem and the logics we deal with. Section 3 is devoted to our main tool:
languages of well-formed words. In Section 4, we use it to prove our transfer result for all
fragments from the logical perspective, and in Section 5, we show that decidability of the
separation problem for the variety V entails the same for $V * D$.
2 Preliminaries
In this section, we provide preliminary definitions on regular languages defined by logical
fragments and on separation. We also present our main contribution.
Words, Languages. We fix a finite alphabet A. Let $A^+$ be the set of all nonempty finite
words and let $A^*$ be the set of all finite words over A. If u, v are words, we denote by
$u \cdot v$ or by uv the word obtained by concatenating u and v. For convenience, we only
consider, without loss of generality, languages that do not contain the empty word. That is,
a language is a subset of $A^+$. We work with regular languages, that is, languages definable
by finite automata.
Separation. Given three languages K, L, L', we say that K separates L from L' if
$L \subseteq K$ and $K \cap L' = \emptyset$.
If C is a class of languages, we say that L is C-separable from L' if there exists $K \in C$ that
separates L from L'. Note that if C is closed under complement, L is C-separable from L'
if and only if L' is C-separable from L. However, this is not true for a class C not closed
under complement, such as the classes $\Sigma_n(<)$ of the quantifier alternation hierarchy, which
we shall consider.
Given a class C, the C-separation problem asks for an algorithm which, given as input two regular languages L, L', decides whether L is C-separable from L'. The C-membership problem, which asks whether an input regular language belongs to C, reduces to the C-separation problem, as a regular language belongs to C iff it is C-separable from its complement.
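The definition of a separator can be illustrated by a small sketch. It is only a bounded sanity check, not a decision procedure: it assumes that languages are given as Python membership predicates (an encoding chosen here for illustration) and tests the two conditions on all nonempty words up to a given length.

```python
from itertools import product


def words_up_to(alphabet, max_len):
    """Yield all nonempty words over `alphabet` of length at most `max_len`."""
    for n in range(1, max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def separates_up_to(K, L, L_prime, alphabet, max_len):
    """Check that L is included in K and K is disjoint from L' on short words."""
    for w in words_up_to(alphabet, max_len):
        if L(w) and not K(w):
            return False          # a word of L lies outside K
        if K(w) and L_prime(w):
            return False          # K meets L'
    return True


# Hypothetical example languages over {a, b}, given as membership predicates.
L = lambda w: w.startswith("aa")
L_prime = lambda w: w.startswith("b")
K = lambda w: w.startswith("a")

print(separates_up_to(K, L, L_prime, "ab", 6))   # True
print(separates_up_to(K, L_prime, L, "ab", 6))   # False
```

The second call shows that the definition is not symmetric in L and L' for a fixed candidate K.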
Logics. We investigate several fragments of first-order logic on finite words. We view a finite
word as a logical structure made of a sequence of positions labeled over A. We work with
first-order logic FO(<) using a unary predicate $P_a$ for each $a \in A$, which selects positions
labeled with an a, as well as binary predicates '=' for equality and '<' for the linear order.
Such a formula defines the regular language of all words that satisfy it. We will freely use
the name of a logical fragment of FO(<) to denote the class of languages definable in this
fragment. Observe that FO(<) is powerful enough to express the following logical relations:
First position, $\min(x)$: $\forall y\ \neg(y < x)$.
Last position, $\max(x)$: $\forall y\ \neg(x < y)$.
Successor, $y = x + 1$: $x < y \wedge \neg(\exists z\ x < z \wedge z < y)$.
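As a quick illustration, the three relations can be evaluated by brute force over the positions of a word. This is only a sketch: positions are numbered from 0 (a convention chosen here), and each function mirrors one formula, with quantifiers ranging over all positions.

```python
def is_min(w, x):
    """min(x): for all y, not (y < x)."""
    return all(not (y < x) for y in range(len(w)))


def is_max(w, x):
    """max(x): for all y, not (x < y)."""
    return all(not (x < y) for y in range(len(w)))


def is_successor(w, x, y):
    """y = x + 1: x < y and there is no z with x < z and z < y."""
    return x < y and not any(x < z and z < y for z in range(len(w)))


w = "abca"
positions = range(len(w))
print([x for x in positions if is_min(w, x)])   # [0]
print([x for x in positions if is_max(w, x)])   # [3]
print([(x, y) for x in positions for y in positions if is_successor(w, x, y)])
# [(0, 1), (1, 2), (2, 3)]
```

Note that the successor formula uses a third variable z, which is the reason it is unavailable in two-variable fragments.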
However, for most fragments of FO(<) this is not the case. For example, in the
two-variable restriction $\mathrm{FO}^2(<)$ of FO(<), it is not possible to express successor, as it requires
quantifying over a third variable. For these fragments F, adding the predicates min, max
and +1 yields a strictly more powerful logic $F^+$. Our goal is to prove a transfer result
for such fragments: given a fragment, if the separation problem is decidable for the weak
variant F, then it is decidable as well for the strong variant $F^+$ obtained by enriching F with
the above relations. The technique is generic, meaning that it is not bound to a particular
logic. In particular, our transfer result applies to the following well-known logical fragments:
- FO(=), the restriction of FO(<) in which the linear order cannot be used, and only
  equality between two positions can be tested. The enriched fragment FO(=, +1) (min and
  max can be eliminated from the formulas) defines locally threshold testable languages [24].
- All levels in the quantifier alternation hierarchy of first-order logic. A first-order formula
  is $\Sigma_n(<)$ (resp. $\Pi_n(<)$) if its prenex normal form contains at most $n - 1$ quantifier
  alternations and starts with an $\exists$ (resp. a $\forall$) quantifier block. Finally, a $\mathcal{B}\Sigma_n(<)$ formula
  is a Boolean combination of $\Sigma_n(<)$ and $\Pi_n(<)$ formulas.
  Since for all fragments above $\Sigma_2(<)$, a formula involving min and max can be expressed
  without these predicates in the same logic, we shall denote the enriched fragments by
  $\Sigma_1(<, +1, \min, \max)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$, and then by $\Sigma_2(<, +1)$, $\mathcal{B}\Sigma_2(<, +1)$, . . .
- $\mathrm{FO}^2(<)$, the restriction of FO(<) using only two reusable variables. The corresponding
  enriched fragment is $\mathrm{FO}^2(<, +1)$, since min and max can again be eliminated from the
  formulas.
Theorem 1. Let F and $F^+$ be respectively the weak and strong variants of one of the
logical fragments in Table 1. Then $F^+$-separability can be effectively reduced to F-separability.
As explained in the introduction, we prove this theorem in two flavors: the first one,
Theorem 4, is purely logical. It is self-contained and elementary, but is not entirely generic.
The other one, Theorem 15, is purely algebraic and generic: the transfer works from an
algebraic class (for which only fairly general restrictions are assumed) to an enriched one.
Yet, it relies on already established results to be instantiated on the fragments of Table 1.
All these logical fragments have a rich history and have been extensively studied in
the literature. In particular, the separation problem is known to be decidable for the
following fragments: FO(=), $\mathrm{FO}^2(<)$, $\Sigma_1(<)$, $\mathcal{B}\Sigma_1(<)$, $\Sigma_2(<)$ [4, 16, 17]. This means
that, from our results, we obtain decidability of separation for FO(=, +1), $\mathrm{FO}^2(<, +1)$,
$\Sigma_1(<, +1, \min, \max)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$. Note that for FO(=, +1),
$\mathrm{FO}^2(<, +1)$ and $\mathcal{B}\Sigma_1(<, +1, \min, \max)$, the results could already be obtained as corollaries
of algebraic theorems of Steinberg [21] and Almeida [2]. As explained in the introduction,
an issue with this approach is that the proof of Steinberg's result relies on deep algebraic
arguments and is not tailored to separation (the connection with separation is made by
Almeida [2]). For $\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$, the result is new, as Steinberg's result
does not apply to classes of languages that are not closed under complement.
3 Tools: Semigroups and Well-Formed Words
In this section, we define the main tools used in the paper. First, we recall the well-known
semigroup-based definition of regular languages: a language is regular if and only if it can
be recognized by a finite semigroup. Our second tool, well-formed words, is specific to our
problem and plays a key role in our transfer result.
3.1 Semigroups and Monoids
We work with the algebraic representation of regular languages in terms of semigroups. A
semigroup is a set S equipped with an associative product, written $s \cdot t$ or st. A monoid is a
semigroup S having a neutral element $1_S$, i.e., such that $s \cdot 1_S = 1_S \cdot s = s$ for all $s \in S$. If S
is a semigroup, then $S^1$ denotes the monoid $S \cup \{1_S\}$ where $1_S \notin S$ is a new element, acting
as neutral element. Note that we add such a new identity even if S is already a monoid.
An element $e \in S$ is idempotent if $e \cdot e = e$. We denote by E(S) the set of idempotents
of S. Given a finite semigroup S, it is folklore and easy to see that there is an integer $\omega(S)$
(denoted by $\omega$ when S is understood) such that for all s of S, $s^\omega$ is idempotent: $s^\omega = s^\omega s^\omega$.
Note that $A^+$ and $A^*$ equipped with concatenation are respectively a semigroup and a
monoid called the free semigroup over A and the free monoid over A. Let $L \subseteq A^+$ be a
language and S be a semigroup (resp. a monoid). We say that L is recognized by S if there
exist a morphism $\alpha : A^+ \to S$ (resp. $\alpha : A^* \to S$) and a set $F \subseteq S$ such that $L = \alpha^{-1}(F)$.
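The set E(S) and a suitable exponent $\omega$ are easy to compute for a small semigroup. The sketch below is illustrative only: the semigroup is assumed to be given as a list of elements and a function `mul` (hypothetical names), and `omega` returns the least exponent that works, which is one valid choice among many.

```python
def power(s, n, mul):
    """Return s^n for n >= 1."""
    result = s
    for _ in range(n - 1):
        result = mul(result, s)
    return result


def idempotents(elements, mul):
    """Return E(S), the set of elements e with e * e == e."""
    return {e for e in elements if mul(e, e) == e}


def omega(elements, mul):
    """Return the least n >= 1 such that s^n is idempotent for every s."""
    n = 1
    while True:
        powers = [power(s, n, mul) for s in elements]
        if all(mul(p, p) == p for p in powers):
            return n
        n += 1


# Illustrative semigroup: Z/3Z under addition (an assumption of this sketch).
Z3 = [0, 1, 2]
add3 = lambda a, b: (a + b) % 3
print(idempotents(Z3, add3))   # {0}
print(omega(Z3, add3))         # 3
```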
Semigroups and Separation. The separation problem takes as input two regular languages
L, L'. It is convenient to work with a single object recognizing both of them, rather than
having to deal with two. Let S, S' be semigroups recognizing L, L' together with the
associated morphisms $\alpha, \alpha'$, respectively. Clearly, L and L' are both recognized by $S \times S'$
with the morphism $\alpha \times \alpha' : A^+ \to S \times S'$ mapping w to $(\alpha(w), \alpha'(w))$. From now on, we
work with such a single semigroup recognizing both languages. Replacing $S \times S'$ with its
image under $\alpha \times \alpha'$, one can also assume that this morphism is surjective. To sum up, we
assume from now on, w.l.o.g., that L and L' are recognized by a single surjective morphism.
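The passage to a single surjective morphism can be sketched as follows. The code is an illustration under stated assumptions: each morphism is described by the images of the letters (dictionaries named `letters_1` and `letters_2` here, which are hypothetical names), and the image of the product morphism is computed as the subsemigroup generated by the pairs of letter images.

```python
def product_image(letters_1, mul_1, letters_2, mul_2):
    """Image of the product morphism, generated from the letter images.

    letters_1 and letters_2 map each letter of A to its image in S and S'.
    """
    def mul(p, q):
        return (mul_1(p[0], q[0]), mul_2(p[1], q[1]))

    generators = {(letters_1[a], letters_2[a]) for a in letters_1}
    image = set(generators)
    frontier = set(generators)
    while frontier:
        # Multiply newly found elements by letter images on the right.
        new = {mul(s, g) for s in frontier for g in generators} - image
        image |= new
        frontier = new
    return image


add2 = lambda x, y: (x + y) % 2
count_a = {"a": 1, "b": 0}     # parity of the number of a's
length = {"a": 1, "b": 1}      # parity of the length
print(sorted(product_image(count_a, add2, length, add2)))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
print(sorted(product_image(count_a, add2, count_a, add2)))
# [(0, 0), (1, 1)]
```

The second call shows a case where the image is a proper subsemigroup of the product, so restricting to it is what makes the morphism surjective.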
3.2 Well-Formed Words
In this section, we define our main tool for this paper. Assume that F is the weak variant of
one of the logical fragments of Table 1 and let $F^+$ be the corresponding enriched variant.
To any semigroup morphism $\alpha : A^+ \to S$ into a finite semigroup S, we associate a new
alphabet $\mathbb{A}_\alpha$ called the alphabet of well-formed words. The main intuition behind this notion
is that the $F^+$-separation problem for any two regular languages recognized by $\alpha$ can be
reduced to the F-separation problem for two regular languages over $\mathbb{A}_\alpha$.
The alphabet $\mathbb{A}_\alpha$, called alphabet of well-formed words of $\alpha$, is defined from $\alpha : A^+ \to S$ by:
$\mathbb{A}_\alpha = (E(S) \times S \times E(S)) \cup (S \times E(S)) \cup (E(S) \times S) \cup S$.
We will not be interested in all words of $\mathbb{A}_\alpha^+$, but only in those that are well-formed. A word
$\mathbf{w} \in \mathbb{A}_\alpha^+$ is said to be well-formed if one of the following two properties holds:
- $\mathbf{w}$ is a single letter $s \in S$,
- $\mathbf{w}$ has length at least 2 and is of the form
  $(s_0, f_0) \cdot (e_1, s_1, f_1) \cdots (e_n, s_n, f_n) \cdot (e_{n+1}, s_{n+1}) \in (S \times E(S)) \cdot (E(S) \times S \times E(S))^* \cdot (E(S) \times S)$
  with $f_i = e_{i+1}$ for all $0 \le i \le n$.
Fact 2. The set of well-formed words of $\mathbb{A}_\alpha^+$ is a regular language.
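Well-formedness is a purely local condition, which the following sketch makes explicit. The encoding of letters as tagged tuples is an assumption made here so that letters of $S \times E(S)$ and $E(S) \times S$ can be told apart; it is not notation from the paper.

```python
def is_well_formed(word, idempotents):
    """Check well-formedness of a word given as a list of tagged letters.

    Letters are encoded (an encoding chosen for this sketch) as
      ("single", s)      for a letter of S,
      ("first", s, f)    for a letter of S x E(S),
      ("mid", e, s, f)   for a letter of E(S) x S x E(S),
      ("last", e, s)     for a letter of E(S) x S.
    """
    if len(word) == 1:
        return word[0][0] == "single"
    if len(word) < 2:
        return False
    tags = [letter[0] for letter in word]
    if tags != ["first"] + ["mid"] * (len(word) - 2) + ["last"]:
        return False
    # The right idempotent of each letter must equal the left one of the next.
    for left, right in zip(word, word[1:]):
        f, e = left[-1], right[1]
        if f not in idempotents or f != e:
            return False
    return True


E = {0, 1}   # idempotents of {0, 1} under multiplication (illustrative)
print(is_well_formed([("single", 1)], E))                                     # True
print(is_well_formed([("first", 1, 0), ("mid", 0, 1, 1), ("last", 1, 0)], E))  # True
print(is_well_formed([("first", 1, 0), ("last", 1, 0)], E))                   # False
```

Since only the shape of the word and pairs of adjacent letters are inspected, a finite automaton can perform the same check, which is in line with Fact 2.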
We now define a morphism $\beta : \mathbb{A}_\alpha^+ \to S$ as follows. If $s \in S$, we set $\beta(s) = s$; if
$(e, s) \in E(S) \times S$, we set $\beta((e, s)) = es$; if $(s, e) \in S \times E(S)$, we set $\beta((s, e)) = se$; and if
$(e, s, f) \in E(S) \times S \times E(S)$, we set $\beta((e, s, f)) = esf$.
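The morphism just defined simply multiplies out all components of all letters, from left to right. A minimal sketch, assuming letters are encoded as tuples whose first entry is a tag (an encoding chosen for illustration):

```python
from functools import reduce


def beta(word, mul):
    """Evaluate a word of tagged letters in S by multiplying out all components.

    Each letter is a tuple whose first entry is a tag and whose remaining
    entries are elements of S, for instance ("single", s), ("first", s, e),
    ("mid", e, s, f) or ("last", e, s).
    """
    letter_values = [reduce(mul, letter[1:]) for letter in word]
    return reduce(mul, letter_values)


mul = lambda x, y: x * y   # illustrative semigroup {0, 1} under multiplication
u = [("first", 1, 1), ("mid", 1, 0, 0), ("last", 0, 1)]
print(beta(u, mul))               # 0
print(beta([("single", 1)], mul))  # 1
```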
Associated Language of Well-formed Words. To any language $L \subseteq A^+$ that is recognized
by $\alpha$, one associates a language of well-formed words $\mathbb{L} \subseteq \mathbb{A}_\alpha^+$:
$\mathbb{L} = \{\mathbf{w} \in \mathbb{A}_\alpha^+ \mid \mathbf{w} \text{ is well-formed and } \beta(\mathbf{w}) \in \alpha(L)\}$.
By definition, the language $\mathbb{L} \subseteq \mathbb{A}_\alpha^+$ is the intersection of the language of well-formed words
with $\beta^{-1}(\alpha(L))$. Therefore, it is immediate by Fact 2 that it is regular, more precisely:
Fact 3. Let $L \subseteq A^+$ be recognized by $\alpha$. Then, the associated language of well-formed
words $\mathbb{L} \subseteq \mathbb{A}_\alpha^+$ is a regular language that one can effectively compute from a recognizer of L.
4 Logical Approach
In this section, we prove Theorem 1 from a logical perspective. We begin by presenting
our separation theorem, which will entail the membership theorem as a simple consequence.
Theorem 4. Let F and $F^+$ be respectively the weak and strong variants of one of the
logical fragments in Table 1.
Let L, L' be two languages recognized by a morphism $\alpha : A^+ \to S$ into a finite semigroup S.
Let $\mathbb{L}, \mathbb{L}' \subseteq \mathbb{A}_\alpha^+$ be the languages of well-formed words associated with L, L', respectively.
Then L is $F^+$-separable from L' iff $\mathbb{L}$ is F-separable from $\mathbb{L}'$.
Theorem 4 reduces $F^+$-separation to F-separation. The latter was already known to be
decidable for several weak variants in Table 1, namely for FO(=) [15], $\mathrm{FO}^2(<)$ [16], $\Sigma_1(<)$ [4],
$\mathcal{B}\Sigma_1(<)$ [4, 16] and $\Sigma_2(<)$ [17]. Hence, we get the following corollary.
Corollary 5. Let L, L' be regular languages. Then the following problems are decidable:
- whether L is FO(=, +1)-separable from L'.
- whether L is $\mathrm{FO}^2(<, +1)$-separable from L'.
- whether L is $\Sigma_1(<, +1, \min, \max)$-separable from L'.
- whether L is $\mathcal{B}\Sigma_1(<, +1, \min, \max)$-separable from L'.
- whether L is $\Sigma_2(<, +1)$-separable from L'.
Notice that since the membership problem reduces to the separation problem, this also
gives a new proof that all these fragments have a decidable membership problem. This
is of particular interest for $\mathrm{FO}^2(<, +1)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$ for which
the previous proofs, which can be found in, or derived from [22, 1, 14], [8], and [6, 13, 12]
respectively, are known to be quite involved. It turns out that for $\Sigma_2(<, +1)$, we can do even
better and entirely avoid separation. Indeed, when F is expressive enough, Theorem 4 can
be used to prove a similar theorem for the membership problem.
Theorem 6. Let F and $F^+$ be respectively the weak and strong variants of one of the
logical fragments in Table 1. Moreover, assume that for any alphabet of well-formed words,
the set of well-formed words over this alphabet is definable in F.
Let L be a language recognized by a morphism $\alpha : A^+ \to S$ into a finite semigroup S. Let
$\mathbb{L} \subseteq \mathbb{A}_\alpha^+$ be the language of well-formed words associated with L. Then L is definable in $F^+$
iff $\mathbb{L}$ is definable in F.
Proof. Set $K = A^+ \setminus L$ and let $\mathbb{K}$ be the associated language of well-formed words. Observe
that by definition, $\mathbb{K} \cup \mathbb{L}$ is the set of all well-formed words.
If $\mathbb{L}$ is definable in F, then $\mathbb{L}$ is F-separable from $\mathbb{K}$, hence by Theorem 4, L is
$F^+$-separable from K, and so L is definable in $F^+$. Conversely, if L is definable in $F^+$, then L is
$F^+$-separable from K and by Theorem 4, $\mathbb{L}$ is F-separable from $\mathbb{K}$. Since $\mathbb{K} \cup \mathbb{L}$ is the set of
all well-formed words, $\mathbb{L}$ is the intersection of the separator with the set of all well-formed
words, which by hypothesis is also definable in F. Therefore, $\mathbb{L}$ is definable in F. □
Observe that being well-formed can be expressed in $\Pi_2(<)$: essentially, a word is
well-formed if for all pairs of positions, either there is a third one in between, or the labels of the
two positions are "compatible". Hence, among the fragments of Table 1, Theorem 6 applies
to all fragments including and above $\Pi_2(<)$ in the quantifier alternation hierarchy. While
such a transfer result was previously known [22, 13], the presentation and the proof are new.
In particular, since membership is known to be decidable for $\Sigma_2(<)$ [12], $\mathcal{B}\Sigma_2(<)$ [17] and
$\Sigma_3(<)$ [17], we obtain new and simpler proofs of the following results.
Corollary 7. Given a regular language L, one can decide whether
- L is definable by a $\Sigma_2(<, +1)$ (resp. by a $\Pi_2(<, +1)$) formula.
- L is definable by a $\mathcal{B}\Sigma_2(<, +1)$ formula.
- L is definable by a $\Sigma_3(<, +1)$ (resp. by a $\Pi_3(<, +1)$) formula.
It remains to prove Theorem 4. We devote the rest of the section to this proof. An
important remark is that the proof of the right-to-left direction is constructive: we start with
an F formula that separates $\mathbb{L}$ from $\mathbb{L}'$ and use it to construct an $F^+$ formula that separates
L from L'. Note that the argument is generic for all fragments we consider.
On the other hand, the other direction, namely Proposition 9 below, requires a specific
argument tailored to each fragment, which is a straightforward but tedious
Ehrenfeucht-Fraïssé argument. Due to lack of space, we provide proofs of this proposition for each
fragment in the long version [19] of this paper.
4.1 From $F^+$-separation to F-separation
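We prove that if L is $F^+$-separable from L', then $\mathbb{L}$ is F-separable from $\mathbb{L}'$. We actually
prove the contrapositive: if $\mathbb{L}$ is not F-separable from $\mathbb{L}'$, then L is not $F^+$-separable from L'.
We rely on a construction which, to any well-formed word $\mathbf{u} \in \mathbb{A}_\alpha^+$ and any integer i > 0,
associates a canonical word $\lceil \mathbf{u} \rceil_i \in A^+$.
Canonical Word Associated to a Well-formed Word. To any $s \in S$, we associate an
arbitrarily chosen nonempty word $\lceil s \rceil \in A^+$ such that $\alpha(\lceil s \rceil) = s$ (which is possible since
$\alpha$ has been chosen surjective). Let i > 0. From a well-formed word $\mathbf{u} \in \mathbb{A}_\alpha^+$, we build a
word $\lceil \mathbf{u} \rceil_i \in A^+$ as follows. If $\mathbf{u} = s \in S$, then $\lceil \mathbf{u} \rceil_i = \lceil s \rceil$ for all i. Otherwise, we have by
definition
$\mathbf{u} = (s_0, e_1)(e_1, s_1, e_2) \cdots (e_{n-1}, s_{n-1}, e_n)(e_n, s_n)$.
For a natural number i > 0, we set
$\lceil \mathbf{u} \rceil_i = \lceil s_0 \rceil \lceil e_1 \rceil^i \lceil s_1 \rceil \lceil e_2 \rceil^i \cdots \lceil e_{n-1} \rceil^i \lceil s_{n-1} \rceil \lceil e_n \rceil^i \lceil s_n \rceil$.
Recall that $\beta$ is the morphism $\beta : \mathbb{A}_\alpha^+ \to S$ mapping $\mathbf{u}$ to $s_0 e_1 s_1 \cdots s_{n-1} e_n s_n$. Since $e_j \in E(S)$
for all j, it is immediate that $\alpha(\lceil \mathbf{u} \rceil_i) = \beta(\mathbf{u})$, hence we get the following fact:
Fact 8. For every i > 0 and every well-formed word $\mathbf{u} \in \mathbb{A}_\alpha^+$, we have $\mathbf{u} \in \mathbb{L}$ (resp. $\mathbf{u} \in \mathbb{L}'$)
if and only if $\lceil \mathbf{u} \rceil_i \in L$ (resp. $\lceil \mathbf{u} \rceil_i \in L'$).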
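The construction of the canonical word can be tried out on a toy instance. Everything concrete below is an assumption of this sketch: the semigroup {0, 1} under multiplication, the alphabet {a, b} with a mapped to 1 and b mapped to 0, the chosen representatives in `REP`, and the encoding of letters as tagged tuples.

```python
from functools import reduce

# Assumptions of this sketch: S = {0, 1} under multiplication, A = {a, b},
# alpha(a) = 1, alpha(b) = 0, and one chosen representative word per element.
mul = lambda x, y: x * y
ALPHA = {"a": 1, "b": 0}
REP = {1: "a", 0: "b"}


def alpha(w):
    """Image of a nonempty word of A+ in S."""
    return reduce(mul, (ALPHA[c] for c in w))


def beta(u):
    """Image of a word of tagged letters in S (product of all components)."""
    return reduce(mul, (reduce(mul, letter[1:]) for letter in u))


def canonical_word(u, i):
    """Build the canonical word of a well-formed word u for the exponent i > 0."""
    if len(u) == 1:
        return REP[u[0][1]]
    parts = [REP[u[0][1]]]            # representative of s_0
    for letter in u[1:]:              # letters ("mid", e, s, f) or ("last", e, s)
        e, s = letter[1], letter[2]
        parts.append(REP[e] * i)      # i copies of the representative of e
        parts.append(REP[s])
    return "".join(parts)


u = [("first", 1, 1), ("mid", 1, 0, 0), ("last", 0, 1)]
print(canonical_word(u, 2))                       # aaabbba
print(alpha(canonical_word(u, 2)) == beta(u))     # True
```

The final line checks the equality behind Fact 8 on this instance: repeating the representative of an idempotent does not change the image of the word.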
We now proceed with the proof. We use the classical preorders associated to fragments
of first-order logic. The (quantifier) rank of a first-order formula $\varphi$ is the largest number of
quantifiers along a branch in the parse tree of $\varphi$. Given $u, v \in A^+$, we write $u \preceq_k^{+1} v$ if any
$F^+$ formula of rank k that is satisfied by u is satisfied by v as well. Similarly, for $\mathbf{u}, \mathbf{v} \in \mathbb{A}_\alpha^+$,
we write $\mathbf{u} \preceq_k \mathbf{v}$ if any F formula of rank k that is satisfied by $\mathbf{u}$ is satisfied by $\mathbf{v}$ as well.
One can verify that $\preceq_k$ and $\preceq_k^{+1}$ are preorders, as well as the following standard fact:
$L \subseteq A^+$ is definable by an $F^+$ formula of rank k iff $L = \{u' \mid \exists u \in L \text{ s.t. } u \preceq_k^{+1} u'\}$, (1)
$\mathbb{L} \subseteq \mathbb{A}_\alpha^+$ is definable by an F formula of rank k iff $\mathbb{L} = \{\mathbf{u}' \mid \exists \mathbf{u} \in \mathbb{L} \text{ s.t. } \mathbf{u} \preceq_k \mathbf{u}'\}$.
Note that when F and $F^+$ are closed under complement, then $\preceq_k$ and $\preceq_k^{+1}$ are actually
equivalence relations. We can now state the main proposition of this direction.
Proposition 9. For any $k \in \mathbb{N}$, there exist $\ell \in \mathbb{N}$ and $i \in \mathbb{N}$ such that for any well-formed
words $\mathbf{u}, \mathbf{u}' \in \mathbb{A}_\alpha^+$ satisfying $\mathbf{u} \preceq_\ell \mathbf{u}'$, we have $\lceil \mathbf{u} \rceil_i \preceq_k^{+1} \lceil \mathbf{u}' \rceil_i$.
For all fragments of Table 1, Proposition 9 is proved using classical Ehrenfeucht-Fraïssé
arguments. While each proof is specific, the underlying ideas are similar. We present these
proofs in the long version of this paper [19]. We finish the subsection by explaining how
Proposition 9 can be used to complete the proof of the first direction of Theorem 4.
We argue by contrapositive: assume that $\mathbb{L}$ is not F-separable from $\mathbb{L}'$. By definition
this means that no language definable in F separates $\mathbb{L}$ from $\mathbb{L}'$. In particular, for any $\ell$, the
language
$\{\mathbf{u}' \mid \exists \mathbf{u} \in \mathbb{L} \text{ s.t. } \mathbf{u} \preceq_\ell \mathbf{u}'\}$,
which is definable in F by (1), cannot be a separator. Note that this language contains $\mathbb{L}$.
Hence, for all $\ell \in \mathbb{N}$, there exist $\mathbf{u} \in \mathbb{L}$ and $\mathbf{u}' \in \mathbb{L}'$ such that $\mathbf{u} \preceq_\ell \mathbf{u}'$. We deduce from
Proposition 9 and Fact 8 that for all $k \in \mathbb{N}$, there exist $u \in L$ and $u' \in L'$ such that $u \preceq_k^{+1} u'$.
It follows, again by (1), that L is not $F^+$-separable from L', which completes the proof.
4.2 From F-separation to $F^+$-separation
We now prove that if $\mathbb{L}$ is F-separable from $\mathbb{L}'$, then L is $F^+$-separable from L'. We do so
by building an $F^+$-definable separator. This time, the proof is entirely generic. We rely on a
construction that is dual to the one used previously: to any word $w \in A^+$, we associate a
canonical well-formed word $\lfloor w \rfloor \in \mathbb{A}_\alpha^+$.
Canonical Well-formed Word Associated to a Word. To any word w of $A^+$, we associate
a canonical well-formed word $\lfloor w \rfloor \in \mathbb{A}_\alpha^+$ such that $\alpha(w) = \beta(\lfloor w \rfloor)$. This construction is
adapted from [14] and is originally inspired by [22].
Fix an arbitrary order on the set E(S). For a position x of w, let $u_x \in A^+$ be the
infix of w obtained by keeping only positions $x - (|S| - 1)$ to x. If position $x - (|S| - 1)$
does not exist, $u_x$ is just the prefix of w ending at x. A position x is said to be distinguished if
there exists an idempotent $e \in E(S)$ such that $\alpha(u_x) \cdot e = \alpha(u_x)$. Additionally, we always
define the rightmost position as distinguished, even if it does not satisfy the property. Set
$x_1 < \cdots < x_{n+1}$ as the distinguished positions in w, so that $x_{n+1}$ is the rightmost position.
Let $e_1, \dots, e_n \in E(S)$ be such that for all $1 \le i \le n$, $e_i$ is the smallest idempotent such that $\alpha(u_{x_i}) \cdot e_i = \alpha(u_{x_i})$.
If n = 0, i.e., if the only distinguished position is the rightmost one, set $\lfloor w \rfloor = \alpha(w) \in \mathbb{A}_\alpha$.
Otherwise, we define $\lfloor w \rfloor \in \mathbb{A}_\alpha^+$ as the word:
$\lfloor w \rfloor = (\alpha(w_0), e_1) \cdot (e_1, \alpha(w_1), e_2) \cdots (e_{n-1}, \alpha(w_{n-1}), e_n) \cdot (e_n, \alpha(w_n))$ (2)
where $w_0$ is the prefix of w ending at position $x_1$, for all $1 \le i \le n - 1$, $w_i$ is the infix of w
obtained by keeping positions $x_i + 1$ to $x_{i+1}$, and $w_n$ is the suffix of w starting at position
$x_n + 1$. Note that by construction, $\lfloor w \rfloor$ is well-formed.
The next statement follows from the definition of $\beta$, and from the fact that by definition
of the words $w_i$ and of the chosen idempotents, we have $\alpha(w_0 \cdots w_i) e_{i+1} = \alpha(w_0 \cdots w_i)$.
Fact 10. For all $w \in A^+$, we have $\alpha(w) = \beta(\lfloor w \rfloor)$. Therefore, $w \in L$ iff $\lfloor w \rfloor \in \mathbb{L}$ and
$w \in L'$ iff $\lfloor w \rfloor \in \mathbb{L}'$.
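The construction of the canonical well-formed word can be followed step by step in the sketch below. It is an illustration under several assumptions that are not part of the paper: positions are numbered from 0, letters of the new alphabet are encoded as tagged tuples, the order on E(S) is the natural order on the Python values, and the toy semigroup is {1, 2} with the product min(x + y, 2).

```python
from functools import reduce


def floor_word(w, letter_image, elements, mul):
    """Build the canonical well-formed word of w as a list of tagged letters."""
    def alpha(v):
        return reduce(mul, (letter_image[c] for c in v))

    idem = sorted(e for e in elements if mul(e, e) == e)   # fixed order on E(S)
    size = len(elements)
    marks = []   # pairs (x_i, e_i) for distinguished positions before the last one
    for x in range(len(w) - 1):
        u_x = w[max(0, x - (size - 1)): x + 1]             # infix of length <= |S|
        value = alpha(u_x)
        stable = [e for e in idem if mul(value, e) == value]
        if stable:
            marks.append((x, stable[0]))                   # smallest such idempotent
    if not marks:
        return [("single", alpha(w))]
    cuts = [x for x, _ in marks]
    es = [e for _, e in marks]
    pieces = [w[:cuts[0] + 1]]
    pieces += [w[cuts[i] + 1: cuts[i + 1] + 1] for i in range(len(cuts) - 1)]
    pieces.append(w[cuts[-1] + 1:])
    letters = [("first", alpha(pieces[0]), es[0])]
    for i in range(1, len(cuts)):
        letters.append(("mid", es[i - 1], alpha(pieces[i]), es[i]))
    letters.append(("last", es[-1], alpha(pieces[-1])))
    return letters


def beta(u, mul):
    """Image of a word of tagged letters in S (product of all components)."""
    return reduce(mul, (reduce(mul, letter[1:]) for letter in u))


# Toy semigroup {1, 2} with product min(x + y, 2); its only idempotent is 2.
trunc = lambda x, y: min(x + y, 2)
print(floor_word("aaaa", {"a": 1}, [1, 2], trunc))
# [('first', 2, 2), ('mid', 2, 1, 2), ('last', 2, 1)]
print(beta(floor_word("aaaa", {"a": 1}, [1, 2], trunc), trunc))   # 2
```

On this instance the first position is not distinguished, since the image of a single letter is not stabilized by any idempotent, and the value printed last agrees with the image of the input word, as stated in Fact 10.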
To any distinguished position $x_i$ in w, we now associate the position $\lfloor x_i \rfloor = i$ in $\lfloor w \rfloor$. Our
main motivation for using this construction is its local canonicity, which is stated in the
following lemma.
Lemma 11. Let $w \in A^+$. Then we have the following properties:
(a) whether a position x is distinguished in w, and if so the label of position $\lfloor x \rfloor$ in $\lfloor w \rfloor$, only
depends on the infix of w of length $2|S|$ ending at position x. That is, if the infixes of
length $2|S|$ ending at x and y are equal, then x is distinguished iff so is y, and in that
case, the labels of $\lfloor x \rfloor$ and $\lfloor y \rfloor$ in $\lfloor w \rfloor$ are equal.
(b) the label of the last position of $\lfloor w \rfloor$ only depends on the suffix of length $2|S|$ of w.
Proof. It is immediate that whether x is distinguished and if so the associated idempotent
only depends on the infix $u_x$ of length at most $|S|$ ending at x. Therefore, to prove (a), it
suffices to show that all infixes $w_i$ used in (2) are of size at most $|S|$, or in other words, that
among $|S| + 1$ consecutive positions, at least one is distinguished. So let us consider an infix
$a_1 \cdots a_{|S|+1}$ of w of length $|S| + 1$. It is immediate from the pigeonhole principle that there
exist i < j such that $\alpha(a_1 \cdots a_i) = \alpha(a_1 \cdots a_j) = \alpha(a_1 \cdots a_i) \cdot (\alpha(a_{i+1} \cdots a_j))^\omega$. Hence, the
position corresponding to $a_i$ is distinguished. The proof of the second assertion is similar. □
L is $F^+$-separable from L'. We can now construct our separator. The construction
follows from the next proposition.
Proposition 12. Let $\mathbb{K} \subseteq \mathbb{A}_\alpha^+$ be a language that can be defined using an F formula $\varphi$. Then there exists
an $F^+$ formula $\psi$ over alphabet A such that for every word $w \in A^+$:
$w \models \psi$ if and only if $\lfloor w \rfloor \models \varphi$.
Proof. Proposition 12 follows from the following simple consequence of Lemma 11.
I Claim 13. For any a ? A? there exists a formula ?a (x) of F + with a free variable x, such
that for any w ? A+ and any position x of w, we have w = ?a (x) iff x is distinguished and
bxc has label a in bwc.
This claim holds since by Lemma 11, formula ?a (x) only needs to explore the neighborhood
of size 2S of x, which is trivially possible for all fragments F + we consider. To conclude
the proof of Proposition 12, it suffices to define ? as the formula constructed from ? by
restricting all quantifiers to positions that are distinguished and to replace all tests Pa (x)
by ?a (x). J
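The syntactic transformation that concludes this proof (restrict every quantifier to distinguished positions, replace every label test) can be sketched as a recursive rewrite. The Python sketch below assumes a hypothetical encoding of formulas as nested tuples; the function `relativize` and the placeholder atoms `("dist", x)` and `("gamma", a, x)` are illustrative names, and the sketch does not address how these placeholders are expanded into formulas of a given fragment.

```python
def relativize(formula, dist, label_test):
    """Rewrite a formula over the distinguished positions into one over w.

    Formulas are nested tuples (an assumed, hypothetical encoding):
      ("label", a, x), ("less", x, y), ("not", f), ("and", f, g),
      ("or", f, g), ("exists", x, f), ("forall", x, f).
    `dist(x)` builds the formula "x is distinguished" and `label_test(a, x)`
    builds the formula that replaces the label test P_a(x).
    """
    op = formula[0]
    if op == "label":
        _, a, x = formula
        return label_test(a, x)
    if op == "less":
        return formula  # the order on positions is left unchanged
    if op == "not":
        return ("not", relativize(formula[1], dist, label_test))
    if op in ("and", "or"):
        return (op,
                relativize(formula[1], dist, label_test),
                relativize(formula[2], dist, label_test))
    if op == "exists":
        _, x, body = formula
        # exists x. f  becomes  exists x. (dist(x) and f')
        return ("exists", x, ("and", dist(x), relativize(body, dist, label_test)))
    if op == "forall":
        _, x, body = formula
        # forall x. f  becomes  forall x. (not dist(x) or f')
        return ("forall", x, ("or", ("not", dist(x)), relativize(body, dist, label_test)))
    raise ValueError(f"unknown connective: {op!r}")


# Placeholders standing in for the local formulas, which are not spelled out here.
dist = lambda x: ("dist", x)
label_test = lambda a, x: ("gamma", a, x)

phi = ("exists", "x", ("forall", "y", ("or", ("less", "y", "x"), ("label", "a", "y"))))
psi = relativize(phi, dist, label_test)
```

Here `psi` quantifies over all positions of the original word but only constrains the distinguished ones, which is the standard relativization pattern.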
5 Algebraic Approach
We now present an algebraic version of Theorem 4: the operator $V \mapsto V * D$ preserves
decidability of separation.
We would like to emphasize again that the ideas behind this theorem are essentially the
same as for Theorem 4. In particular, proofs presented in the long version of this paper [19]
only rely on elementary notions, thus bypassing complex constructions usually used to prove
this kind of result, even if the statement itself requires some additional algebraic vocabulary.
The section is organized in three parts.
We first briefly recall how classes of languages corresponding to our logical fragments are
given an algebraic definition: for each fragment, an associated class of finite semigroups
(or monoids) V, a variety, has already been characterized, such that the class of languages
definable in the fragment is exactly the class of languages that are recognized by a
semigroup (or monoid) of V.
In the second part, we define what "adding the successor relation" means in this context.
Given a variety V, this generally corresponds to considering a new variety built on top
of V via an operation called the semidirect product. This new variety is denoted $V * D$.
Finally, in the last part, we state our main theorem: for any variety V, separability for
the variety $V * D$ reduces to separability for the variety V.
5.1 Varieties
A variety of semigroups (resp. monoids) is a class of finite semigroups (resp. monoids) closed
under three natural operations: finite direct product, subsemigroup (or submonoid), and
homomorphic image. A variety V defines a class of languages, also denoted V, namely the class
of all languages recognized by semigroups (resp. monoids) in V. There is an issue however:
all classes of languages defined in this way have to be closed under complement, since the
set of languages recognized by any semigroup is closed under complement. This prevents
us from capturing logical fragments that are not closed under complement, such as $\Sigma_2(<)$.
This problem has been solved in [11] with the notions of ordered semigroups and monoids.
Intuitively, such a semigroup is parametrized by a partial order and the set of languages it
recognizes is then restricted with respect to this partial order. These classical constructions
will be recalled in the long version of this paper [19], as well as varieties corresponding to all
fragments we deal with.
All logical fragments presented in Section 2 correspond to varieties that have been
fully identified. For each fragment, its non-enriched variant corresponds to a variety V of
(ordered) monoids and its enriched version to the variety of (ordered) semigroups $V * D$ built
from V. For example, the fragment $\mathrm{FO}^2(<)$ corresponds to the variety of monoids DA and
the fragment $\mathrm{FO}^2(<, +1)$ to the variety of semigroups $DA * D$ [23] (see the long version [19]
for a bibliography with all correspondences).
5.2 Semidirect Product
The Variety D. The variety D consists of all finite ordered semigroups $S$ such that for
all $s \in S$ and all $e \in E(S)$, we have $se = e$. From a language perspective, a language $L$ is
recognized by a semigroup in D iff there exists $k \in \mathbb{N}$ such that membership of a word $w$
in $L$ only depends on the suffix of length $k$ of $w$.
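The link between the equation $se = e$ and suffixes can be illustrated on a small example. The Python sketch below is an assumption-laden illustration, not a construction from the paper: it ignores the order component, and the names `idempotents`, `satisfies_se_equals_e` and `suffix_semigroup` are hypothetical. It builds the semigroup of nonempty words of length at most $k$, where the product keeps only the last $k$ letters, and checks the defining equation by brute force.

```python
from itertools import product


def idempotents(elements, mul):
    """Return the elements e with e * e = e."""
    return [e for e in elements if mul(e, e) == e]


def satisfies_se_equals_e(elements, mul):
    """Check se = e for every element s and every idempotent e."""
    return all(mul(s, e) == e
               for s in elements
               for e in idempotents(elements, mul))


def suffix_semigroup(alphabet, k):
    """Nonempty words of length at most k; the product keeps the last k letters."""
    elements = ["".join(w)
                for n in range(1, k + 1)
                for w in product(alphabet, repeat=n)]
    mul = lambda u, v: (u + v)[-k:]
    return elements, mul


words, suffix_mul = suffix_semigroup("ab", 2)

# Contrast: addition modulo 2 has the idempotent 0, but 1 + 0 = 1 differs from 0.
Z2 = [0, 1]
add_mod2 = lambda x, y: (x + y) % 2
```

In the suffix semigroup the idempotents are exactly the words of length $k$, and multiplying by one of them on the right erases everything before it, so `satisfies_se_equals_e(words, suffix_mul)` holds while the same check fails for `Z2`.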
Semidirect Product. Let $M$ be an ordered monoid and let $T$ be an ordered semigroup. A
semidirect product of $M$ and $T$ is an operation that is parametrized by an action of $T$ on $M$
and outputs a new ordered semigroup, whose base set is $M \times T$. Therefore, one can obtain
different semidirect products out of the same $M$ and $T$, depending on the chosen action (we
recall the construction in the long version [19]). One can next lift this product to the level of
varieties.
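As a rough illustration of how the chosen action changes the outcome, the following Python sketch uses the standard textbook product $(m, t)(m', t') = (m \cdot (t \cdot m'),\, t t')$ for a left action of $T$ on $M$. This is an assumption: the exact convention used by the authors is the one recalled in the long version [19], the order component is ignored here, and the toy data and function names are hypothetical.

```python
from itertools import product


def semidirect_product(M, m_op, T, t_op, act):
    """Build a semidirect product on the base set M x T.

    `act(t, m)` is a left action of T on M; the product used here is
    (m, t)(m', t') = (m . act(t, m'), t t').
    """
    elements = list(product(M, T))

    def mul(p, q):
        (m1, t1), (m2, t2) = p, q
        return (m_op(m1, act(t1, m2)), t_op(t1, t2))

    return elements, mul


def is_associative(elements, mul):
    """Brute-force associativity check."""
    return all(mul(mul(p, q), r) == mul(p, mul(q, r))
               for p in elements for q in elements for r in elements)


# Assumed toy data: M = (Z/3, +), T = ({1, 2}, multiplication mod 3),
# with T acting on M by multiplication mod 3.
M, T = [0, 1, 2], [1, 2]
add3 = lambda x, y: (x + y) % 3
mul3 = lambda x, y: (x * y) % 3
elements, sd_mul = semidirect_product(M, add3, T, mul3, mul3)

# The trivial action on the same M and T yields the direct product instead.
_, direct_mul = semidirect_product(M, add3, T, mul3, lambda t, m: m)
```

Both products live on the same six pairs, yet `sd_mul` is non-commutative while `direct_mul` is commutative, which shows that the same $M$ and $T$ can give different semidirect products.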
We are interested in the semidirect products of the form $V * D$, the variety of ordered
semigroups generated by all semidirect products of an ordered monoid of V by an ordered
semigroup of D. The reason why we introduce such semidirect products is the following
theorem, which gathers several nontrivial results from the literature. The reader is referred
to the long version of this paper [19] for details.
Theorem 14. Let V be a variety corresponding to a fragment F from the ones presented
in Table 1. Then, the variety corresponding to the fragment $F^+$ is $V * D$.
5.3 Main Theorem
We now have the machinery needed to state our main theorem. For any variety of ordered
monoids V, we reduce $(V * D)$-separability to V-separability.
Theorem 15. Let V be a nontrivial variety of ordered monoids. Let $L$ and $L'$ be two
languages both recognized by the same morphism $\alpha : A^+ \to S$ into a finite semigroup $S$. Set
$\mathbb{L}, \mathbb{L}' \subseteq \mathbb{A}_\alpha^+$ as the languages of well-formed words associated to $L$, $L'$, respectively. Then, $L$
is $(V * D)$-separable from $L'$ if and only if $\mathbb{L}$ is V-separable from $\mathbb{L}'$.
The proof of Theorem 15 is presented in the full version of this paper [19]. As was the
case for Theorem 4, the proof is both elementary and constructive: if there exists a separator
for $\mathbb{L}$ and $\mathbb{L}'$ in V, we use it to construct a separator for $L$ and $L'$ in $V * D$.
In view of Theorem 14, Theorem 15 applies to all fragments we introduced. This means
that Theorem 4 can be given an alternate indirect proof within this algebraic framework by
combining Theorem 15 and Theorem 14. Hence, this also yields another proof of Corollary 5.
6 Conclusion
We proved that separation is decidable over finite words for the following logical fragments:
$\mathrm{FO}(=, +1)$, $\Sigma_1(<, +1, min, max)$, $\mathcal{B}\Sigma_1(<, +1, min, max)$, $\Sigma_2(<, +1)$ and $\mathrm{FO}^2(<, +1)$. To
achieve this, we presented a simple reduction to the same problem for the weaker fragments
$\mathrm{FO}(=)$, $\Sigma_1(<)$, $\mathcal{B}\Sigma_1(<)$, $\Sigma_2(<)$ and $\mathrm{FO}^2(<)$.
The reduction itself is entirely generic to all fragments and its proof is elementary, and
also mostly generic. In particular, the technique can be used to prove that the reduction
works for other natural fragments of first-order logic. An interesting example to which
these results apply is the quantifier alternation hierarchy within $\mathrm{FO}^2(<)$ (known as the
Trotter-Weil hierarchy, and whose membership problem is decidable [25]). However, the separation problem for
classes in this hierarchy has yet to be investigated. We also obtained direct proofs that
membership is decidable for $\mathcal{B}\Sigma_2(<, +1)$ and $\Sigma_3(<, +1)$.
Finally, we presented an algebraic formulation of this reduction, which recovers a previously
known result by Steinberg [21], while having a much simpler proof. One can expect these
results to extend to other fragments, such as enrichment with modulo predicates. Another
advantage of this technique is that it can be extended in a straightforward way to the same
logical fragments over words of infinite length. This yields identical transfer results. We
leave the presentation of these results for further work.
References
[1] Jorge Almeida. A syntactical proof of locality of DA. International Journal on Algebra and Computation, 6:165-177, 1996.
[2] Jorge Almeida. Some algorithmic problems for pseudovarieties. Publicationes Mathematicae Debrecen, 54:531-552, 1999. Proc. of Automata and Formal Languages, VIII.
[3] International Journal on Algebra and Computation, 20(2):181-188, 2010.
[4] Wojciech Czerwiński, Wim Martens, and Tomáš Masopust. Efficient separability of regular languages by subsequences and suffixes. In Proceedings of the 40th International Colloquium on Automata, Languages, and Programming, ICALP'13, volume 7966 of Lecture Notes in Computer Science, pages 150-161. Springer, 2013.
[5] Volker Diekert and Paul Gastin. First-order definable languages. In Logic and Automata: History and Perspectives, volume 2, pages 261-306. Amsterdam University Press, 2008.
[6] Christian Glaßer and Heinz Schmitz. Languages of dot-depth 3/2. Theory of Computing Systems, 42(2):256-286, 2008.
[7] Karsten Henckell. Pointlike sets: the finest aperiodic cover of a finite semigroup. Journal of Pure and Applied Algebra, 55(1-2):85-126, 1988.
[8] Robert Knast. A semigroup characterization of dot-depth one languages. RAIRO Informatique Théorique et Applications, 17(4):321-330, 1983.
[9] Manfred Kufleitner and Alexander Lauser. Around dot-depth 1. International Journal of Foundations of Computer Science, 23(6):1323-1340, 2012.
[10] Robert McNaughton and Seymour Papert. Counter-Free Automata. MIT Press, 1971.
[11] Jean-Éric Pin. A variety theorem without complementation. Russian Mathematics (Izvestija vuzov. Matematika), 39:80-90, 1995.
[12] Jean-Éric Pin and Pascal Weil. Polynomial closure and unambiguous product. Theory of Computing Systems, 30(4):383-422, 1997.
[13] Communications in Algebra, 30:5677-5713, 2002.
[14] Thomas Place and Luc Segoufin. Decidable characterization of FO2(<, +1) and locality of DA. Unpublished, to appear, 2014.
[15] Thomas Place, Lorijn van Rooijen, and Marc Zeitoun. Separating regular languages by locally testable and locally threshold testable languages. In Proceedings of the 34th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS'13, volume 24 of LIPIcs, pages 363-375. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2013.
[16] Thomas Place, Lorijn van Rooijen, and Marc Zeitoun. Separating regular languages by piecewise testable and unambiguous languages. In Proceedings of the 28th MFCS'13, volume 8087 of Lecture Notes in Computer Science, pages 729-740. Springer, 2013.
[17] Thomas Place and Marc Zeitoun. Going higher in the first-order quantifier alternation hierarchy on words. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming, ICALP'14, volume 8573 of Lecture Notes in Computer Science, pages 342-353, 2014. http://arxiv.org/pdf/1404.6832v1.
[18] Thomas Place and Marc Zeitoun. Separating regular languages with first-order logic. In Proceedings of the Joint Meeting of the 23rd EACSL Annual Conference on Computer Science Logic (CSL'14) and the 29th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'14), 2014.
[19] Thomas Place and Marc Zeitoun. A transfer theorem for the separation problem. CoRR, abs/1501.00569, 2015. http://arxiv.org/abs/1501.00569.
[20] Marcel-Paul Schützenberger. On finite monoids having only trivial subgroups. Information and Control, 8:190-194, 1965.
[21] Benjamin Steinberg. A delay theorem for pointlikes. Semigroup Forum, 63(3):281-304, 2001.
[22] Howard Straubing. Finite semigroup varieties of the form V * D. Journal of Pure and Applied Algebra, 36:53-94, 1985.
[23] Denis Thérien and Thomas Wilke. Over words, two variables are as powerful as one quantifier alternation. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing, STOC'98, pages 234-240. ACM, 1998.
[24] Wolfgang Thomas. Classifying regular events in symbolic logic. Journal of Computer and System Sciences, 25(3):360-376, 1982.
[25] Manfred Kufleitner and Pascal Weil. On logical hierarchies within FO2-definable languages. Logical Methods in Computer Science, 8(3), 2012.