Separation and the Successor Relation

LIPICS - Leibniz International Proceedings in Informatics, Feb 2015

We investigate two problems for a class C of regular word languages. The C-membership problem asks for an algorithm to decide whether an input language belongs to C. The C-separation problem asks for an algorithm that, given as input two regular languages, decides whether there exists a third language in C containing the first language, while being disjoint from the second. These problems are considered as means to obtain a deep understanding of the class C. It is usual for such classes to be defined by logical formalisms. Logics are often built on top of each other, by adding new predicates. A natural construction is to enrich a logic with the successor relation. In this paper, we obtain new and simple proofs of two transfer results: we show that for suitable logically defined classes, the membership, resp. the separation problem for a class enriched with the successor relation reduces to the same problem for the original class. Our reductions work both for languages of finite words and infinite words. The proofs are mostly self-contained, and only require a basic background on regular languages. This paper therefore gives simple proofs of results that were considered as difficult, such as the decidability of the membership problem for the levels 1, 3/2, 2 and 5/2 of the dot-depth hierarchy.



Thomas Place and Marc Zeitoun
LaBRI, Bordeaux University, France

Keywords and phrases: separation problem; regular word languages; logics; decidable characterizations; semidirect product

1998 ACM Subject Classification: F.4.3 Formal Languages

1 Introduction

A central problem in the theory of formal languages is to characterize and understand the expressive power of high-level specification formalisms. Monadic second-order logic (MSO) is such a formalism, which is both expressive and robust. For several classes of structures, such as words or trees, it has the same expressive power as finite automata and defines the class of regular languages. In this paper, we investigate fragments of MSO over words.

In this context, understanding the expressive power of a fragment is associated with two decision problems: the membership problem and the separation problem. For a fixed logical fragment $\mathcal{F}$, the $\mathcal{F}$-membership problem asks for a decision procedure that tests whether some input regular language can be expressed by a formula from $\mathcal{F}$. To obtain such an algorithm, one has to consider and understand all properties that can be expressed within $\mathcal{F}$, which requires a deep understanding of the fragment $\mathcal{F}$. On the other hand, the $\mathcal{F}$-separation problem is more general. It asks for a decision procedure that tests whether, given two input regular languages, there exists a third one in $\mathcal{F}$ containing the first language while being disjoint from the second one. Since regular languages are closed under complement, membership reduces to separation: a language is in $\mathcal{F}$ if and only if it can be separated from its complement. Usually, the separation problem is more difficult than the membership problem but also more rewarding with respect to the knowledge gained on the investigated fragment $\mathcal{F}$.

These two problems have been considered and solved for many natural fragments of monadic second-order logic. Among these, the most prominent one is first-order logic, FO(<), equipped with a predicate < for the linear ordering. The solution to the membership problem, known as the McNaughton-Papert-Schützenberger Theorem [20, 10], has been revisited until recently [5]. The theorem states that a regular language is definable in FO(<) if and only if its syntactic semigroup is aperiodic. The syntactic semigroup is a finite algebraic object that can be computed from any regular language.
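To make the aperiodicity criterion concrete, here is a minimal Python sketch. It assumes the finite semigroup is given as a multiplication table `mul` over elements numbered `0..n-1` (an encoding chosen here for illustration, not taken from the paper), and it uses the standard formulation that a finite semigroup is aperiodic when every element $s$ satisfies $s^n = s^{n+1}$ for some $n \geq 1$. The function name `is_aperiodic` is hypothetical.

```python
def is_aperiodic(mul):
    """Return True if every element s satisfies s^n == s^(n+1) for some n >= 1.

    `mul` is a square table: mul[s][t] is the product of elements s and t,
    with elements numbered 0..len(mul)-1 (an illustrative encoding).
    """
    size = len(mul)
    for s in range(size):
        power = s                      # current power s^n, starting with n = 1
        for _ in range(size):          # powers of s start repeating within `size` steps
            nxt = mul[power][s]        # s^(n+1)
            if nxt == power:
                break                  # s^n == s^(n+1): s satisfies the equation
            power = nxt
        else:
            return False               # the powers of s cycle through a nontrivial group
    return True
```

For instance, the two-element monoid `[[0, 0], [0, 1]]` (a zero and an identity) passes the check, whereas the cyclic group of order two, `[[0, 1], [1, 0]]`, does not.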
Since aperiodicity can be defined as an equation that must be satisfied by all elements of the syntactic semigroup, this yields decidability of FO(<)-definability. This result now serves as a template, which is commonly followed in this line of research.

The separation problem has also been successfully solved for first-order logic [7]. Actually, the problem was first addressed in a purely algebraic framework, and was later identified as equivalent to our separation problem [2]. As for membership, this problem is still revisited today, and a new self-contained and combinatorial proof was obtained in [18].

Motivation. We are interested in natural fragments of FO(<) obtained by restricting either the number of variables or the number of quantifier alternations allowed in formulas. Such restrictions in general give rise to several variants of the same fragment. Indeed, in most cases, the drop in expressive power forbids the use of natural relations that could be defined from the linear order in FO(<). The main example considered in this paper is $+1$, the successor relation, together with predicates $\min$ and $\max$ for the first and last positions in a word. This means that one can define two distinct variants of the same fragment depending on whether we decide to explicitly add these predicates to the signature or not. An example is the fragment $\Sigma_n$, which consists of first-order formulas whose prenex normal form has at most $(n - 1)$ quantifier alternations and starts with an existential block. Since defining $+1$ requires an additional quantifier alternation, $\Sigma_n(<, +1, \min, \max)$ is indeed more expressive than $\Sigma_n(<)$. The motivation of this paper is to obtain decidability results for such enriched fragments.

State of the Art. Even when the weak fragment is known to have decidable membership, proving that the enriched one has the same property can be nontrivial.
Examples include the membership proofs of $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ (Boolean combinations of $\Sigma_1(<, +1, \min, \max)$ formulas) and $\Sigma_2(<, +1)$, which require difficult and intricate combinatorial arguments [8, 6, 9] or a wealth of algebraic machinery [12, 13]. Another issue is that most proofs directly deal with the enriched fragment. Given the jungle of such logical fragments, it is desirable to avoid such an approach, treating each variant of the same fragment independently. Instead, a satisfying approach is to first obtain a solution of the membership and separation problems for the less expressive variant and then to lift it to other variants via a generic transfer result.

This approach was first investigated by Straubing for the membership problem [22] in an algebraic framework, and later adapted to treat classes not closed under complement [13]. Transferring the logical problem to this algebraic framework requires preliminary steps, still specific to the investigated class, to prove that:
1. A language is definable in the fragment if and only if its syntactic semigroup belongs to a specific algebraic variety $\mathbf{V}$ (e.g., the variety of aperiodic monoids for FO(<)), and
2. Membership in $\mathbf{V}$ is decidable.
Next, though this is not immediate, for most fragments of FO(<), it has been proved that
3. When the weaker variant corresponds to a variety $\mathbf{V}$, the variant with successor corresponds to the variety $\mathbf{V} * \mathbf{D}$, built generically from $\mathbf{V}$.
Hence, Straubing's approach was to prove that
4. The operator $\mathbf{V} \mapsto \mathbf{V} * \mathbf{D}$ preserves decidability.

Unfortunately, this is not true in general [3]. Actually, while decidability is preserved for all known logical fragments, there is no generic result that captures them all. In particular, for the less expressive fragments, one has to use completely ad hoc proofs. In the separation setting, things behave well: it has been shown that decidability of separation is preserved by the operation $\mathbf{V} \mapsto \mathbf{V} * \mathbf{D}$ [21].
While interesting when already starting from algebra, this approach has several downsides:
- Dealing with algebra hides the logical intuitions, while our primary goal is to understand the expressiveness of logics.
- Going from logic to algebra requires becoming acquainted with new notions and vocabulary, as well as involved theoretical tools. Proofs are also often nontrivial and require a deep understanding of complex objects, which may be scattered in the bibliography.
- Despite step 4, which is generic to some extent, arguments specific to the investigated class are pushed to steps 1-3, and they are often nontrivial.

Contributions. We give a new proof that decidability of separation can be transferred from a weak to an enriched fragment. We present the result in two different forms. The first one is non-algebraic: we work directly with the logical fragments, without using varieties. The transfer result is generic and its proof mostly is: the only specific argument is an Ehrenfeucht-Fraïssé game that can be adapted to all natural fragments with minimal difficulty (we prove it in the long version of this paper for all considered fragments, see [19]). The benefits of this new proof are that:
1. It is self-contained and much simpler than previous ones. It only relies on two basic well-known notions: recognizability by semigroups and Ehrenfeucht-Fraïssé games.
2. It works with classes that are not closed under complement, contrary to [21]. This allows us to capture the $\Sigma$ and $\Pi$ levels in the quantifier alternation hierarchy of first-order logic.
3. Under an additional hypothesis on the logical fragment, which is met for most fragments we investigate and easy to check, the decidability result of the separation problem also extends to the membership problem.
4. The proof adapts smoothly to infinite words using the notion of $\omega$-semigroups, as shown in the long version of this paper [19].

The second form of our result is algebraic and generic. We prove that $\mathbf{V} \mapsto \mathbf{V} * \mathbf{D}$ preserves the decidability of separation for varieties, hence giving an elementary proof of a result of [21]. Even in this algebraic form, we completely bypass involved constructions or notions, such as pointlike sets for categories developed in [21], thus making the proof accessible.

As corollaries, since $\mathcal{B}\Sigma_1(<)$ and $\Sigma_2(<)$ both enjoy decidable separation [4, 16, 17], we obtain that this is also the case for the fragments $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$, known as levels 1 and 3/2 of the dot-depth hierarchy. These new results strengthen the previous ones [8, 6] that showed decidability of membership and were considered difficult. We actually obtain that separation for $\Sigma_n(<, +1, \min, \max)$ reduces to separation for $\Sigma_n(<)$. Since we also transfer decidability of the membership problem, and since the fragments $\mathcal{B}\Sigma_2(<)$ of Boolean combinations of $\Sigma_2(<)$ formulas and $\Sigma_3(<)$ have decidable membership [17], we deduce that the same holds for $\mathcal{B}\Sigma_2(<, +1)$ and $\Sigma_3(<, +1)$, known as levels 2 and 5/2 of the dot-depth hierarchy.

Organization of the Paper. In Section 2, we set up the notation and we present the separation problem and the logics we deal with. Section 3 is devoted to our main tool: languages of well-formed words. In Section 4, we use it to prove our transfer result for all fragments from the logical perspective, and in Section 5, we show that decidability of the separation problem for the variety $\mathbf{V}$ entails the same for $\mathbf{V} * \mathbf{D}$.

2 Preliminaries

In this section, we provide preliminary definitions on regular languages defined by logical fragments and on separation. We also present our main contribution.

Words, Languages. We fix a finite alphabet $A$. Let $A^+$ be the set of all nonempty finite words and let $A^*$ be the set of all finite words over $A$. If $u, v$ are words, we denote by $u \cdot v$ or by $uv$ the word obtained by concatenating $u$ and $v$. For convenience, we only consider, without loss of generality, languages that do not contain the empty word.
That is, a language is a subset of $A^+$. We work with regular languages, that is, languages definable by finite automata.

Separation. Given three languages $K, L, L'$, we say that $K$ separates $L$ from $L'$ if $L \subseteq K$ and $K \cap L' = \emptyset$. If $\mathcal{C}$ is a class of languages, we say that $L$ is $\mathcal{C}$-separable from $L'$ if there exists $K \in \mathcal{C}$ that separates $L$ from $L'$. Note that if $\mathcal{C}$ is closed under complement, $L$ is $\mathcal{C}$-separable from $L'$ if and only if $L'$ is $\mathcal{C}$-separable from $L$. However, this is not true for a class $\mathcal{C}$ not closed under complement, such as the classes $\Sigma_n(<)$ of the quantifier alternation hierarchy, which we shall consider.

Given a class $\mathcal{C}$, the $\mathcal{C}$-separation problem asks for an algorithm which, given as input two regular languages $L, L'$, decides whether $L$ is $\mathcal{C}$-separable from $L'$. The $\mathcal{C}$-membership problem, which asks whether an input regular language belongs to $\mathcal{C}$, reduces to the $\mathcal{C}$-separation problem, as a regular language belongs to $\mathcal{C}$ iff it is $\mathcal{C}$-separable from its complement.

Logics. We investigate several fragments of first-order logic on finite words. We view a finite word as a logical structure made of a sequence of positions labeled over $A$. We work with first-order logic FO(<) using a unary predicate $P_a$ for each $a \in A$, which selects positions labeled with an $a$, as well as binary predicates "=" for equality and "<" for the linear order. Such a formula defines the regular language of all words that satisfy it. We will freely use the name of a logical fragment of FO(<) to denote the class of languages definable in this fragment. Observe that FO(<) is powerful enough to express the following logical relations:
- First position, $\min(x)$: $\forall y\ \neg(y < x)$.
- Last position, $\max(x)$: $\forall y\ \neg(x < y)$.
- Successor, $y = x + 1$: $x < y \wedge \neg(\exists z\ x < z \wedge z < y)$.

However, for most fragments of FO(<) this is not the case. For example, in the two-variable restriction $\mathrm{FO}^2(<)$ of FO(<), it is not possible to express successor, as it requires quantifying over a third variable.
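The three definitions above can be checked by brute force on a word of length `n`, reading each quantifier as a loop over positions. The following Python sketch does this; the function names and the encoding of positions as integers `0..n-1` are assumptions made for illustration only.

```python
def is_min(n, x):
    """min(x): no position y lies strictly before x."""
    return not any(y < x for y in range(n))


def is_max(n, x):
    """max(x): no position y lies strictly after x."""
    return not any(x < y for y in range(n))


def is_succ(n, x, y):
    """y = x + 1: x < y and no position z lies strictly between x and y."""
    return x < y and not any(x < z and z < y for z in range(n))
```

Note that `is_succ` needs the extra loop variable `z` on top of `x` and `y`, which mirrors the remark that successor cannot be expressed with only two variables.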
For these fragments $\mathcal{F}$, adding the predicates $\min$, $\max$ and $+1$ yields a strictly more powerful logic $\mathcal{F}^+$. Our goal is to prove a transfer result for such fragments: given a fragment, if the separation problem is decidable for the weak variant $\mathcal{F}$, then it is decidable as well for the strong variant $\mathcal{F}^+$ obtained by enriching $\mathcal{F}$ with the above relations. The technique is generic, meaning that it is not bound to a particular logic. In particular, our transfer result applies to the following well-known logical fragments:
- FO(=), the restriction of FO(<) in which the linear order cannot be used, and only equality between two positions can be tested. The enriched fragment FO(=, +1) ($\min$ and $\max$ can be eliminated from the formulas) defines locally threshold testable languages [24].
- All levels in the quantifier alternation hierarchy of first-order logic. A first-order formula is $\Sigma_n(<)$ (resp. $\Pi_n(<)$) if its prenex normal form contains at most $(n - 1)$ quantifier alternations and starts with an $\exists$ (resp. a $\forall$) quantifier block. Finally, a $\mathcal{B}\Sigma_n(<)$ formula is a Boolean combination of $\Sigma_n(<)$ and $\Pi_n(<)$ formulas. Since for all fragments above $\Sigma_2(<)$, a formula involving $\min$ and $\max$ can be expressed without these predicates in the same logic, we shall denote the enriched fragments by $\Sigma_1(<, +1, \min, \max)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$, and then by $\Sigma_2(<, +1)$, $\mathcal{B}\Sigma_2(<, +1)$, ...
- $\mathrm{FO}^2(<)$, the restriction of FO(<) using only two reusable variables. The corresponding enriched fragment is $\mathrm{FO}^2(<, +1)$, since $\min$ and $\max$ can again be eliminated from the formulas.

Theorem 1. Let $\mathcal{F}$ and $\mathcal{F}^+$ be respectively the weak and strong variants of one of the logical fragments in Table 1. Then $\mathcal{F}^+$-separability can be effectively reduced to $\mathcal{F}$-separability.

As explained in the introduction, we prove this theorem in two flavors: the first one, Theorem 4, is purely logical. It is self-contained and elementary, but is not entirely generic.
The other one, Theorem 15, is purely algebraic and generic: the transfer works from an algebraic class (for which only fairly general restrictions are assumed) to an enriched one. Yet, it relies on already established results to be instantiated on the fragments of Table 1.

All these logical fragments have a rich history and have been extensively studied in the literature. In particular, the separation problem is known to be decidable for the following fragments: FO(=), $\mathrm{FO}^2(<)$, $\Sigma_1(<)$, $\mathcal{B}\Sigma_1(<)$, $\Sigma_2(<)$ [4, 16, 17]. This means that, from our results, we obtain decidability of separation for FO(=, +1), $\mathrm{FO}^2(<, +1)$, $\Sigma_1(<, +1, \min, \max)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$. Note that for FO(=, +1), $\mathrm{FO}^2(<, +1)$ and $\mathcal{B}\Sigma_1(<, +1, \min, \max)$, the results could already be obtained as corollaries of algebraic theorems of Steinberg [21] and Almeida [2]. As explained in the introduction, an issue with this approach is that the proof of Steinberg's result relies on deep algebraic arguments and is not tailored to separation (the connection with separation is made by Almeida [2]). For $\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$, the result is new, as Steinberg's result does not apply to classes of languages that are not closed under complement.

3 Tools: Semigroups and Well-Formed Words

In this section, we define the main tools used in the paper. First, we recall the well-known semigroup-based definition of regular languages: a language is regular if and only if it can be recognized by a finite semigroup. Our second tool, well-formed words, is specific to our problem and plays a key role in our transfer result.

3.1 Semigroups and Monoids

We work with the algebraic representation of regular languages in terms of semigroups. A semigroup is a set $S$ equipped with an associative product, written $s \cdot t$ or $st$. A monoid is a semigroup $S$ having a neutral element $1_S$, i.e., such that $s \cdot 1_S = 1_S \cdot s = s$ for all $s \in S$. If $S$ is a semigroup, then $S^1$ denotes the monoid $S \cup \{1_S\}$ where $1_S \notin S$ is a new element, acting as neutral element. Note that we add such a new identity even if $S$ is already a monoid. An element $e \in S$ is idempotent if $e \cdot e = e$. We denote by $E(S)$ the set of idempotents of $S$. Given a finite semigroup $S$, it is folklore and easy to see that there is an integer $\omega(S)$ (denoted by $\omega$ when $S$ is understood) such that for all $s$ of $S$, $s^\omega$ is idempotent: $s^\omega = s^\omega s^\omega$.

Note that $A^+$ and $A^*$ equipped with concatenation are respectively a semigroup and a monoid, called the free semigroup over $A$ and the free monoid over $A$. Let $L \subseteq A^+$ be a language and $S$ be a semigroup (resp. a monoid). We say that $L$ is recognized by $S$ if there exist a morphism $\alpha : A^+ \to S$ (resp. $\alpha : A^* \to S$) and a set $F \subseteq S$ such that $L = \alpha^{-1}(F)$.

Semigroups and Separation. The separation problem takes as input two regular languages $L, L'$. It is convenient to work with a single object recognizing both of them, rather than having to deal with two. Let $S, S'$ be semigroups recognizing $L, L'$ together with the associated morphisms $\alpha, \alpha'$, respectively. Clearly, $L$ and $L'$ are both recognized by $S \times S'$ with the morphism $\alpha \times \alpha' : A^+ \to S \times S'$ mapping $w$ to $(\alpha(w), \alpha'(w))$. From now on, we work with such a single semigroup recognizing both languages. Replacing $S \times S'$ with its image under $\alpha \times \alpha'$, one can also assume that this morphism is surjective. To sum up, we assume from now on, w.l.o.g., that $L$ and $L'$ are recognized by a single surjective morphism.

3.2 Well-Formed Words

In this section, we define our main tool for this paper. Assume that $\mathcal{F}$ is the weak variant of one of the logical fragments of Table 1 and let $\mathcal{F}^+$ be the corresponding enriched variant. To any semigroup morphism $\alpha : A^+ \to S$ into a finite semigroup $S$, we associate a new alphabet $\mathbb{A}_\alpha$ called the alphabet of well-formed words. The main intuition behind this notion is that the $\mathcal{F}^+$-separation problem for any two regular languages recognized by $\alpha$ can be reduced to the $\mathcal{F}$-separation problem for two regular languages over $\mathbb{A}_\alpha$.
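The semigroup notions recalled in Section 3.1 (idempotents and the exponent $\omega(S)$) are easy to compute for a small finite semigroup. The Python sketch below assumes the semigroup is given as a multiplication table over elements `0..n-1`; this encoding and the function names are illustrative assumptions. The text only asks for some integer $\omega(S)$; the sketch returns the least suitable one.

```python
def idempotents(mul):
    """Return the list of elements e with e * e == e."""
    return [e for e in range(len(mul)) if mul[e][e] == e]


def power(mul, s, n):
    """Return s^n for n >= 1 by repeated multiplication."""
    result = s
    for _ in range(n - 1):
        result = mul[result][s]
    return result


def omega(mul):
    """Return the least n >= 1 such that s^n is idempotent for every s."""
    n = 1
    while True:
        powers = [power(mul, s, n) for s in range(len(mul))]
        if all(mul[p][p] == p for p in powers):
            return n
        n += 1
```

The loop in `omega` terminates because every element of a finite semigroup has an idempotent power, so a common exponent exists.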
The alphabet $\mathbb{A}_\alpha$, called the alphabet of well-formed words of $\alpha$, is defined from $\alpha : A^+ \to S$ by:

$$\mathbb{A}_\alpha = (E(S) \times S \times E(S)) \cup (S \times E(S)) \cup (E(S) \times S) \cup S.$$

We will not be interested in all words of $\mathbb{A}_\alpha^+$, but only in those that are well-formed. A word $\mathbf{w} \in \mathbb{A}_\alpha^+$ is said to be well-formed if one of the following two properties holds:
- $\mathbf{w}$ is a single letter $s \in S$,
- $\mathbf{w}$ has length $\geq 2$ and is of the form
$$(s_0, f_0) \cdot (e_1, s_1, f_1) \cdots (e_n, s_n, f_n) \cdot (e_{n+1}, s_{n+1}) \in (S \times E(S)) \cdot (E(S) \times S \times E(S))^* \cdot (E(S) \times S)$$
with $f_i = e_{i+1}$ for all $0 \leq i \leq n$.

Fact 2. The set of well-formed words of $\mathbb{A}_\alpha^+$ is a regular language.

We now define a morphism $\beta : \mathbb{A}_\alpha^+ \to S$ as follows. If $s \in S$, we set $\beta(s) = s$; if $(e, s) \in E(S) \times S$, we set $\beta((e, s)) = es$; if $(s, e) \in S \times E(S)$, we set $\beta((s, e)) = se$; and if $(e, s, f) \in E(S) \times S \times E(S)$, we set $\beta((e, s, f)) = esf$.

Associated Language of Well-formed Words. To any language $L \subseteq A^+$ that is recognized by $\alpha$, one associates a language of well-formed words $\mathbb{L} \subseteq \mathbb{A}_\alpha^+$:

$$\mathbb{L} = \{\mathbf{w} \in \mathbb{A}_\alpha^+ \mid \mathbf{w} \text{ is well-formed and } \beta(\mathbf{w}) \in \alpha(L)\}.$$

By definition, the language $\mathbb{L} \subseteq \mathbb{A}_\alpha^+$ is the intersection of the language of well-formed words with $\beta^{-1}(\alpha(L))$. Therefore, it is immediate by Fact 2 that it is regular. More precisely:

Fact 3. Let $L \subseteq A^+$ be recognized by $\alpha$. Then, the associated language of well-formed words $\mathbb{L} \subseteq \mathbb{A}_\alpha^+$ is a regular language that one can effectively compute from a recognizer of $L$.

4 Logical Approach

In this section, we prove Theorem 1 from a logical perspective. We begin by presenting our separation theorem, which will entail the membership theorem as a simple consequence.

Theorem 4. Let $\mathcal{F}$ and $\mathcal{F}^+$ be respectively the weak and strong variants of one of the logical fragments in Table 1. Let $L, L'$ be two languages recognized by a morphism $\alpha : A^+ \to S$ into a finite semigroup $S$. Let $\mathbb{L}, \mathbb{L}' \subseteq \mathbb{A}_\alpha^+$ be the languages of well-formed words associated with $L, L'$, respectively. Then $L$ is $\mathcal{F}^+$-separable from $L'$ iff $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{L}'$.

Theorem 4 reduces $\mathcal{F}^+$-separation to $\mathcal{F}$-separation.
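The definitions of Section 3.2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the semigroup is a multiplication table over elements `0..n-1`, and each letter carries a tag (`"ese"`, `"se"`, `"es"` or `"s"`) so that pairs from $S \times E(S)$ and $E(S) \times S$ stay distinguishable. The tags and function names are hypothetical and do not come from the paper.

```python
def well_formed_alphabet(mul):
    """Build the alphabet of well-formed words as tagged tuples.

    Tags (an illustrative encoding): "ese" = (e, s, f), "se" = (s, e),
    "es" = (e, s), "s" = a single element s.
    """
    S = range(len(mul))
    E = [e for e in S if mul[e][e] == e]
    letters = [("ese", e, s, f) for e in E for s in S for f in E]
    letters += [("se", s, e) for s in S for e in E]
    letters += [("es", e, s) for e in E for s in S]
    letters += [("s", s) for s in S]
    return letters


def is_well_formed(word):
    """Check the two cases of the definition on a list of tagged letters."""
    if len(word) == 1:
        return word[0][0] == "s"
    if len(word) < 2 or word[0][0] != "se" or word[-1][0] != "es":
        return False
    if any(letter[0] != "ese" for letter in word[1:-1]):
        return False
    # The right idempotent of each letter must equal the left idempotent of the next.
    return all(left[-1] == right[1] for left, right in zip(word, word[1:]))


def beta(mul, word):
    """Evaluate beta: multiply, in order, all components of all letters."""
    elems = [x for letter in word for x in letter[1:]]
    result = elems[0]
    for x in elems[1:]:
        result = mul[result][x]
    return result
```

Membership of a well-formed word in the associated language then amounts to testing whether `beta(mul, word)` lies in the image of $L$ under the recognizing morphism.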
Separation for $\mathcal{F}$ was already known to be decidable for several weak variants in Table 1, namely for FO(=) [15], $\mathrm{FO}^2(<)$ [16], $\Sigma_1(<)$ [4], $\mathcal{B}\Sigma_1(<)$ [4, 16] and $\Sigma_2(<)$ [17]. Hence, we get the following corollary.

Corollary 5. Let $L, L'$ be regular languages. Then the following problems are decidable:
- whether $L$ is FO(=, +1)-separable from $L'$.
- whether $L$ is $\mathrm{FO}^2(<, +1)$-separable from $L'$.
- whether $L$ is $\Sigma_1(<, +1, \min, \max)$-separable from $L'$.
- whether $L$ is $\mathcal{B}\Sigma_1(<, +1, \min, \max)$-separable from $L'$.
- whether $L$ is $\Sigma_2(<, +1)$-separable from $L'$.

Notice that since the membership problem reduces to the separation problem, this also gives a new proof that all these fragments have a decidable membership problem. This is of particular interest for $\mathrm{FO}^2(<, +1)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$, for which the previous proofs, which can be found in, or derived from, [22, 1, 14], [8], and [6, 13, 12] respectively, are known to be quite involved. It turns out that for $\Sigma_2(<, +1)$, we can do even better and entirely avoid separation. Indeed, when $\mathcal{F}$ is expressive enough, Theorem 4 can be used to prove a similar theorem for the membership problem.

Theorem 6. Let $\mathcal{F}$ and $\mathcal{F}^+$ be respectively the weak and strong variants of one of the logical fragments in Table 1. Moreover, assume that for any alphabet of well-formed words, the set of well-formed words over this alphabet is definable in $\mathcal{F}$. Let $L$ be a language recognized by a morphism $\alpha : A^+ \to S$ into a finite semigroup $S$. Let $\mathbb{L} \subseteq \mathbb{A}_\alpha^+$ be the language of well-formed words associated with $L$. Then $L$ is definable in $\mathcal{F}^+$ iff $\mathbb{L}$ is definable in $\mathcal{F}$.

Proof. Set $K = A^+ \setminus L$ and let $\mathbb{K}$ be the associated language of well-formed words. Observe that by definition, $\mathbb{K} \cup \mathbb{L}$ is the set of all well-formed words. If $\mathbb{L}$ is definable in $\mathcal{F}$, then $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{K}$, hence by Theorem 4, $L$ is $\mathcal{F}^+$-separable from $K$, and so $L$ is definable in $\mathcal{F}^+$. Conversely, if $L$ is definable in $\mathcal{F}^+$, then $L$ is $\mathcal{F}^+$-separable from $K$ and by Theorem 4, $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{K}$. Since $\mathbb{K} \cup \mathbb{L}$ is the set of all well-formed words, $\mathbb{L}$ is the intersection of the separator with the set of all well-formed words, which by hypothesis is also definable in $\mathcal{F}$. Therefore, $\mathbb{L}$ is definable in $\mathcal{F}$. □

Observe that being well-formed can be expressed in $\Pi_2(<)$: essentially, a word is well-formed if for all pairs of positions, either there is a third one in-between, or the labels of the two positions are "compatible". Hence, among the fragments of Table 1, Theorem 6 applies to all fragments including and above $\Pi_2(<)$ in the quantifier alternation hierarchy. While such a transfer result was previously known [22, 13], the presentation and the proof are new. In particular, since membership is known to be decidable for $\Sigma_2(<)$ (and, by complementation, for $\Pi_2(<)$) [12], $\mathcal{B}\Sigma_2(<)$ [17] and $\Sigma_3(<)$ [17], we obtain new and simpler proofs of the following results.

Corollary 7. Given a regular language $L$, one can decide whether
- $L$ is definable by a $\Sigma_2(<, +1)$ (resp. by a $\Pi_2(<, +1)$) formula.
- $L$ is definable by a $\mathcal{B}\Sigma_2(<, +1)$ formula.
- $L$ is definable by a $\Sigma_3(<, +1)$ (resp. by a $\Pi_3(<, +1)$) formula.

It remains to prove Theorem 4. We devote the rest of the section to this proof. An important remark is that the proof of the right-to-left direction is constructive: we start with an $\mathcal{F}$ formula that separates $\mathbb{L}$ from $\mathbb{L}'$ and use it to construct an $\mathcal{F}^+$ formula that separates $L$ from $L'$. Note that the argument is generic for all fragments we consider. On the other hand, the other direction, namely Proposition 9 below, requires a specific argument tailored to each fragment, which is a straightforward but tedious Ehrenfeucht-Fraïssé argument. Due to lack of space, we provide proofs of this proposition for each fragment in the long version [19] of this paper.

4.1 From $\mathcal{F}^+$-separation to $\mathcal{F}$-separation

We prove that if $L$ is $\mathcal{F}^+$-separable from $L'$, then $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{L}'$. We actually prove the contrapositive: if $\mathbb{L}$ is not $\mathcal{F}$-separable from $\mathbb{L}'$, then $L$ is not $\mathcal{F}^+$-separable from $L'$. We rely on a construction which, to any well-formed word $\mathbf{u} \in \mathbb{A}_\alpha^+$ and any integer $i > 0$, associates a canonical word $\lceil \mathbf{u} \rceil_i \in A^+$.

Canonical Word Associated to a Well-formed Word. To any $s \in S$, we associate an arbitrarily chosen nonempty word $\lceil s \rceil \in A^+$ such that $\alpha(\lceil s \rceil) = s$ (which is possible since $\alpha$ has been chosen surjective). Let $i > 0$. From a well-formed word $\mathbf{u} \in \mathbb{A}_\alpha^+$, we build a word $\lceil \mathbf{u} \rceil_i \in A^+$ as follows. If $\mathbf{u} = s \in S$, then $\lceil \mathbf{u} \rceil_i = \lceil s \rceil$ for all $i$. Otherwise, we have by definition

$$\mathbf{u} = (s_0, e_1)(e_1, s_1, e_2) \cdots (e_{n-1}, s_{n-1}, e_n)(e_n, s_n).$$

For a natural number $i > 0$, we set

$$\lceil \mathbf{u} \rceil_i = \lceil s_0 \rceil \lceil e_1 \rceil^i \lceil s_1 \rceil \lceil e_2 \rceil^i \cdots \lceil e_{n-1} \rceil^i \lceil s_{n-1} \rceil \lceil e_n \rceil^i \lceil s_n \rceil.$$

Recall that $\beta$ is the morphism $\beta : \mathbb{A}_\alpha^+ \to S$ mapping $\mathbf{u}$ to $s_0 e_1 s_1 \cdots s_{n-1} e_n s_n$. Since $e_j \in E(S)$ for all $j$, it is immediate that $\alpha(\lceil \mathbf{u} \rceil_i) = \beta(\mathbf{u})$, hence we get the following fact:

Fact 8. For every $i > 0$ and every well-formed word $\mathbf{u} \in \mathbb{A}_\alpha^+$, we have $\mathbf{u} \in \mathbb{L}$ (resp. $\mathbf{u} \in \mathbb{L}'$) if and only if $\lceil \mathbf{u} \rceil_i \in L$ (resp. $\lceil \mathbf{u} \rceil_i \in L'$).

We now proceed with the proof. We use the classical preorders associated to fragments of first-order logic. The (quantifier) rank of a first-order formula $\varphi$ is the largest number of quantifiers along a branch in the parse tree of $\varphi$. Given $u, v \in A^+$, we write $u \preceq_k^{+1} v$ if any $\mathcal{F}^+$ formula of rank $k$ that is satisfied by $u$ is satisfied by $v$ as well. Similarly, for $\mathbf{u}, \mathbf{v} \in \mathbb{A}_\alpha^+$, we write $\mathbf{u} \preceq_k \mathbf{v}$ if any $\mathcal{F}$ formula of rank $k$ that is satisfied by $\mathbf{u}$ is satisfied by $\mathbf{v}$ as well. One can verify that $\preceq_k$ and $\preceq_k^{+1}$ are preorders, as well as the following standard fact, referred to as (1):

$$L \subseteq A^+ \text{ is definable by an } \mathcal{F}^+ \text{ formula of rank } k \text{ iff } L = \{u' \mid \exists u \in L \text{ s.t. } u \preceq_k^{+1} u'\},$$
$$\mathbb{L} \subseteq \mathbb{A}_\alpha^+ \text{ is definable by an } \mathcal{F} \text{ formula of rank } k \text{ iff } \mathbb{L} = \{\mathbf{u}' \mid \exists \mathbf{u} \in \mathbb{L} \text{ s.t. } \mathbf{u} \preceq_k \mathbf{u}'\}. \qquad (1)$$

Note that when $\mathcal{F}$ and $\mathcal{F}^+$ are closed under complement, then $\preceq_k$ and $\preceq_k^{+1}$ are actually equivalence relations. We can now state the main proposition of this direction.

Proposition 9. For any $k \in \mathbb{N}$, there exist $\ell \in \mathbb{N}$ and $i \in \mathbb{N}$ such that for any well-formed words $\mathbf{u}, \mathbf{u}' \in \mathbb{A}_\alpha^+$ satisfying $\mathbf{u} \preceq_\ell \mathbf{u}'$, we have $\lceil \mathbf{u} \rceil_i \preceq_k^{+1} \lceil \mathbf{u}' \rceil_i$.
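The canonical word construction of this subsection can be sketched in Python. The sketch is illustrative only and rests on several assumptions that are not part of the paper: well-formed words are lists of tagged tuples `("s", s)`, `("se", s, e)`, `("ese", e, s, f)`, `("es", e, s)`; the dictionary `rep` plays the role of the arbitrary choice $s \mapsto \lceil s \rceil$; and `image` gives the value of the recognizing morphism on each letter of $A$.

```python
def canonical_word(u, i, rep):
    """Build the canonical word of index i from a well-formed word u.

    `rep[s]` is a chosen non-empty word (a string) whose image is s;
    the input `u` is assumed to be well-formed.
    """
    if len(u) == 1:                      # u is a single letter s
        return rep[u[0][1]]
    pieces = [rep[u[0][1]]]              # the word chosen for s_0
    for letter in u[1:]:
        e, s = letter[1], letter[2]      # left idempotent e_j and element s_j
        pieces.append(rep[e] * i)        # the word chosen for e_j, repeated i times
        pieces.append(rep[s])            # the word chosen for s_j
    return "".join(pieces)


def alpha(mul, image, word):
    """Evaluate the morphism on a non-empty word, given the image of each letter."""
    result = image[word[0]]
    for a in word[1:]:
        result = mul[result][image[a]]
    return result
```

With the two-element monoid `mul = [[0, 0], [0, 1]]`, `rep = {0: "a", 1: "b"}` and `image = {"a": 0, "b": 1}`, the well-formed word `[("se", 1, 1), ("ese", 1, 0, 1), ("es", 1, 1)]` yields `"bbbabbb"` for `i = 2`, whose image is `0`, the same value that the morphism on well-formed words assigns to the input, as Fact 8 requires.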
For all fragments of Table 1, Proposition 9 is proved using classical Ehrenfeucht-Fraïssé arguments. While each proof is specific, the underlying ideas are similar. We present these proofs in the long version of this paper [19]. We finish the subsection by explaining how Proposition 9 can be used to complete the proof of the first direction of Theorem 4.

We argue by contrapositive: assume that $\mathbb{L}$ is not $\mathcal{F}$-separable from $\mathbb{L}'$. By definition, this means that no language definable in $\mathcal{F}$ separates $\mathbb{L}$ from $\mathbb{L}'$. In particular, for any $\ell$, the language $\{\mathbf{u}' \mid \exists \mathbf{u} \in \mathbb{L} \text{ s.t. } \mathbf{u} \preceq_\ell \mathbf{u}'\}$, which is definable in $\mathcal{F}$ by (1), cannot be a separator. Note that this language contains $\mathbb{L}$. Hence, for all $\ell \in \mathbb{N}$, there exist $\mathbf{u} \in \mathbb{L}$ and $\mathbf{u}' \in \mathbb{L}'$ such that $\mathbf{u} \preceq_\ell \mathbf{u}'$. We deduce from Proposition 9 and Fact 8 that for all $k \in \mathbb{N}$, there exist $u \in L$ and $u' \in L'$ such that $u \preceq_k^{+1} u'$. It follows, again by (1), that $L$ is not $\mathcal{F}^+$-separable from $L'$, which completes the proof.

4.2 From $\mathcal{F}$-separation to $\mathcal{F}^+$-separation

We now prove that if $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{L}'$, then $L$ is $\mathcal{F}^+$-separable from $L'$. We do so by building an $\mathcal{F}^+$-definable separator. This time, the proof is entirely generic. We rely on a construction that is dual to the one used previously: to any word $w \in A^+$, we associate a canonical well-formed word $\lfloor w \rfloor \in \mathbb{A}_\alpha^+$.

Canonical Well-formed Word Associated to a Word. To any word $w$ of $A^+$, we associate a canonical well-formed word $\lfloor w \rfloor \in \mathbb{A}_\alpha^+$ such that $\alpha(w) = \beta(\lfloor w \rfloor)$. This construction is adapted from [14] and is originally inspired by [22]. Fix an arbitrary order on the set $E(S)$. For a position $x$ of $w$, let $u_x \in A^+$ be the infix of $w$ obtained by keeping only positions $x - (|S| - 1)$ to $x$. If position $x - (|S| - 1)$ does not exist, $u_x$ is just the prefix of $w$ ending at $x$. A position $x$ is said to be distinguished if there exists an idempotent $e \in E(S)$ such that $\alpha(u_x) \cdot e = \alpha(u_x)$. Additionally, we always define the rightmost position as distinguished, even if it does not satisfy the property.

Set $x_1 < \cdots < x_{n+1}$ as the distinguished positions in $w$, so that $x_{n+1}$ is the rightmost position. Let $e_1, \ldots, e_n \in E(S)$ be such that for all $1 \leq i \leq n$, $e_i$ is the smallest idempotent such that $\alpha(u_{x_i}) \cdot e_i = \alpha(u_{x_i})$. If $n = 0$, i.e., if the only distinguished position is the rightmost one, set $\lfloor w \rfloor = \alpha(w) \in \mathbb{A}_\alpha$. Otherwise, we define $\lfloor w \rfloor \in \mathbb{A}_\alpha^+$ as the word:

$$\lfloor w \rfloor = (\alpha(w_0), e_1) \cdot (e_1, \alpha(w_1), e_2) \cdots (e_{n-1}, \alpha(w_{n-1}), e_n) \cdot (e_n, \alpha(w_n)) \qquad (2)$$

where $w_0$ is the prefix of $w$ ending at position $x_1$, for all $1 \leq i \leq n - 1$, $w_i$ is the infix of $w$ obtained by keeping positions $x_i + 1$ to $x_{i+1}$, and $w_n$ is the suffix of $w$ starting at position $x_n + 1$. Note that by construction, $\lfloor w \rfloor$ is well-formed. The next statement follows from the definition of $\beta$, and from the fact that by definition of the words $w_i$ and of the chosen idempotents, we have $\alpha(w_0 \cdots w_i) e_{i+1} = \alpha(w_0 \cdots w_i)$.

Fact 10. For all $w \in A^+$, we have $\alpha(w) = \beta(\lfloor w \rfloor)$. Therefore, $w \in L$ iff $\lfloor w \rfloor \in \mathbb{L}$ and $w \in L'$ iff $\lfloor w \rfloor \in \mathbb{L}'$.

To any distinguished position $x_i$ in $w$, we now associate the position $\lfloor x_i \rfloor = i$ in $\lfloor w \rfloor$. Our main motivation for using this construction is its local canonicity, which is stated in the following lemma.

Lemma 11. Let $w \in A^+$. Then we have the following properties:
(a) Whether a position $x$ is distinguished in $w$, and if so the label of position $\lfloor x \rfloor$ in $\lfloor w \rfloor$, only depends on the infix of $w$ of length $2|S|$ ending at position $x$. That is, if the infixes of length $2|S|$ ending at $x$ and $y$ are equal, then $x$ is distinguished iff so is $y$, and in that case, the labels of $\lfloor x \rfloor$ and $\lfloor y \rfloor$ in $\lfloor w \rfloor$ are equal.
(b) The label of the last position of $\lfloor w \rfloor$ only depends on the suffix of length $2|S|$ of $w$.

Proof. It is immediate that whether $x$ is distinguished, and if so the associated idempotent, only depends on the infix $u_x$ of length at most $|S|$ ending at $x$. Therefore, to prove (a), it suffices to show that all infixes $w_i$ used in (2) are of size at most $|S|$, or in other words, that among $|S| + 1$ consecutive positions, at least one is distinguished.
So let us consider an infix $a_1 \cdots a_{|S|+1}$ of $w$ of length $|S| + 1$. It is immediate from the pigeonhole principle that there exist $i < j$ such that $\alpha(a_1 \cdots a_i) = \alpha(a_1 \cdots a_j) = \alpha(a_1 \cdots a_i) \cdot (\alpha(a_{i+1} \cdots a_j))^\omega$. Hence, the position corresponding to $a_i$ is distinguished. The proof of the second assertion is similar.

$L$ is $F^+$-separable from $L'$. We can now construct our separator. The construction follows from the next proposition.

Proposition 12. Let $K \subseteq A_\alpha^+$ be a language that can be defined by an F formula $\varphi$. Then there exists an $F^+$ formula $\psi$ over alphabet $A$ such that for every word $w \in A^+$: $w \models \psi$ if and only if $\lfloor w \rfloor \models \varphi$.

Proof. Proposition 12 follows from the following simple consequence of Lemma 11.

Claim 13. For any $a \in A_\alpha$, there exists a formula $\gamma_a(x)$ of $F^+$ with a free variable $x$, such that for any $w \in A^+$ and any position $x$ of $w$, we have $w \models \gamma_a(x)$ iff $x$ is distinguished and $\lfloor x \rfloor$ has label $a$ in $\lfloor w \rfloor$.

This claim holds since by Lemma 11, the formula $\gamma_a(x)$ only needs to explore the neighborhood of size $2|S|$ of $x$, which is trivially possible for all fragments $F^+$ we consider. To conclude the proof of Proposition 12, it suffices to define $\psi$ as the formula constructed from $\varphi$ by restricting all quantifiers to distinguished positions and by replacing all tests $P_a(x)$ by $\gamma_a(x)$.

5 Algebraic Approach

We now present an algebraic version of Theorem 4: the operator $V \mapsto V * D$ preserves decidability of separation. We would like to emphasize again that the ideas behind this theorem are essentially the same as for Theorem 4. In particular, the proofs presented in the long version of this paper [19] only rely on elementary notions, thus bypassing the complex constructions usually used to prove this kind of result, even if the statement itself requires some additional algebraic vocabulary.

The section is organized in three parts.
We first briefly recall how the classes of languages corresponding to our logical fragments are given an algebraic definition: for each fragment, an associated class of finite semigroups (or monoids) V, called a variety, has already been characterized, such that the class of languages definable in the fragment is exactly the class of languages that are recognized by a semigroup (or monoid) of V. In the second part, we define what "adding the successor relation" means in this context. Given a variety V, this generally corresponds to considering a new variety built on top of V via an operation called the semidirect product. This new variety is denoted $V * D$. Finally, in the last part, we state our main theorem: for any variety V, separability for the variety $V * D$ reduces to separability for the variety V.

5.1 Varieties

A variety of semigroups (resp. monoids) is a class of finite semigroups (resp. monoids) closed under three natural operations: finite direct product, subsemigroup (or submonoid), and homomorphic image. A variety V defines a class of languages, also denoted V, namely the class of all languages recognized by semigroups (resp. monoids) in V. There is an issue however: all classes of languages defined in this way have to be closed under complement, since the set of languages recognized by any semigroup is closed under complement. This prevents us from capturing logical fragments that are not closed under complement, such as $\Sigma_2(<)$. This problem has been solved in [11] with the notions of ordered semigroups and monoids. Intuitively, such a semigroup is parametrized by a partial order, and the set of languages it recognizes is then restricted with respect to this partial order. These classical constructions will be recalled in the long version of this paper [19], as well as the varieties corresponding to all fragments we deal with. All logical fragments presented in Section 2 correspond to varieties that have been fully identified.
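Returning to the remark that the languages recognized by a given semigroup are closed under complement, the reason can be illustrated with a small sketch. The example is an assumption chosen for illustration and does not come from the paper: the semigroup is $\mathbb{Z}/2\mathbb{Z}$ under addition, and the morphism `alpha` counts occurrences of the letter a modulo 2. A language recognized through an accepting set of elements has its complement recognized by the same semigroup through the complementary accepting set.

```python
from itertools import product

# Toy example (assumption): Z/2Z under addition, seen as a finite semigroup.
S = {0, 1}
ALPHA = {"a": 1, "b": 0}       # alpha counts the letter "a" modulo 2

def alpha(word):
    """Image of a nonempty word under the morphism alpha: A+ -> S."""
    return sum(ALPHA[letter] for letter in word) % 2

def recognized(accepting):
    """Membership test for the language of words whose image is in `accepting`."""
    return lambda word: alpha(word) in accepting

even_a = recognized({0})       # words with an even number of a's
odd_a = recognized(S - {0})    # same semigroup, complementary accepting set

# On every sampled word, exactly one of the two languages contains the word.
words = ["".join(p) for n in range(1, 6) for p in product("ab", repeat=n)]
assert all(even_a(w) != odd_a(w) for w in words)
```

With ordered semigroups, the admissible accepting sets are constrained by the order, so the complementary set is not always admissible. This is the sense in which the ordered setting can capture classes that are not closed under complement.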
For each fragment, its non-enriched variant corresponds to a variety V of (ordered) monoids and its enriched version to the variety of (ordered) semigroups $V * D$ built from V. For example, the fragment $\mathrm{FO}^2(<)$ corresponds to the variety of monoids DA, and the fragment $\mathrm{FO}^2(<, +1)$ to the variety of semigroups $DA * D$ [23] (see the long version [19] for a bibliography with all correspondences).

5.2 Semidirect Product

The Variety D. The variety D consists of all finite ordered semigroups $S$ such that for all $s \in S$ and all $e \in E(S)$, we have $se = e$. From a language perspective, a language $L$ is recognized by a semigroup in D iff there exists $k \in \mathbb{N}$ such that membership of a word $w$ in $L$ only depends on the suffix of length $k$ of $w$.

Semidirect Product. Let $M$ be an ordered monoid and let $T$ be an ordered semigroup. A semidirect product of $M$ and $T$ is an operation that is parametrized by an action of $T$ on $M$ and outputs a new ordered semigroup, whose base set is $M \times T$. Therefore, one can obtain different semidirect products out of the same $M$ and $T$, depending on the chosen action (we recall the construction in the long version [19]). One can next lift this product to the level of varieties. We are interested in the semidirect products of the form $V * D$, the variety of ordered semigroups generated by all semidirect products of an ordered monoid of V by an ordered semigroup of D. The reason why we introduce such semidirect products is the following theorem, which gathers several nontrivial results from the literature. The reader is referred to the long version of this paper [19] for details.

Theorem 14. Let V be a variety corresponding to a fragment F among the ones presented in Table 1. Then, the variety corresponding to the fragment $F^+$ is $V * D$.

5.3 Main Theorem

We now have the machinery needed to state our main theorem. For any variety of ordered monoids V, we reduce $(V * D)$-separability to V-separability.

Theorem 15. Let V be a non-trivial variety of ordered monoids.
Let $L$ and $L'$ be two languages both recognized by the same morphism $\alpha : A^+ \to S$ into a finite semigroup $S$. Set $\mathbb{L}, \mathbb{L}' \subseteq A_\alpha^+$ as the languages of well-formed words associated to $L$, $L'$, respectively. Then, $L$ is $(V * D)$-separable from $L'$ if and only if $\mathbb{L}$ is V-separable from $\mathbb{L}'$.

The proof of Theorem 15 is presented in the full version of this paper [19]. As was the case for Theorem 4, the proof is both elementary and constructive: if there exists a separator for $\mathbb{L}$ and $\mathbb{L}'$ in V, we use it to construct a separator for $L$ and $L'$ in $V * D$. In view of Theorem 14, Theorem 15 applies to all fragments we introduced. This means that Theorem 4 can be given an alternate, indirect proof within this algebraic framework by combining Theorem 15 and Theorem 14. Hence, this also yields another proof of Corollary 5.

6 Conclusion

We proved that separation is decidable over finite words for the following logical fragments: $\mathrm{FO}(=, +1)$, $\Sigma_1(<, +1, \min, \max)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$, $\Sigma_2(<, +1)$ and $\mathrm{FO}^2(<, +1)$. To achieve this, we presented a simple reduction to the same problem for the weaker fragments $\mathrm{FO}(=)$, $\Sigma_1(<)$, $\mathcal{B}\Sigma_1(<)$, $\Sigma_2(<)$ and $\mathrm{FO}^2(<)$. The reduction itself is generic across all fragments, and its proof is elementary and also mostly generic. In particular, the technique can be used to prove that the reduction works for other natural fragments of first-order logic. An interesting example to which these results apply is the quantifier alternation hierarchy within $\mathrm{FO}^2(<)$ (known as the Trotter-Weil hierarchy, which is decidable [25]). However, the separation problem for classes in this hierarchy has yet to be investigated. We also obtained direct proofs that membership is decidable for $\mathcal{B}\Sigma_2(<, +1)$ and $\Sigma_3(<, +1)$.

Finally, we presented an algebraic formulation of this reduction, which recovers a previously known result by Steinberg [21], while having a much simpler proof. One can expect to extend these results to other fragments, such as enrichment with modulo predicates.
Another advantage of this technique is that it extends in a straightforward way to the same logical fragments over words of infinite length. This yields identical transfer results. We leave the presentation of these results for further work.

References

[1] Jorge Almeida. A syntactical proof of locality of DA. International Journal on Algebra and Computation, 6:165-177, 1996.
[2] Jorge Almeida. Some algorithmic problems for pseudovarieties. Publicationes Mathematicae Debrecen, 54:531-552, 1999. Proc. of Automata and Formal Languages, VIII.
[3] International Journal on Algebra and Computation, 20(2):181-188, 2010.
[4] Wojciech Czerwiński, Wim Martens, and Tomáš Masopust. Efficient separability of regular languages by subsequences and suffixes. In Proceedings of the 40th International Colloquium on Automata, Languages, and Programming, ICALP'13, volume 7966 of Lecture Notes in Computer Science, pages 150-161. Springer, 2013.
[5] Volker Diekert and Paul Gastin. First-order definable languages. In Logic and Automata: History and Perspectives, volume 2, pages 261-306. Amsterdam University Press, 2008.
[6] Christian Glaßer and Heinz Schmitz. Languages of dot-depth 3/2. Theory of Computing Systems, 42(2):256-286, 2008.
[7] Karsten Henckell. Pointlike sets: the finest aperiodic cover of a finite semigroup. Journal of Pure and Applied Algebra, 55(1-2):85-126, 1988.
[8] Robert Knast. A semigroup characterization of dot-depth one languages. RAIRO Informatique Théorique et Applications, 17(4):321-330, 1983.
[9] Manfred Kufleitner and Alexander Lauser. Around dot-depth 1. International Journal of Foundations of Computer Science, 23(6):1323-1340, 2012.
[10] Robert McNaughton and Seymour Papert. Counter-Free Automata. MIT Press, 1971.
[11] Jean-Éric Pin. A variety theorem without complementation. Russian Mathematics (Izvestija vuzov. Matematika), 39:80-90, 1995.
[12] Jean-Éric Pin and Pascal Weil. Polynomial closure and unambiguous product. Theory of Computing Systems, 30(4):383-422, 1997.
[13] Communications in Algebra, 30:5677-5713, 2002.
[14] Thomas Place and Luc Segoufin. Decidable characterization of $\mathrm{FO}^2(<, +1)$ and locality of DA. Unpublished, to appear, 2014.
[15] Thomas Place, Lorijn van Rooijen, and Marc Zeitoun. Separating regular languages by locally testable and locally threshold testable languages. In Proceedings of the 34th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS'13, volume 24 of LIPIcs, pages 363-375. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2013.
[16] Thomas Place, Lorijn van Rooijen, and Marc Zeitoun. Separating regular languages by piecewise testable and unambiguous languages. In Proceedings of the 28th MFCS'13, volume 8087 of Lecture Notes in Computer Science, pages 729-740. Springer, 2013.
[17] Thomas Place and Marc Zeitoun. Going higher in the first-order quantifier alternation hierarchy on words. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming, ICALP'14, volume 8573 of Lecture Notes in Computer Science, pages 342-353, 2014. http://arxiv.org/pdf/1404.6832v1.
[18] Thomas Place and Marc Zeitoun. Separating regular languages with first-order logic. In Proceedings of the Joint Meeting of the 23rd EACSL Annual Conference on Computer Science Logic (CSL'14) and the 29th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'14), 2014.
[19] Thomas Place and Marc Zeitoun. A transfer theorem for the separation problem. CoRR, abs/1501.00569, 2015. http://arxiv.org/abs/1501.00569.
[20] Marcel-Paul Schützenberger. On finite monoids having only trivial subgroups. Information and Control, 8:190-194, 1965.
[21] Benjamin Steinberg. A delay theorem for pointlikes. Semigroup Forum, 63(3):281-304, 2001.
[22] Howard Straubing. Finite semigroup varieties of the form $V * D$. Journal of Pure and Applied Algebra, 36:53-94, 1985.
[23] Denis Thérien and Thomas Wilke. Over words, two variables are as powerful as one quantifier alternation. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing, STOC'98, pages 234-240. ACM, 1998.
[24] Wolfgang Thomas. Classifying regular events in symbolic logic. Journal of Computer and System Sciences, 25(3):360-376, 1982.
[25] Manfred Kufleitner and Pascal Weil. On logical hierarchies within $\mathrm{FO}^2$-definable languages. Logical Methods in Computer Science, 8(3), 2012.



Thomas Place, Marc Zeitoun. Separation and the Successor Relation, LIPICS - Leibniz International Proceedings in Informatics, 2015, 662-675, DOI: 10.4230/LIPIcs.STACS.2015.662