Alternation (formal language theory)

In formal language theory and pattern matching, alternation is the union of two sets of strings, or equivalently the logical disjunction of two patterns describing sets of strings.

Regular languages are closed under alternation: the alternation of two regular languages is again regular.[1] In implementations of regular expressions, alternation is often written as a vertical bar between the expressions for the two languages whose union is to be matched,[2][3] while more theoretical treatments may use the plus sign instead.[1] The ability to construct a finite automaton for the union of two regular languages, each itself defined by a finite automaton, is central to the equivalence between regular languages defined by automata and those defined by regular expressions.[4]
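
One standard way to see the closure property is the product construction, which runs two deterministic finite automata in parallel and accepts when either one accepts. The following Python sketch illustrates it under stated assumptions: both automata are complete (every state has a transition on every symbol) and share the same alphabet. The names `DFA`, `union_dfa`, `even_a`, and `ends_b` are hypothetical and chosen only for this illustration; they do not come from the cited sources.

```python
from itertools import product


class DFA:
    """A complete deterministic finite automaton (illustrative helper)."""

    def __init__(self, states, alphabet, delta, start, accepting):
        self.states = set(states)
        self.alphabet = set(alphabet)
        self.delta = dict(delta)        # maps (state, symbol) -> state
        self.start = start
        self.accepting = set(accepting)

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


def union_dfa(a, b):
    """Product construction: the result accepts w iff a or b accepts w.

    Assumes both automata are complete and use the same alphabet.
    """
    if a.alphabet != b.alphabet:
        raise ValueError("both automata must share one alphabet")
    states = set(product(a.states, b.states))
    delta = {
        ((p, q), s): (a.delta[(p, s)], b.delta[(q, s)])
        for (p, q) in states
        for s in a.alphabet
    }
    # A pair is accepting when at least one component is accepting.
    accepting = {(p, q) for (p, q) in states
                 if p in a.accepting or q in b.accepting}
    return DFA(states, a.alphabet, delta, (a.start, b.start), accepting)


# Example language 1: strings over {a, b} with an even number of a's.
even_a = DFA(
    states={0, 1}, alphabet={"a", "b"},
    delta={(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1},
    start=0, accepting={0},
)

# Example language 2: strings over {a, b} that end in b.
ends_b = DFA(
    states={"n", "y"}, alphabet={"a", "b"},
    delta={("n", "a"): "n", ("y", "a"): "n",
           ("n", "b"): "y", ("y", "b"): "y"},
    start="n", accepting={"y"},
)

either = union_dfa(even_a, ends_b)
print(either.accepts("ab"))    # True: odd number of a's, but ends in b
print(either.accepts("abaa"))  # False: three a's and does not end in b
```

The resulting automaton has one state for each pair of states from the two inputs, so its size is the product of their sizes. Textbook proofs may instead use nondeterministic automata with a new start state; the product construction is shown here only because it is short to write down.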

Other classes of languages that are closed under alternation include the context-free languages and the recursive languages. The vertical bar notation for alternation is used in SNOBOL and in some other programming languages. In formal language theory, alternation is commutative and associative, since it is simply set union. This is not in general true of the form of alternation used in pattern-matching languages, because performing a match in those languages can have side effects, so the order in which the alternatives are tried can affect the result.
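
The gap between set-theoretic alternation and alternation in a practical matcher can be observed in Python's `re` module, which tries the alternatives of `|` from left to right and accepts the first one that matches. The sketch below is only an illustration of order sensitivity in one particular engine, not a description of SNOBOL or of side effects in general; the variable names are hypothetical.

```python
import re

text = "ab"

# The two patterns differ only in the order of their alternatives.
first = re.match(r"a|ab", text).group()
second = re.match(r"ab|a", text).group()

print(first)   # a
print(second)  # ab

# As descriptions of sets of whole strings, the two patterns agree:
# each one fully matches exactly the strings "a" and "ab".
samples = ["", "a", "b", "ab", "ba", "abb"]
same_language = all(
    bool(re.fullmatch(r"a|ab", w)) == bool(re.fullmatch(r"ab|a", w))
    for w in samples
)
print(same_language)  # True
```

Both patterns denote the same language, so they are interchangeable in the formal sense, yet a prefix match reports a different substring depending on which alternative is written first.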

  1. Linz, Peter (2006). "Theorem 4.1". An Introduction to Formal Languages and Automata. Jones & Bartlett Learning. pp. 100–101. ISBN 9780763737986.
  2. Fitzgerald, Michael (2012). "Alternation". Introducing Regular Expressions: Unraveling Regular Expressions, Step-by-Step. O'Reilly Media. pp. 43–45. ISBN 9781449338893.
  3. "Alternation with The Vertical Bar". regular-expressions.info. Retrieved 2021-11-11.
  4. Cooper, Keith; Torczon, Linda (2011). Engineering a Compiler (2nd ed.). Elsevier. p. 41. ISBN 9780080916613.