Thompson's construction

In computer science, Thompson's construction algorithm, also called the McNaughton–Yamada–Thompson algorithm,[1] is a method of transforming a regular expression into an equivalent nondeterministic finite automaton (NFA).[2] This NFA can be used to match strings against the regular expression. This algorithm is credited to Ken Thompson.

Regular expressions and nondeterministic finite automata are two representations of formal languages. For instance, text processing utilities use regular expressions to describe advanced search patterns, but NFAs are better suited for execution on a computer. Hence, this algorithm is of practical interest, since it can compile regular expressions into NFAs. From a theoretical point of view, this algorithm is a part of the proof that they both accept exactly the same languages, that is, the regular languages.

An NFA can be made deterministic by the powerset construction and then be minimized to get an optimal automaton corresponding to the given regular expression. However, an NFA may also be interpreted directly.

To decide whether two given regular expressions describe the same language, each can be converted into an equivalent minimal deterministic finite automaton via Thompson's construction, powerset construction, and DFA minimization. If, and only if, the resulting automata agree up to renaming of states, the regular expressions' languages agree.

  1. ^ Alfred Vaino Aho; Monica S. Lam; Ravi Sethi; Jeffrey D. Ullman (2007). "3.7.4 Construction of an NFA from a Regular Expression" (print). Compilers : Principles, Techniques, & Tools (2nd ed.). Boston, MA, USA: Pearson Addison-Wesley. p. 159–163. ISBN 9780321486813.
  2. ^ Louden, Kenneth C. (1997). "2.4.1 From a Regular Expression to an NFA" (print). Compiler construction : Principles and Practice (3rd ed.). 20 Park Plaza Boston, MA 02116-4324, US: PWS Publishing Company. pp. 64–69. ISBN 978-0-534-93972-4.{{cite book}}: CS1 maint: location (link)