Chomsky normal form

In formal language theory, a context-free grammar, G, is said to be in Chomsky normal form (first described by Noam Chomsky)[1] if all of its production rules are of the form:[2][3]

ABC,   or
Aa,   or
S → ε,

where A, B, and C are nonterminal symbols, the letter a is a terminal symbol (a symbol that represents a constant value), S is the start symbol, and ε denotes the empty string. Also, neither B nor C may be the start symbol, and the third production rule can only appear if ε is in L(G), the language produced by the context-free grammar G.[4]: 92–93, 106 

Every grammar in Chomsky normal form is context-free, and conversely, every context-free grammar can be transformed into an equivalent one[note 1] which is in Chomsky normal form and has a size no larger than the square of the original grammar's size.

  1. ^ Chomsky, Noam (1959). "On Certain Formal Properties of Grammars". Information and Control. 2 (2): 137–167. doi:10.1016/S0019-9958(59)90362-6. Here: Sect.6, p.152ff.
  2. ^ D'Antoni, Loris. "Page 7, Lecture 9: Bottom-up Parsing Algorithms" (PDF). CS536-S21 Intro to Programming Languages and Compilers. University of Wisconsin-Madison. Archived (PDF) from the original on 2021-07-19.
  3. ^ Sipser, Michael (2006). Introduction to the theory of computation (2nd ed.). Boston: Thomson Course Technology. Definition 2.8. ISBN 0-534-95097-3. OCLC 58544333.
  4. ^ Hopcroft, John E.; Ullman, Jeffrey D. (1979). Introduction to Automata Theory, Languages and Computation. Reading, Massachusetts: Addison-Wesley Publishing. ISBN 978-0-201-02988-8.


Cite error: There are <ref group=note> tags on this page, but the references will not show without a {{reflist|group=note}} template (see the help page).