Canonical LR parser

A canonical LR parser (also called a LR(1) parser) is a type of bottom-up parsing algorithm used in computer science to analyze and process programming languages. It is based on the LR parsing technique, which stands for "left-to-right, rightmost derivation in reverse."

Formally, a canonical LR parser is an LR(k) parser for k=1, i.e. with a single lookahead terminal. The special attribute of this parser is that any LR(k) grammar with k>1 can be transformed into an LR(1) grammar.[1] However, back-substitutions are required to reduce k and as back-substitutions increase, the grammar can quickly become large, repetitive and hard to understand. LR(k) can handle all deterministic context-free languages.[1] In the past this LR(k) parser has been avoided because of its huge memory requirements in favor of less powerful alternatives such as the LALR and the LL(1) parser. Recently, however, a "minimal LR(1) parser" whose space requirements are close to LALR parsers[citation needed], is being offered by several parser generators.

Like most parsers, the LR(1) parser is automatically generated by compiler-compilers like GNU Bison, MSTA, Menhir,[2] HYACC,[3] and LRSTAR.[4]

  1. ^ a b Knuth, D. E. (July 1965). "On the translation of languages from left to right". Information and Control. 8 (6): 607–639. doi:10.1016/S0019-9958(65)90426-2.
  2. ^ "What is Menhir?". INRIA, CRISTAL project. Retrieved 29 June 2012.
  3. ^ "HYACC, minimal LR(1) parser generator".
  4. ^ "LRSTAR parser generator".