A context-free grammar is LR(1) if it can be parsed by a deterministic shift-reduce parser that reads its input left to right with a single token of lookahead. The parsing table is generated by converting a single-stack-element NPDA (with tail recursion) into a DFA. To compress the LR(1) parsing table, two states can be merged if and only if their shift transitions are the same and merging them would introduce no reduce-reduce conflicts. Merging under this condition does not change the set of inputs the parser accepts, and this is what kyacc does (it requires a few passes over the table to merge states completely).
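The merging rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not kyacc's actual code: the table representation (a dict from state id to a pair of `shifts` and `reduces` dicts) and the names `can_merge` and `compress` are hypothetical, and goto transitions on nonterminals are assumed to be stored alongside the shift transitions. The outer loop repeats because merging two states redirects the transitions that pointed at them, which can make further pairs of states identical; this is presumably why several passes are needed.

```python
def can_merge(a, b):
    """a and b are (shifts, reduces) pairs.

    shifts maps a symbol to a target state id; reduces maps a lookahead
    token to a production id (hypothetical representation).
    """
    shifts_a, reduces_a = a
    shifts_b, reduces_b = b
    if shifts_a != shifts_b:
        return False
    # A shared lookahead must select the same production in both states,
    # otherwise the merged state would have a reduce-reduce conflict.
    shared = reduces_a.keys() & reduces_b.keys()
    return all(reduces_a[t] == reduces_b[t] for t in shared)


def compress(table):
    """Merge compatible states until a full pass finds nothing to merge.

    Returns the compressed table and a map from old to surviving state ids.
    """
    table = {s: (dict(sh), dict(rd)) for s, (sh, rd) in table.items()}
    rename = {s: s for s in table}
    changed = True
    while changed:
        changed = False
        ids = sorted(table)
        for i, a in enumerate(ids):
            if a not in table:
                continue
            for b in ids[i + 1:]:
                if b not in table or not can_merge(table[a], table[b]):
                    continue
                table[a][1].update(table[b][1])  # union of the reduce actions
                del table[b]
                for s in rename:
                    if rename[s] == b:
                        rename[s] = a
                # Redirect every transition that pointed at b to a.
                for shifts, _ in table.values():
                    for sym, target in shifts.items():
                        if target == b:
                            shifts[sym] = a
                changed = True
    return table, rename


if __name__ == "__main__":
    # States 2 and 3 merge in the first pass; only then do 0 and 1 match.
    demo = {
        0: ({"-": 2}, {}),
        1: ({"-": 3}, {}),
        2: ({}, {"b": 7}),
        3: ({}, {"B": 7}),
    }
    print(compress(demo))
```

In the demo table, states 0 and 1 cannot be merged at first because their transitions on `-` lead to different states; once 2 and 3 have been merged, a second pass merges 0 and 1 as well.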
Some sources claim that the best way to compress an LR(1) table is to remove the information about the single stack element from each state (making the number of possible states much smaller) and hope that the resulting parser has no reduce-reduce conflicts. The class of grammars that can be parsed by a parser generated using this hack is called LALR(1).
Unfortunately, an LALR(1) parser generator cannot generate a parser for the following unambiguous LR(1) grammar:
    %start z

    z:
        empty
        cons    z G '\n'
    G:
        axb     'a' x 'b'
        ayB     'a' y 'B'
        AxB     'A' x 'B'
        Ayb     'A' y 'b'
    x:
        empty
        cons    '-' y
    y:
        empty
        cons    '-' x

For example, LALR(1) implementations of Yacc will report reduce-reduce conflicts even though the grammar is unambiguous, and the resulting parser will accept ab but not a-b.
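The conflict can be reproduced with a small Python sketch that builds the canonical LR(1) item sets for this grammar and then merges item sets with the same core, which is the textbook way of describing LALR(1). It is not kyacc's or Yacc's implementation. It assumes that the first word of each alternative in the grammar (empty, cons, axb, ayB, AxB, Ayb) is a label rather than a grammar symbol, so that x and y each derive either nothing or '-' followed by the other; all function names are hypothetical.

```python
from collections import defaultdict

# Productions as (head, body); production 0 is the augmented start rule.
GRAMMAR = [
    ("S'", ("z",)),
    ("z", ()), ("z", ("z", "G", "\n")),
    ("G", ("a", "x", "b")), ("G", ("a", "y", "B")),
    ("G", ("A", "x", "B")), ("G", ("A", "y", "b")),
    ("x", ()), ("x", ("-", "y")),
    ("y", ()), ("y", ("-", "x")),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
END = "$"


def first_of(symbols, first):
    """FIRST of a symbol string; "" in the result means it can be empty."""
    out = set()
    for s in symbols:
        if s not in NONTERMINALS:
            out.add(s)
            return out
        out |= first[s] - {""}
        if "" not in first[s]:
            return out
    out.add("")
    return out


def first_sets():
    """Fixed-point computation of FIRST for every nonterminal."""
    first = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            before = len(first[head])
            first[head] |= first_of(body, first)
            changed |= len(first[head]) != before
    return first


def closure(items, first):
    """LR(1) closure; an item is (production index, dot, lookahead)."""
    items = set(items)
    work = list(items)
    while work:
        p, dot, la = work.pop()
        body = GRAMMAR[p][1]
        if dot == len(body) or body[dot] not in NONTERMINALS:
            continue
        lookaheads = first_of(body[dot + 1:] + (la,), first)
        for q, (head, _) in enumerate(GRAMMAR):
            if head == body[dot]:
                for b in lookaheads:
                    if (q, 0, b) not in items:
                        items.add((q, 0, b))
                        work.append((q, 0, b))
    return frozenset(items)


def lr1_states():
    """All canonical LR(1) item sets reachable from the start item."""
    first = first_sets()
    start = closure({(0, 0, END)}, first)
    states, work = {start}, [start]
    while work:
        state = work.pop()
        for sym in {GRAMMAR[p][1][d] for p, d, _ in state if d < len(GRAMMAR[p][1])}:
            moved = {(p, d + 1, la) for p, d, la in state
                     if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == sym}
            nxt = closure(moved, first)
            if nxt not in states:
                states.add(nxt)
                work.append(nxt)
    return states


def merge_by_core(states):
    """LALR(1)-style merge: union the item sets that differ only in lookaheads."""
    merged = defaultdict(set)
    for state in states:
        merged[frozenset((p, d) for p, d, _ in state)] |= state
    return list(merged.values())


def rr_conflicts(item_sets):
    """Number of item sets where two completed productions share a lookahead."""
    count = 0
    for items in item_sets:
        by_lookahead = defaultdict(set)
        for p, d, la in items:
            if d == len(GRAMMAR[p][1]):
                by_lookahead[la].add(p)
        count += any(len(ps) > 1 for ps in by_lookahead.values())
    return count


if __name__ == "__main__":
    states = lr1_states()
    merged = merge_by_core(states)
    print("LR(1):  ", len(states), "item sets,", rr_conflicts(states), "with reduce-reduce conflicts")
    print("by core:", len(merged), "item sets,", rr_conflicts(merged), "with reduce-reduce conflicts")
```

The canonical item sets contain no reduce-reduce conflict. The item set reached after a single '-' and the one reached after two have the same core but opposite lookaheads for the empty x and y productions, so merging them leaves both reductions possible on b and on B.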
Because kyacc merges states only when doing so does not change the set of inputs the parser accepts, it can generate parsers for more grammars than an LALR(1) parser generator can. Because it merges all states that can be merged, its tables are compact, despite the claim by some authors that LALR(1) tables are more compact than LR(1) tables.
The only disadvantage seems to be that it takes longer to generate the table for complex grammars.
A. Appel, Modern Compiler Implementation in Java
$Id: lr.html,v 1.1 2001/01/08 03:25:31 kp Exp $