parserlib: Parser Generics for Modula-3

Several programs often need to parse the same language. Traditionally, each program has its own generated parser, which makes it hard for the programs to share the parser or extensions to it. It also invites minute variations in the accepted language that can make the programs incompatible.

In parserlib, the traditional lexer/parser abstraction is the starting point. However, rather than importing each other directly, the lexer and parser know only about a shared token interface, which is what makes them compatible. No user code is inserted into the generated lexers and parsers; they only check the grammar of the input. Additional functionality is added by extending each lexer or parser using ext. Each extended parser or lexer can itself be extended further, allowing highly modular hierarchies of parsers and lexers.
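The decoupling described above can be sketched in a few lines. This is only an illustration in Python, not parserlib's generated Modula-3 code: the names Token, Lexer, SumParser, EvalParser and on_num are hypothetical, and the toy grammar (numbers separated by "+") is an assumption made for the example. The base parser only checks the grammar; the subclass adds behaviour without touching it.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Token:
    """Shared token interface: the only thing lexer and parser both know."""
    kind: str  # "NUM", "PLUS" or "ERROR"
    text: str


class Lexer:
    """Turns text into tokens; it never imports or names a parser."""

    def __init__(self, source: str) -> None:
        self.source = source

    def tokens(self) -> Iterator[Token]:
        for word in self.source.split():
            if word.isdigit():
                yield Token("NUM", word)
            elif word == "+":
                yield Token("PLUS", word)
            else:
                yield Token("ERROR", word)


class SumParser:
    """Checks the grammar  sum ::= NUM (PLUS NUM)*  and nothing more."""

    def parse(self, tokens: Iterable[Token]) -> bool:
        expect = "NUM"
        for tok in tokens:
            if tok.kind != expect:
                return False
            if tok.kind == "NUM":
                self.on_num(tok)  # extension hook, a no-op in the base class
            expect = "PLUS" if expect == "NUM" else "NUM"
        # Valid only if the input was non-empty and ended on a number.
        return expect == "PLUS"

    def on_num(self, tok: Token) -> None:
        pass


class EvalParser(SumParser):
    """Extension: same grammar check, plus a running total."""

    def __init__(self) -> None:
        self.total = 0

    def on_num(self, tok: Token) -> None:
        self.total += int(tok.text)
```

Any parser that accepts Token values works with this lexer, and EvalParser could itself be subclassed again, which mirrors the layered extensions described above.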

The lexers generated by parserlib take their input from ordinary Modula-3 Rd.Ts and TEXTs, so text generated on the fly can be parsed. The generated parsers can also be instructed not to exhaust their input. This allows a program to parse a language embedded in another language without an extra pass to match delimiters.
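As a rough illustration of a parser that does not exhaust its input, the Python sketch below uses io.StringIO as a stand-in for a reader such as Rd.T. The function parse_number is hypothetical and is not part of parserlib. It consumes only the digits that belong to the embedded "language" and leaves the rest of the stream for the host program to read.

```python
import io


def parse_number(rd: io.StringIO) -> int:
    """Read a decimal number from rd and stop at the first non-digit.

    The character that ends the number is pushed back by seeking to the
    position recorded before it was read, so the caller sees it next.
    """
    digits = []
    while True:
        pos = rd.tell()
        ch = rd.read(1)          # "" at end of input
        if not ch.isdigit():     # "".isdigit() is False, so EOF also stops here
            rd.seek(pos)         # un-read the lookahead character
            break
        digits.append(ch)
    if not digits:
        raise ValueError("expected a digit")
    return int("".join(digits))
```

For example, given a stream holding "120;rest", the call returns 120 and a following read on the same stream yields ";rest", so no separate pass is needed to find where the embedded text ends.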

Finally, the generated lexers and parsers are OBJECTs and do not rely on global variables, so many streams can be parsed concurrently by a single process.
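The benefit of keeping all state inside the object can be shown with a small Python sketch. WordCounter and count_concurrently are hypothetical names standing in for a generated lexer and its driver, and Python threads stand in for whatever concurrency the host program uses. Because each instance owns its input and its counter, several of them can run at once without interfering.

```python
import threading
from typing import List


class WordCounter:
    """Stand-in for a lexer object: all state lives in the instance."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.count = 0

    def run(self) -> None:
        for _ in self.text.split():
            self.count += 1


def count_concurrently(texts: List[str]) -> List[int]:
    """Process every text in its own thread, one object per stream."""
    counters = [WordCounter(t) for t in texts]
    threads = [threading.Thread(target=c.run) for c in counters]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [c.count for c in counters]
```

Had the counter been a module-level variable, the threads would have shared it and the per-stream results would be lost.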

using parserlib

m3build support

reading from standard input

calculator example

package installation

The following items are required in order to build parserlib:
parserlib: directory containing the sources.
cit_util: contains TextReader.i3 and RTBrand.i3.
term: terminal I/O, used only in debugging klex itself.
m3overrides: this file must be in the same directory as parserlib. It must specify the locations of the term and cit_util libraries, and must disable shared library generation if programs are not shipped.

The Makefile in the parserlib directory builds the following constituent programs:

ktok generates token interface and implementation.
klex generates lexer interface and implementation.
kyacc generates parser interface and implementation.
kext generates interfaces and implementations that extend generated token, lexer, and parser interfaces and implementations.

Finally, a library called parserlib (in parserlib/parserlib) contains the SeekRd interface, which is imported by generated lexers, and the m3build template, which finds and runs the built constituent programs.

acknowledgements

Special thanks to Mika Nyström for the basic idea.

    - Karl Papadantonakis



$Id: index.html,v 1.6 2001/02/27 00:57:57 kp Exp $