Using kext, generated token interfaces, lexer interfaces, and parser interfaces are extended to produce new interfaces with extended ParseTypes. Additionally, user code can extend lexer expression methods and parser reduction methods. Declarations and code can also be added to the body of the new interface and implementation.
    %source MyLang.t [MyLang.l] [MyLang.y]
    %import MyLangTok [MyLangLex] [MyLangParse]
where [] indicates optional arguments. The %source line names the specifications of the interfaces to be extended, and the %import line gives the names of those interfaces. The output interface is always a token interface, and if a lexer interface or parser interface was %imported as a second argument, then the output interface will also be the same kind of interface. This allows tokens to be extended either alone in a token interface, or if desired in a lexer interface or parser interface.
As with any lexer or parser, an extended lexer or parser is compatible with another lexer or parser if it imports the same token interface, even if the token interface is itself a lexer interface or parser interface.
    parseType: {val: INTEGER; t: TEXT}
means parseType is extended with the fields val and t added to the object type. If parseType is not a Token then the fields can have initializations (current implementation restriction).
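As a rough illustration of what such an extension amounts to, an extfile line such as expr: {val: INTEGER := 0; t: TEXT := ""} can be pictured as declaring an object subtype with the added fields. The sketch below is plain Modula-3, not kext output. The names Sketch, Base, and expr are hypothetical, and the declaration that kext actually generates may look different.

```modula3
INTERFACE Sketch;

(* Sketch only. "Base" stands in for whatever type the generated code
   really extends; it is an assumption, not taken from kext output. *)
TYPE
  Base = OBJECT END;

  (* "expr" is a hypothetical non-token parse type, so its added fields
     may carry initial values. *)
  expr = Base OBJECT
    val: INTEGER := 0;
    t: TEXT := "";
  END;

END Sketch.
```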
    exprName {RETURN NEW(STRING, val := 15, t := $)}
    exprName {$R STRING{$$ := 15; $$.t := $}}
both mean that the expression method named exprName is overridden to return a new STRING token whose val field is initialized to 15, and whose t field is initialized to the TEXT of the matched token. The second version might not call NEW if previously allocated tokens of type STRING have been discard()ed but not detach()ed (see below).
As in other lexes, the type of the result can depend upon the matched text, so long as the result is a Token. This means, for example, that the returned type need not be the same as in the default method.
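A minimal sketch of a result type that depends on the matched text, written in the RETURN NEW form shown earlier. The names word, ID, and KEYWORD are hypothetical. The sketch assumes that both token types have already been extended with a t field, that the Text interface is imported into the generated module (for instance through a %module block), and that comments and line breaks inside the braces are passed through unchanged.

```modula3
word {
  (* Hypothetical names: word, ID, KEYWORD. Assumes Text is imported. *)
  IF Text.Equal($, "begin") THEN
    RETURN NEW(KEYWORD, t := $)
  ELSE
    RETURN NEW(ID, t := $)
  END
}
```

Both branches return a Token, which is the only constraint the text places on the result.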
    ruleName {$$ := $1 + $2}
assumes that the last line of the form parseType: {...} refers to the return type of a rule declared as ruleName. The meaning is that the reduction method (named ruleName_returnType) is overridden. "$1" and "$2" refer respectively to the first and second nonconstant symbols appearing in the reduction method declaration. It is assumed that the types of both of these symbols and the return symbol have a field named val, which in this case must be of type INTEGER. The line could have been written equivalently as
    ruleName {$$.val := $1.val + $2.val}
To refer to the ParseType itself, "$x" is written "$x.detach()", and "$$" is written "result". For example, consider a rule which reduces its only nonconstant symbol to its own type (imagine removing balanced parentheses). The following two lines in an extfile
    ruleName {$$ := $1}
    ruleName {result := $1.detach()}
do not quite mean the same thing. Even if the only field in returnType is val, the generated interface may be further extended, so that in the second version copying the ParseType copies more fields than just val. Additionally, if ruleName is extended then the extended method will see result = $1.detach().
Detaching ParseTypes gives a simple way to construct parse trees whose nodes are the ParseTypes themselves. For an example, see CalcParseTree.e.
    %place { (* your Modula-3 code here *) }
For a token interface, %place can be either %interface or %module. For a lexer interface or parser interface, %place can also be %public, %private, or %overrides. The code is inserted into the generated interface as follows:
    INTERFACE MyLangThingSpecial;
    IMPORT ...;
    %interface
    TYPE
      T <: Public;
      Public = MyLangThing.T OBJECT
        %public
      END;
    ...
and code is inserted into the generated implementation as follows:
    MODULE MyLangThingSpecial;
    IMPORT ...;
    %module
    REVEAL
      T = Public BRANDED OBJECT
        %private
        ...
      OVERRIDES
        %overrides
        ...
      END;
    ...
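The pieces described above can be combined into one extfile, sketched here for a parser interface. Only the forms come from the text. The rule names add and paren, the type name expr, and the inserted declarations are hypothetical, and the sketch assumes that MyLang.y declares both rules with return type expr.

```modula3
%source MyLang.t MyLang.y
%import MyLangTok MyLangParse

%interface { (* hypothetical declaration, placed after the IMPORT line *) CONST Version = 1; }
%public { (* hypothetical field added to Public *) count: INTEGER := 0; }

expr: {val: INTEGER}
add {$$ := $1 + $2}
paren {$$ := $1}
```

Because a parser interface is named as the second %import argument, the output would itself be a parser interface, so %public is allowed here in addition to %interface and %module.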
$Id: kext.html,v 1.2 2001/01/08 06:53:22 kp Exp $