yacc's input typically consists of:
- a grammar -- a set of rules describing the expected syntactic structure of the input to the parser
- actions -- some C code to be invoked when a rule is recognized
- auxiliary declarations and subroutines
A yacc-generated parser calls a low-level input scanner, called a lexical analyzer. This routine reads the input stream and separates it into items called ``tokens''. The sequence of tokens that the parser receives from the lexical analyzer is compared against the grammar rules. When a rule is recognized, an action (code that the user has supplied for this rule) is executed. Actions can return values, use values returned by previous actions, and carry out any other of the operations possible in C.
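To make the lexical analyzer's job concrete, here is a minimal sketch in C of a routine that splits a string into tokens. It is not a real yacc lexer: the names next_token, token_value, TOK_MONTH, TOK_NUMBER, and TOK_UNKNOWN are hypothetical, and the token codes are made up for illustration. In a real specification, yacc assigns the token codes and the routine is named yylex.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; yacc would normally assign these itself. */
enum { TOK_END = 0, TOK_MONTH = 258, TOK_NUMBER = 259, TOK_UNKNOWN = 260 };

/* Value attached to the most recent token (plays the role of yacc's yylval). */
static int token_value;

static const char *month_names[12] = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"
};

/* Read one token starting at *src, advance *src past it, and return its code.
   Any other single character, such as ',', is returned as its own character code. */
static int next_token(const char **src)
{
    const char *p = *src;

    while (isspace((unsigned char)*p))      /* skip blanks between tokens */
        p++;

    if (*p == '\0') {                       /* end of input */
        *src = p;
        return TOK_END;
    }

    if (isdigit((unsigned char)*p)) {       /* a run of digits is a number */
        char *end;
        token_value = (int)strtol(p, &end, 10);
        *src = end;
        return TOK_NUMBER;
    }

    if (isalpha((unsigned char)*p)) {       /* a run of letters may be a month name */
        size_t len = 0;
        while (isalpha((unsigned char)p[len]))
            len++;
        *src = p + len;
        for (int m = 0; m < 12; m++) {
            if (strlen(month_names[m]) == len &&
                strncmp(p, month_names[m], len) == 0) {
                token_value = m + 1;        /* months are numbered 1..12 */
                return TOK_MONTH;
            }
        }
        return TOK_UNKNOWN;                 /* a word that is not a month name */
    }

    *src = p + 1;                           /* literal character token */
    return (unsigned char)*p;
}
```

Calling next_token repeatedly on the string "April 16, 1961" yields a month token with value 4, a number token with value 16, the literal ',', a number token with value 1961, and finally the end marker. That sequence is what the parser compares against its grammar rules.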
The nucleus of the yacc specification is the collection of grammar rules. Each rule describes a construct and gives it a name. For example, the following rule defines a symbol ``date'' in terms of other symbols ``month'', ``day'', and ``year'':
date : month day ',' year ;
Input such as the following would be matched by this rule:
April 16, 1961
The symbols to the right of the colon are either declared as tokens, defined by other rules in the specification, or literals such as the comma in the rule above. In the example, the comma is enclosed in single quotes, indicating that the comma is to appear literally in the input. The colon and semicolon serve as punctuation in the rule and have no significance in evaluating the input.
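The following sketch shows how the date rule might sit inside a complete specification, with all three parts present: declarations, grammar rules with actions, and auxiliary subroutines. It is an illustration under stated assumptions, not the specification the text has in mind. The token names MONTH, NUMBER, and BADWORD, the supporting rules for month, day, and year, and the hand-written yylex are all assumed here; semantic values use the default type, int.

```yacc
%{
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int yylex(void);
void yyerror(const char *msg);
%}

/* Assumed token names; BADWORD appears in no rule, so it forces a syntax error. */
%token MONTH NUMBER BADWORD

%%

date  : month day ',' year
            { printf("month=%d day=%d year=%d\n", $1, $2, $4); }
      ;
month : MONTH  { $$ = $1; } ;   /* pass the month number up to date */
day   : NUMBER ;                /* default action: $$ = $1 */
year  : NUMBER ;

%%

static const char *names[12] = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"
};

/* Hand-written lexical analyzer: returns one token per call, 0 at end of input. */
int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;                                   /* skip white space */
    if (c == EOF)
        return 0;
    if (isdigit(c)) {                       /* number: value goes in yylval */
        yylval = 0;
        do {
            yylval = yylval * 10 + (c - '0');
            c = getchar();
        } while (c != EOF && isdigit(c));
        if (c != EOF)
            ungetc(c, stdin);
        return NUMBER;
    }
    if (isalpha(c)) {                       /* word: look it up as a month name */
        char word[16];
        size_t n = 0;
        do {
            if (n < sizeof word - 1)
                word[n++] = (char)c;
            c = getchar();
        } while (c != EOF && isalpha(c));
        word[n] = '\0';
        if (c != EOF)
            ungetc(c, stdin);
        for (int m = 0; m < 12; m++)
            if (strcmp(word, names[m]) == 0) {
                yylval = m + 1;
                return MONTH;
            }
        return BADWORD;
    }
    return c;                               /* literal such as ',' */
}

void yyerror(const char *msg) { fprintf(stderr, "%s\n", msg); }

int main(void) { return yyparse(); }
```

Assuming the file is saved as date.y (a hypothetical name), it can be built with `yacc date.y` followed by `cc y.tab.c -o date`. Given the input April 16, 1961 on standard input, the action attached to the date rule runs once the rule is recognized and reports the three values collected from the earlier rules.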