Note: Lecture 14 was a review for the first examination (plus a fire alarm and a subsequent evacuation) and so was not recorded. The first examination happened in Lecture 16.
Consider these two figures. Figure 1 shows the grammar of a simple yet pretty realistic programming language.
The possible token types are shown in the figure as capitalized words, parentheses, and operators. Note that the
actual lexemes for the tokens IF, ELSE, WHILE, and BREAK are all in lower case. BASIC types are bool, int, char, and double.
We now aim to construct a recursive descent parser for this grammar. In order to do that we must first convert the grammar into a form that is suitable for recursive descent parsing. The result is shown in Figure 2.
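As a generic illustration of this kind of conversion (not taken from Figure 2): a left-recursive rule such as expr -> expr + term | term would make a recursive descent function call itself forever without consuming any input, so it is typically rewritten as expr -> term rest, rest -> + term rest | epsilon. The C++ sketch below parses the rewritten rule. The rule itself and all names (Tokens, split, parseTerm, parseRest, parseExpr, accepts) are assumptions made for this example and do not come from parser.cc.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token cursor over whitespace-separated lexemes.
struct Tokens {
    std::vector<std::string> items;
    std::size_t pos = 0;
    bool atEnd() const { return pos >= items.size(); }
    const std::string& peek() const { return items[pos]; }
};

// Split the input on blanks and newlines (no finite automata needed).
Tokens split(const std::string& text) {
    Tokens t;
    std::istringstream in(text);
    std::string lexeme;
    while (in >> lexeme) t.items.push_back(lexeme);
    return t;
}

// term -> any single lexeme other than "+"
// (a placeholder standing in for identifiers and numbers).
bool parseTerm(Tokens& t) {
    if (t.atEnd() || t.peek() == "+") return false;
    ++t.pos;
    return true;
}

// rest -> + term rest | epsilon
bool parseRest(Tokens& t) {
    if (!t.atEnd() && t.peek() == "+") {
        ++t.pos;                              // consume "+"
        return parseTerm(t) && parseRest(t);  // right recursion always consumes input first
    }
    return true;                              // epsilon: nothing to consume
}

// expr -> term rest   (replaces the left-recursive expr -> expr + term | term)
bool parseExpr(Tokens& t) { return parseTerm(t) && parseRest(t); }

// True when the whole input is a valid expr.
bool accepts(const std::string& text) {
    Tokens t = split(text);
    return parseExpr(t) && t.atEnd();
}
```

With these definitions, accepts("a + b + c") succeeds while accepts("a +") fails, because the "+" is not followed by a term. Every recursive call happens only after a token has been consumed, which is what guarantees termination.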
We can then proceed with the development of a recursive descent parser. The result is parser.cc. The input (the program to be parsed) is read from the standard input stream. The output is a printout of the respective parse tree, written to the standard output stream. Note that the parser is simplified as much as possible. In particular, the lexical units in the input must be separated by blank or newline characters (so that the lexical analysis does not need to use finite automata).
Here are a few simple programs that can be successfully parsed by this parser (and as far as I can see are also semantically correct; you tell me).
Simple conditional statement:
{ int a ; if ( a != 1 ) a = 2 ; else a = 3 ; }
Factorial:
{ int N ; int product ; int j ; int output ; product = 1 ; j = 1 ; while ( j <= N ) { product = product * j ; j = j + 1 ; } output = product ; }
Bubblesort:
{ double [ 4 ] values ; int listlen ; bool swapped ; bool firstIter ; double temp ; int i ; listlen = 4 ; values [ 0 ] = 4.0 ; values [ 1 ] = 2.1 ; values [ 2 ] = 3.2 ; values [ 3 ] = 1.3 ; i = 0 ; swapped = false ; firstIter = true ; while ( firstIter == true || swapped != false ) { firstIter = false ; i = 0 ; swapped = false ; while ( i < listlen - 1 ) { if ( values [ i ] > values [ i + 1 ] ) { temp = values [ i ] ; values [ i ] = values [ i + 1 ] ; values [ i + 1 ] = temp ; swapped = true ; } i = i + 1 ; } } }
Syntax stress test (I am actually not completely sure that the program is semantically sound, as I came up with it just to test the parser):
{ int a ; int [ 2 ] b ; char [ 56 ] c ; bool d ; double e ; a = ( 8 * b + ( 32 / c ) - -32 * e ) ; b [ ( 2 * d ) + 3 * -7.0 ] = 1 ; if ( a == ( 12 || d >= 7 * ( 22 || true || false && a > 0 ) ) && a != b && ( c || b ) ) { } else { b [ 1 ] = 12 + 7 ; while ( d > 7 && ( d <= 3 && d < 3 && d <= false ) ) { if ( true && false && a > 9 ) a = 3 - -7 ; } } }
1. Here are some possibly interesting Haskell programs (with types): expressing dates (kind of silly but features a few explicit instantiations of type classes) and binary search trees.
2. Here is a definition of binary search trees in Prolog: bst.pl.
3. State space search in Prolog: the cabbage, goat, and wolf problem; the Knight’s tour problem.