The SP language


The Simple Programming language is a toy language that is easy to parse and easy to interpret, yet powerful enough to solve problems from number theory. Its purpose is to demonstrate the formal definition of languages using the Backus-Naur notation.

The language

SP has four statements:

READ var Read an int from the user, and store it in var
WRITE expr Print the value of expr
LET var = expr Set value of var to expr
GOTO label [IF expr] Jump to the specified line. If expr is present, the jump is conditional on expr ≧ 0

Here's the complete BNF for SP:

<program> ::= <progline> | <program> <progline>
<progline> ::= <number> : <stmt> ; | <stmt> ;
<stmt> ::= READ <var_id> |
           WRITE <expr> |
	   LET <var_id> = <expr> |
	   GOTO <number> |
	   GOTO <number> IF <expr>
<expr> ::= <term> | <term> + <term> | <term> - <term>
<term> ::= <factor> | <factor> * <factor> | <factor> / <factor>
<factor> ::= ( <expr> ) | <number> | <var_id>
<var_id> ::= LETTER | LETTER DIGIT
<number> ::= DIGIT | <number> DIGIT

There are 33 variables available for programs, named X, Y, Z, X0, …, X9, Y0, …, Y9, Z0, …, Z9. The type of variables and expressions is uniformly signed integer.

Sample SP program

The following program calculates the greatest common denominator of two nonnegative numbers X and Y, using a somewhat naïve approach. You can download it to test your own parser.


100:	GOTO 200 IF Y - X;
	LET X = X - Y;
	GOTO 100;

200:	GOTO 300 IF X - Y;
	LET Y = Y - X;
	GOTO 100;

300:	WRITE X;

Implementing SP

Python interpreter

This Python program parses and interprets SP programs. Parsing is done via the Yappy parser generator, and the program is internally represented as a list of statement objects. Expressions are represented as binary trees, taking advantage of the fact that the grammar definition of SP allows for no non-binary operations.

Yappy provides a very nice interface for writing parsers. Note how cleanly semantic actions are described in the grammar table. The single fault I can find with Yappy is that both its lexer and its parser generator require the usage of strings for id's. Allowing arbitrary identities would make for even nicer code.

Python compiler

This compiler, also written in Python and obviously based on the above code, compiles SP programs into IA-32 assembly. Expressions are compiled into assembly instructions putting results on the stack. IO is implemented via libc's printf and scanf functions.

Obviously, because of the very nature of the SP language, no fancy type calculus or memory layouting is done. It does, however, demonstrate how there's nothing magic about writing a compiler.

Yes, the implementation (unified with the interpreter's code) needs to be refactored into multidispatch methods.