Archive for August, 2007

Bootstrapping Lisp – the canonical evaluator, eval

Sunday, August 26th, 2007

With this latest series of builds, I went straight into implementing the classic canonical Lisp evaluator, but without label or atom. The coded evaluator also knows about read, print, and eval. A stop function was also added for a clean exit back to the OS. The canonical evaluator does not know about closures.

The atom function was not implemented because it can be handled with eq, although that introduces an implementation detail.

The bootstrap Lisp system implements a REPL. Because the standard input is used, a test script can be run by using file redirection. File redirection can also be used to store the results.

Because the evaluator, eval, can be invoked from the REPL, the label “function” has not been implemented. Lambda functions can be bound to names in the environment argument of eval.

The primary differences are that f does not evaluate to nil, and the cond function tests against true (t) rather than false (as in Scheme) or nil (as in Common Lisp).

I made my task harder by having a copy collector, and constraining the rest of the code to using only 3 registers. With a copy collector, you can’t put a node address on the machine stack unless you’re absolutely sure you won’t invoke the garbage collector. If you want to leave the node addresses on the machine stack while collecting, you need to detect or mark the nodes as immovable and update the nodes in-place. That complicates the collector.

Since the collector is written in assembly language, is working, and is a critical part of the Lisp system, it made little sense at this point to upgrade it. There was more work to be done. It has been known as early as the days of Newell, Simon, and Shaw, that stacks can be implemented as lists. So the eval stack has been implemented in heap space.

At this point, there are two mainstream directions to take Lisp: towards Common Lisp, or towards Scheme. One can also resurrect old versions of Lisp, or create new versions.

Bootstrapping Lisp – new list reader

Wednesday, August 15th, 2007

The new PRINTD is enough like the usual Lisp PRINT function, that I’ve renamed it to PRINT.

A new READ function has been added to read the basic Lisp notation (which includes the single dot notation for improper lists.) The lexer breaks down the input stream into individual symbols called tokens. After a few tries, the lexer finally got reduced to leaving a lookahead character for multicharacter tokens, and no lookahead character for single character tokens. This was to mimic the input behavior illustrated in the book Lisp (1st ed.), by Winston & Horn, where typing the final right parenthesis apparently ended the reader and invoked the evaluator. The grammar for basic Lisp notation became

<expr> ::= <atom> | “(” “)” | “(” <expr>* [ "." <expr> ] “)”

where * means zero or more occurrences and [ ] means optional.

In order to minimize CONS calls, each sublist is built in reverse order and then destructively reversed.

After the reader was generating the correct structure, the lexer was changed to use a symbol table in the form of an atom list. The symbol table allows the lexer to map a given print name to the same atom everywhere the name appears.

The combination of the Windows/Intel ABI, the copy collector, and the nonuniform handling of lookahead made the reader a wee bit harder to implement than it might have been.