A better Scheme reader
The previous Scheme reader was based on recursive descent. But what I really wanted was something like LR(0) parsing, in order to avoid reading an extra token. In LR(0) parsing, a starting token is put on a stack, and then a state machine decides what to do next. You can read in a token and “shift” it to a stack, “reduce” the end items on the stack to a single item, or “halt” a successful parse. Lisp is simple enough that the one item on the stack top represents a parse state.
The new method is much simpler. It has an initialization procedure that starts the stack with one token, a parsing decision iterator (in tail recursive form), and a reducing procedure. The quote read-macro processing is now in a procedure.
The reduce procedure has a check for the empty list, in case an S-expression with dot notation is malformed.
The data loss of reading an S-expression is now reduced to the one character that follows the S-expression. By requiring whitespace between top level S-expressions, we can exchange data files with other Lisps. We can also start to store S-expression data in separate files to help organize what we have.
The read-token procedure, the lexer, handles the bulk of the parsing. It is fairly big because numbers are handled very inefficiently.
(read-object
(lambda ()
(letrec
(
;;; make unique objects to put in stack
(tick-token '(tick-token))
(lpar-token '(lpar-token))
(rpar-token '(rpar-token))
(dot-token '(dot-token))
(read-obj
(lambda ()
(read-o (list (read-token)))))
;;; We use eq? here to ensure we test
;;; for the unique objects we created.
(read-o
(lambda (stack)
(cond
((eq? (car stack) lpar-token) (read-o (cons (read-token) stack)))
((eq? (car stack) rpar-token) (read-o (reduce (cdr stack) '())))
((eq? (car stack) tick-token) (read-o (cons (read-quote) (cdr stack))))
((null? (cdr stack)) (car stack)) ; terminate when only one non-delimiter on stack
(#t (read-o (cons (read-token) stack))))))
;;; Here we combine several list items
;;; into a single list, and restack.
;;; We return a stack because only this
;;; procedure knows how much to discard.
;;; We use eq? here to ensure we test
;;; for the unique objects we created.
(reduce
(lambda (stack tail)
(cond
((null? (car stack)) (cons tail (cdr stack)))
((eq? (car stack) lpar-token) (cons tail (cdr stack)))
((eq? (car stack) dot-token) (reduce (cddr stack) (cons (cadr stack) (car tail))))
(#t (reduce (cdr stack) (cons (car stack) tail))))))
;;; prototype for a read-macro
(read-quote
(lambda ()
(list 'quote (read-obj))))
;;;
;;; token code omitted
;;;
)
(read-obj))))