Archive for August, 2008

Tail calls, continued

Tuesday, August 26th, 2008

If we look closely at the tail call optimizations, we can see the beginnings of a continuation-based implementation. One characteristic of a continuation is the abandonment of an environment, and replacing it with another.

    (evlist-cmd)
    (set! cmddef (car result))
    (set! arglist (cdr result))
    (set! envstk (cdr envstk))   ; -- remove environment
    (evapply-cmd)                ; -- may stack another environment

It seems that a continuation is a procedure that will be called as a tail call, complete with tail call optimization. Now that I think of it, the textbook example I saw of primitive continuations start out showing them as tail calls.

We can gather up the code and reshuffle which procedures the stacking and unstacking of various stacks occur. Adding redundant expressions to allow the use of the tail call versions of the evaluator procedures, we might come up with the following pair for our conditional tail calls.

    (proc-call-tail
      (lambda ()
        (set! seq (procbody cmddef))
        (cond
          ( (null? seq) (undefined-tail) )
          ( #t
            (set! envstk (cdr envstk))
            (set! envstk (cons (add-args (procformals cmddef) arglist (procenv cmddef)) envstk))
            (evseqhead-tail)
            (cond
              ( (constant? exp)       (evconst-tail) )
              ( (symbol? exp)         (evvar-tail) )
              ( (symbol? (car exp))
                (set! cmddef (bound-value (lookup (car exp) (car envstk))))
                (cond
                  (
                    (eq? (first cmddef) prim-syntax-mark)
                    (cond
                      ( (eq? (second cmddef) 'quote)    (evquote-tail) )
                      ( (eq? (second cmddef) 'cond)
                        (evcondhead-tail)
                        (evseq-tail)
                      )
                      ( (eq? (second cmddef) 'set!)     (evset-tail) )
                      ( (eq? (second cmddef) 'letrec)   (evletrec-tail) )
                      ( (eq? (second cmddef) 'lambda)   (evlambda-tail) )
                      ( #t                              (undefined-tail) )
                    )
                  )
                  ( #t                    (evcall-tail) )
                )
              )
              ( #t                (evcall-tail) )
            )
          )
        )
      )
    )

    (evcond-tail
      (lambda ()
        (evcondhead-tail)
        (set! envstk (cons (car envstk) envstk))  ; -- duplicate env TOS
        (evseq-tail)
        (set! envstk (cdr envstk))
      )
    )

Because of the duplication of the environment, the stacking and unstacking in evcond-tail is redundant. After removal, the tail call optimization of tail cond expressions is complete, and proc-call-tail can call evcond-tail directly. This leads to further reductions…

    (proc-call-tail
      (lambda ()
        (set! seq (procbody cmddef))
        (set! envstk (cdr envstk))
        (set! envstk (cons (add-args (procformals cmddef) arglist (procenv cmddef)) envstk))
        (evseq-tail)
      )
    )

The unstacking followed immediately by the stacking of the environment can be redone as the non-consing set-car! for direct replacement of the top of the environment stack.

    (proc-call-tail
      (lambda ()
        (set! seq (procbody cmddef))
        (set-car! envstk (add-args (procformals cmddef) arglist (procenv cmddef)))
        (evseq-tail)
      )
    )

My first reduction of letrec processing was the following.

    (evletrec-tail
      (lambda ()
        (set! letstk (cons exp letstk))
        (evletvar-tail)

        (set! envstk (cons (append result (car envstk)) envstk))

        (set! lvalstk (cons '() lvalstk))
        (evletexp-tail)
        (set! exp (car lvalstk))
        (set! lvarstk (cons (car envstk) lvarstk))
        (evletbind-tail)
        (set! lvarstk (cdr lvarstk))
        (set! lvalstk (cdr lvalstk))

        (set! seq (cdr(cdr (car letstk))))

        (set! envstk (cons (car envstk) envstk))  ; -- duplicate env TOS
        (evseq-tail)
        (set! envstk (cdr envstk))

        (set! letstk (cdr letstk))
        (set! envstk (cdr envstk))
      )
    )

The stack and unstacking of envstk surrounding evseq-tail are redundant. The stacking of the environment after evletvar-tail is unstacked by evseq-tail. We can create a redundant unstack-stack pair at the beginning because evletvar-tail does not use the environment.

    (evletrec-tail
      (lambda ()
        (set! envstk (cdr envstk))
        (set! envstk (cons (car envstk) envstk))  ; -- duplicate env TOS

        (set! letstk (cons exp letstk))
        (evletvar-tail)

        (set! envstk (cons (append result (car envstk)) envstk))

        (set! lvalstk (cons '() lvalstk))
        (evletexp-tail)
        (set! exp (car lvalstk))
        (set! lvarstk (cons (car envstk) lvarstk))
        (evletbind-tail)
        (set! lvarstk (cdr lvarstk))
        (set! lvalstk (cdr lvalstk))

        (set! seq (cdr(cdr (car letstk))))
        (evseq-tail)
        (set! letstk (cdr letstk))

        (set! envstk (cdr envstk))
      )
    )

Because evseq-tail performs the matching unstack to the stack of the environment after evletvar-tail, we can move the trailing environment unstacking to just after evletvar-tail. And because evletvar-tail does not use the environment, we can move the final unstack to just after the duplication of the top of environment stack. The two cancel out, and the remaining unstack at the beginning can be moved down for similar reasons.

    (evletrec-tail
      (lambda ()
        (set! letstk (cons exp letstk))
        (evletvar-tail)

        (set! envstk (cdr envstk))
        (set! envstk (cons (append result (car envstk)) envstk))

        (set! lvalstk (cons '() lvalstk))
        (evletexp-tail)
        (set! exp (car lvalstk))
        (set! lvarstk (cons (car envstk) lvarstk))
        (evletbind-tail)
        (set! lvarstk (cdr lvarstk))
        (set! lvalstk (cdr lvalstk))
        (set! seq (cdr(cdr (car letstk))))
        (evseq-tail)
        (set! letstk (cdr letstk))
      )
    )

The unstack followed by stack can be redone with a set-car!. And because the letrec stack is not needed after the expression sequence has been pulled out, it can be unstacked before calling evseq-tail.

    (evletrec-tail
      (lambda ()
        (set! letstk (cons exp letstk))
        ;; bind variables to ***undefined***
        ;; return list of bindings in "result"
        (evletvar-tail)
        (set-car! envstk (append result (car envstk)))
        (set! lvalstk (cons '() lvalstk))
        ;; create list of "let" initialization values
        ;; return in lvalstk
        (evletexp-tail)
        (set! exp (car lvalstk))
        ;; rebind "let" variables
        (set! lvarstk (cons (car envstk) lvarstk))
        (evletbind-tail)
        (set! lvarstk (cdr lvarstk))
        (set! lvalstk (cdr lvalstk))
        (set! seq (cdr(cdr (car letstk))))
        (set! letstk (cdr letstk))
        (evseq-tail)
      )
    )

The lvarstk variable is renamed to letbindstk because it doesn’t stack lists of variables, but lists of variable bindings. The stacking and unstacking of lvarstk are also “hoisted” from evletvar-tail.

    (evletrec-tail
      (lambda ()
        (set! letstk (cons exp letstk))
        (set! letbindstk (cons '() letbindstk))
        ;; bind variables to ***undefined***
        ;; return list of bindings in letbindstk
        (evletvar-tail)
        (set! result (car letbindstk))
        (set-car! envstk (append result (car envstk)))
        (set! lvalstk (cons '() lvalstk))
        ;; create list of "let" initialization values
        ;; return in lvalstk
        (evletexp-tail)
        (set! exp (car lvalstk))
        ;; mutate "let" variable bindings
        (evletbind-tail)
        (set! letbindstk (cdr letbindstk))
        (set! lvalstk (cdr lvalstk))
        (set! seq (cdr(cdr (car letstk))))
        (set! letstk (cdr letstk))
        (evseq-tail)
      )
    )

Conditional tail call

Friday, August 22nd, 2008

We can also inline and reorganize the evcond-cmd call within proc-call-cmd. The following handles the most common case where a successful tail cond test will call a tail procedure. It will not optimize nested tail cond expressions. (Although, if the nested tail cond expressions are broken up into procedure calls, that will create proper tail calls.)

The unwieldy code that results is basically a proof-of-concept code that shows how the two fundamental tail call optimizations can be implemented. The main reason for the monolithic-style code is to ensure the proper pairing of environment stacking and unstacking.

    (proc-call-cmd
      (lambda ()
        (set! seq (procbody cmddef))
        (cond
          ( (null? seq) (undefined-cmd) )
          ( #t
            (set! envstk (cons (add-args (procformals cmddef) arglist (procenv cmddef)) envstk))
            (set! seqstk (cons seq seqstk))
            (evseqhead-cmd)
            (set! exp (car (car seqstk)))
            (set! seqstk (cdr seqstk))
            (cond
              ( (constant? exp)
                (evconst-cmd)
                (set! envstk (cdr envstk))
              )
              ( (symbol? exp)
                (evvar-cmd)
                (set! envstk (cdr envstk))
              )
              ( (symbol? (car exp))
                (set! cmddef (bound-value (lookup (car exp) (car envstk))))
                (cond
                  (
                    (eq? (first cmddef) prim-syntax-mark)
                    (cond
                      ( (eq? (second cmddef) 'cond)
                        (set! condstk (cons (cdr exp) condstk))
                        (evcondhead-cmd)
                        (set! seq (cdr (car (car condstk))))
                        (set! condstk (cdr condstk))   ; -- moved !
                        (cond
                          ( (null? seq)
                            (undefined-cmd)
                            (set! envstk (cdr envstk))
                          )
                          ( #t
                            (set! seqstk (cons seq seqstk))
                            (evseqhead-cmd)
                            (set! exp (car (car seqstk)))
                            (set! seqstk (cdr seqstk))
                            (cond
                              ( (constant? exp)
                                (evconst-cmd)
                                (set! envstk (cdr envstk))
                              )
                              ( (symbol? exp)
                                (evvar-cmd)
                                (set! envstk (cdr envstk))
                              )
                              ( (symbol? (car exp))
                                (set! cmddef (bound-value (lookup (car exp) (car envstk))))
                                (cond
                                  (
                                    (eq? (first cmddef) prim-syntax-mark)
                                    (cond
                                      ( (eq? (second cmddef) 'quote)     (evquote-cmd) )
                                      ( (eq? (second cmddef) 'cond)      (evcond-cmd) )
                                      ( (eq? (second cmddef) 'set!)      (evset-cmd) )
                                      ( (eq? (second cmddef) 'letrec)    (evletrec-cmd) )
                                      ( (eq? (second cmddef) 'lambda)    (evlambda-cmd) )
                                      ( #t                               (undefined-cmd) )
                                    )
                                    (set! envstk (cdr envstk))
                                  )
                                  ( #t
                                    (evlist-cmd)
                                    (set! cmddef (car result))
                                    (set! arglist (cdr result))
                                    (set! envstk (cdr envstk))   ; -- moved !
                                    (evapply-cmd)
                                  )
                                )
                              )
                              ( #t
                                (evlist-cmd)
                                (set! cmddef (car result))
                                (set! arglist (cdr result))
                                (set! envstk (cdr envstk))   ; -- moved !
                                (evapply-cmd)
                              )
                            )
                          )
                        )
                      )
                      ( (eq? (second cmddef) 'quote)
                        (evquote-cmd)
                        (set! envstk (cdr envstk))
                      )
                      ( (eq? (second cmddef) 'set!)
                        (evset-cmd)
                        (set! envstk (cdr envstk))
                      )
                      ( (eq? (second cmddef) 'letrec)
                        (evletrec-cmd)
                        (set! envstk (cdr envstk))
                      )
                      ( (eq? (second cmddef) 'lambda)
                        (evlambda-cmd)
                        (set! envstk (cdr envstk))
                      )
                      ( #t
                        (undefined-cmd)
                        (set! envstk (cdr envstk))
                      )
                    )
                  )
                  ( #t
                    (evlist-cmd)
                    (set! cmddef (car result))
                    (set! arglist (cdr result))
                    (set! envstk (cdr envstk))
                    (evapply-cmd)
                  )
                )
              )
              ( #t
                (evlist-cmd)
                (set! cmddef (car result))
                (set! arglist (cdr result))
                (set! envstk (cdr envstk))
                (evapply-cmd)
              )
            )
          )
        )
      )
    )

Unconditional tail call

Thursday, August 21st, 2008

Tail call optimization was not necessarily a feature inspired by Lisp and functional languages. It was a fairly common assembly language optimization (by non-Lispers) in the early (we’re talking late 1970’s) microprocessor days, when arguments were passed in registers. An example using Pentium code…

; the following sequence

    mov   eax,var1   ; argument 1
    mov   ebx,var2   ; argument 2
    call  proc       ; stores a return address on the stack
    ret              ;

; can be reduced to

    mov   eax,var1   ; argument 1
    mov   ebx,var2   ; argument 2
    jmp   proc       ; does not add a new return address

The key to this optimization is ensuring that arguments and “local” variables of the current procedure are popped off stacks BEFORE the tail procedure call. Procedures can have differing numbers of arguments, so popping arguments after a modified return can lead to stack unbalancing. Putting any data between return addresses, as is typical of some calling conventions, means that failure to put a return address on top can leave the stack pointer improperly located for a return instruction.

First, we redo the expression sequence processor by making the empty sequence a special case, and then pull the last (tail) expression out of the loop…

    (evseq-cmd
      (lambda ()
        (cond
          ( (null? seq) (undefined-cmd) )
          ( #t
            (set! seqstk (cons seq seqstk))
            (while (not (null? (cdr (car seqstk))))
              (set! exp (car (car seqstk)))
              (ev-cmd)
              (set-car! seqstk (cdr (car seqstk)))
            )
            (set! exp (car (car seqstk)))
            (ev-cmd)
            (set-car! seqstk (cdr (car seqstk)))
            (set! seqstk (cdr seqstk))
          )
        )
      )
    )

The stack level of seqstk created here is not used by the internal call to ev-cmd, so the set! and set-car! expressions can be moved from behind to in front of the ev-cmd calls.

    (evseq-cmd
      (lambda ()
        (cond
          ( (null? seq) (undefined-cmd) )
          ( #t
            (set! seqstk (cons seq seqstk))
            (while (not (null? (cdr (car seqstk))))
              (set! exp (car (car seqstk)))
              (set-car! seqstk (cdr (car seqstk)))
              (ev-cmd)
            )
            (set! exp (car (car seqstk)))
            (set-car! seqstk (cdr (car seqstk)))
            (set! seqstk (cdr seqstk))
            (ev-cmd)
          )
        )
      )
    )

The final set-car! is redundant…

    (evseq-cmd
      (lambda ()
        (cond
          ( (null? seq) (undefined-cmd) )
          ( #t
            (set! seqstk (cons seq seqstk))
            (while (not (null? (cdr (car seqstk))))
              (set! exp (car (car seqstk)))
              (set-car! seqstk (cdr (car seqstk)))
              (ev-cmd)
            )
            (set! exp (car (car seqstk)))
            (set! seqstk (cdr seqstk))
            (ev-cmd)
          )
        )
      )
    )

And then we inline the code into proc-call-cmd, and stack a new environment when the procedure is valid. It might be noted that if we could guarantee that a procedure body was never empty (or like traditional Lisp, cdr[′()] = ′()) then the cond would be unnecessary.

    (proc-call-cmd
      (lambda ()
        (set! seq (procbody cmddef))
        (cond
          ( (null? seq) (undefined-cmd) )
          ( #t
            (set! envstk (cons (add-args (procformals cmddef) arglist (procenv cmddef)) envstk))
            (set! seqstk (cons seq seqstk))
            (while (not (null? (cdr (car seqstk))))
              (set! exp (car (car seqstk)))
              (set-car! seqstk (cdr (car seqstk)))
              (ev-cmd)
            )
            (set! exp (car (car seqstk)))
            (set! seqstk (cdr seqstk))
            (ev-cmd)
            (set! envstk (cdr envstk))
          )
        )
      )
    )

Some more inlining and then distributing the environment unstacking among the conditional branches allows moving the unstacking instruction before the recursion back to evapply-cmd. The tail calls to proc-call-cmd and evapply-cmd can be implemented as simple jumps. Thus we have the first tail call optimization – the return stack doesn’t grow, and the data stacks that would normally grow are shrunk before an unconditional tail procedure call is interpreted.

    ;;; cmddef = procedure
    ;;; arglist = evaluated argument list
    (evapply-cmd
      (lambda ()
        (cond
          ( (eq? (first cmddef) mach-mark)   (mach-call-cmd) )
          ( (eq? (first cmddef) prim-mark)   (prim-call-cmd) )
          ( (eq? (first cmddef) proc-mark)   (proc-call-cmd) )
          ( #t                               (undefined-cmd) )
        )
      )
    )

    (proc-call-cmd
      (lambda ()
        (set! seq (procbody cmddef))
        (cond
          ( (null? seq) (undefined-cmd) )
          ( #t
            (set! envstk (cons (add-args (procformals cmddef) arglist (procenv cmddef)) envstk))
            (set! seqstk (cons seq seqstk))
            (while (not (null? (cdr (car seqstk))))
              (set! exp (car (car seqstk)))
              (set-car! seqstk (cdr (car seqstk)))
              (ev-cmd)
            )
            (set! exp (car (car seqstk)))
            (set! seqstk (cdr seqstk))
            (cond
              ( (constant? exp)
                (evconst-cmd)
                (set! envstk (cdr envstk))
              )
              ( (symbol? exp)
                (evvar-cmd)
                (set! envstk (cdr envstk))
              )
              ( (symbol? (car exp))
                (set! cmddef (bound-value (lookup (car exp) (car envstk))))
                (cond
                  (
                    (eq? (first cmddef) prim-syntax-mark)
                    (cond
                      ( (eq? (second cmddef) 'quote)     (evquote-cmd) )
                      ( (eq? (second cmddef) 'cond)      (evcond-cmd) )
                      ( (eq? (second cmddef) 'set!)      (evset-cmd) )
                      ( (eq? (second cmddef) 'letrec)    (evletrec-cmd) )
                      ( (eq? (second cmddef) 'lambda)    (evlambda-cmd) )
                      ( #t                               (undefined-cmd) )
                    )
                    (set! envstk (cdr envstk))
                  )
                  ( #t
                    (set! argstk (cons '() argstk))
                    (evargs-cmd)
                    (set! result (car argstk))
                    (set! argstk (cdr argstk))

                    (set! cmddef (car result))
                    (set! arglist (cdr result))
                    (set! envstk (cdr envstk))   ; -- moved !
                    (evapply-cmd)
                  )
                )
              )
              ( #t
                (set! argstk (cons '() argstk))
                (evargs-cmd)
                (set! result (car argstk))
                (set! argstk (cdr argstk))

                (set! cmddef (car result))
                (set! arglist (cdr result))
                (set! envstk (cdr envstk))  ; -- moved !
                (evapply-cmd)
              )
            )
          )
        )
      )
    )

Redoing the evaluator

Monday, August 18th, 2008

Well, I ran into some problems while expanding the rework, so I’m backing up and starting over.

Backing up to the -stk versions, a new series of -cmd procedures is created. We want to get rid of all arguments in the evaluation recursion chains. That generally means putting arguments on the stack. Temporary values must be either used immediately or stored on a stack. We start with multiple stacks, one for each kind of value.

First, the easy stuff. The starting point is at eval, and there is no command loop. The rest of it looks almost the same as in the last post. A slight difference is using the name exp instead of cmd. We also go back to using a form of lookup that does not know where the lookup table (environment) is located.

    (eval
      (lambda (x env)
        (evaluator-cmd x env)
      )
    )

    (evaluator-cmd
      (lambda (x env)
        (set! envstk (cons env envstk))   ; push
        (set! exp x)
        (ev-cmd)
        (set! envstk (cdr envstk))        ; pop
        result
      )
    )

    ;;; exp = expression form
    ;;; (car envstk) = current environment
    (ev-cmd
      (lambda ()
        (cond
          ( (constant? exp)      (evconst-cmd) )
          ( (symbol? exp)        (evvar-cmd) )
          ( (symbol? (car exp))  (evsyntax-cmd) )
          ( #t                   (evcall-cmd) )
        )
      )
    )

    (evsyntax-cmd
      (lambda ()
        (set! cmddef (bound-value (lookup (car exp) (car envstk))))
        (cond
          (
            (eq? (first cmddef) prim-syntax-mark)
            (cond
              ( (eq? (second cmddef) 'quote)     (evquote-cmd) )
              ( (eq? (second cmddef) 'cond)      (evcond-cmd) )
              ( (eq? (second cmddef) 'set!)      (evset-cmd) )
              ( (eq? (second cmddef) 'letrec)    (evletrec-cmd) )
              ( (eq? (second cmddef) 'lambda)    (evlambda-cmd) )
              ( #t                               (undefined-cmd) )
            )
          )
          ( #t  (evcall-cmd) )
        )
      )
    )

    (undefined-cmd
      (lambda ()
        (set! result '***undefined***)
      )
    )

    (evconst-cmd
      (lambda ()
        (set! result exp)
      )
    )

    (evvar-cmd
      (lambda ()
        (set! result (bound-value (lookup exp (car envstk))))
      )
    )

    (evquote-cmd
      (lambda ()
        (set! result (second exp))
      )
    )

    (evlambda-cmd
      (lambda ()
        (set! result (make-proc exp (car envstk)))
      )
    )

What follows is the workings of the procedure call. Recursion has not been eliminated, due to ev-cmd procedure calls. The while expression is used to flatten argument evaluation and expression sequencing.

    (evcall-cmd
      (lambda ()
        (set! argstk (cons '() argstk))
        (evargs-cmd)
        (set! result (car argstk))
        (set! argstk (cdr argstk))

        (set! cmddef (car result))
        (set! arglist (cdr result))
        (evapply-cmd)
      )
    )

    (evargs-cmd
      (lambda ()
        (set! rawargstk (cons exp rawargstk))
        (while (not (null? (car rawargstk)))
          (set! exp (car (car rawargstk)))
          (ev-cmd)
          (set-car! argstk (append (car argstk) (list result)))
          (set-car! rawargstk (cdr (car rawargstk)))
        )
        (set! rawargstk (cdr rawargstk))
      )
    )

    ;;; cmddef = procedure
    ;;; arglist = evaluated argument list
    (evapply-cmd
      (lambda ()
        (cond
          ( (eq? (first cmddef) mach-mark)   (mach-call-cmd) )
          ( (eq? (first cmddef) prim-mark)   (prim-call-cmd) )
          ( (eq? (first cmddef) proc-mark)   (proc-call-cmd) )
          ( #t                               (undefined-cmd) )
        )
      )
    )

    (mach-call-cmd
      (lambda ()
        (set! result (xeq1 (second cmddef) arglist))
      )
    )

    (prim-call-cmd
      (lambda ()
        ; no Scheme written primitives yet
        (undefined-cmd)
      )
    )

    (proc-call-cmd
      (lambda ()
        (set! envstk (cons (add-args (procformals cmddef) arglist (procenv cmddef)) envstk))
        (set! seq (procbody cmddef))
        (evseq-cmd)
        (set! envstk (cdr envstk))
      )
    )

    (evseq-cmd
      (lambda ()
        (set! seqstk (cons seq seqstk))
        (undefined-cmd)
        (while (not (null? (car seqstk)))
          (set! exp (car (car seqstk)))
          (ev-cmd)
          (set-car! seqstk (cdr (car seqstk)))
        )
        (set! seqstk (cdr seqstk))
      )
    )

Extremely few recursions occur without encountering conditionals, so we will want to look at the processing of conditionals, as well…

    (evcond-cmd
      (lambda ()
        (set! condstk (cons (cdr exp) condstk))
        (cond
          ( (null? (car condstk))  (undefined-cmd) )
          ( #t                     (set! exp (car (car (car condstk))))
                                   (ev-cmd)
          )
        )
        (while (not result)
          (set-car! condstk (cdr (car condstk)))
          (cond
            ( (null? (car condstk))  (undefined-cmd) )
            ( #t                     (set! exp (car (car (car condstk))))
                                     (ev-cmd)
            )
          )
        )
        (set! seq (cdr (car (car condstk))))
        (evseq-cmd)
        (set! condstk (cdr condstk))
      )
    )

It’s still possible to add a second entry point into this recursive set of procedures via evapply-cmd. Note that, unlike eval, an environment does not need to be stacked before invoking the chain of -cmd procedures. Due to the fact that Scheme procedures are closures, the interpreted procedure will provide the initial environment.

    (apply
      (lambda (p args)
        (applier-cmd p args)
      )
    )

    (applier-cmd
      (lambda (p args)
        (set! cmddef p)
        (set! arglist args)
        (evapply-cmd)
        result
      )
    )

At which point, we can go back to eval creating a closure for apply to use…

    (eval
      (lambda (x env)
        (apply
          (make-proc
            (list 'lambda '() x)
            env
          )
          '()
        )
      )
    )

Flattening some recursion

Thursday, August 14th, 2008

The next transformation of the evaluator involves the trick of introducing Scheme code into the command stream. We want to transform the evseq-stk procedure call so that its task of handling expression sequences is handled by the iterative command loop instead of the recursion we have been using up till now.

We have an environment stack manipulation that is interfering with our transformation. What we do is turn it into a primitive, and then insert a call to the primitive by putting the call on the “next” stack. Since the stacked call must have the procedure in object form, we insert ( <proc> ) into the stack, where <proc> is in raw object form, (prim-mark <identifier>). The equivalent compiler process is generation of the extra “bookkeeping” code, regardless of whether it’s native code or interpretive code.

    (prim-call-seq
      (lambda ()
        (cond
          ( (eq? (second cmddef) 'pop-env) (pop-env) )
          ( #t                             (set! result '***undefined***) )
        )
      )
    )

    (proc-call-seq
      (lambda ()
        (set! result (add-args (procformals cmddef) arglist (procenv cmddef)))
        (set! envstk (cons result envstk))
        ;; push auxiliary continuation
        (set! next (cons (list '(prim-mark pop-env)) next))
        (set! result (evseq-stk '***undefined*** (procbody cmddef)))
;        (set! envstk (cdr envstk))
      )
    )

    (pop-env
      (lambda ()
        (set! envstk (cdr envstk))
      )
    )

Then each kind of expression gets its own procedure. In addition to ev-cmd and evsyntax-cmd, we show the procedures that don’t require evaluation of any “arguments”. The functions which aren’t converted to imperative form don’t manipulate the explicit stacks that will be used to implement recursive evaluation.

    (ev-cmd
      (lambda ()
        (set! cmd (car next))
        (set! next (cdr next))
        (cond
          ( (constant? cmd)     (evconst-cmd) )
          ( (symbol? cmd)       (evvar-cmd) )
          ( (symbol? (car cmd)) (evsyntax-cmd) )
          ( #t                  (evcall-cmd) )
        )
        (cond
           ( (null? next)
             (set! seqstk (cdr seqstk))
             (set! next (car seqstk))
          )
        )
      )
    )

    (evsyntax-cmd
      (lambda ()
        (set! cmddef (bound-value (lookup-stk (car cmd))))
        (cond
          (
            (eq? (first cmddef) prim-syntax-mark)
            (cond
              ( (eq? (second cmddef) 'quote)  (evquote-cmd) )
              ( (eq? (second cmddef) 'cond)   (evcond-cmd) )
              ( (eq? (second cmddef) 'set!)   (evset-cmd) )
              ( (eq? (second cmddef) 'letrec) (evletrec-cmd) )
              ( (eq? (second cmddef) 'lambda) (evlambda-cmd) )
              ( #t                            (evundefined-cmd) )
            )
          )
          ( #t (evcall-cmd) )
        )
      )
    )

    (evundefined-cmd
      (lambda ()
        (set! result '***undefined***)
      )
    )

    (evconst-cmd
      (lambda ()
        (set! result cmd)
      )
    )

    (evvar-cmd
      (lambda ()
        (set! result (bound-value (lookup cmd (car envstk))))
      )
    )

    (evquote-cmd
      (lambda ()
        (set! result (second cmd))
      )
    )

    (evlambda-cmd
      (lambda ()
        (set! result (make-proc-stk cmd))
      )
    )

Command loop

Monday, August 11th, 2008

I have added a while iteration expression to my base interpreter. It’s not a part of Scheme, but the do expression was too complex for my purposes.

So far, the expositions of continuations that I have seen require a step that includes infinite tail calls. That is a problem when you don’t have tail call optimization – the stack can grow to the point where you can’t finish solving the problem.

We will be storing procedure calls on the “next” stack. Because apply receives procedure objects and evaluated arguments, the procedure calls will be stored with these evaluated components. The apply procedure combines its arguments and places the reconstituted procedure call on the “next” stack.

The while expression implements the “command loop”. It parses the procedure call at the top of the “next” stack, pops off the procedure call, and then calls evapply-seq to finish the procedure call.

    (apply
      (lambda (p args)
        (set! cmd (cons p args))
        ;; put evaluated cmd on "next" stack
        (set! next (list cmd))
        (while (not (null? next))
          (set! cmd (car next))
          (set! cmddef (car cmd))
          (set! arglist (cdr cmd))
          (set! next (cdr next))
          (evapply-seq)
        )
        result
      )
    )

The evapply-seq procedure determines whether to call a machine language primitive, call a Scheme written primitive, or process a procedure object.

    (evapply-seq
      (lambda ()
        (cond
          ( (eq? (first cmddef) mach-mark)   (mach-call-seq) )
          ( (eq? (first cmddef) prim-mark)   (prim-call-seq) )
          ( (eq? (first cmddef) proc-mark)   (proc-call-seq) )
          ( #t                               (set! result '***undefined***) )
        )
      )
    )

    (mach-call-seq
      (lambda ()
        (set! result (xeq1 (second cmddef) arglist))
      )
    )

    (proc-call-seq
      (lambda ()
        (set! result (add-args (procformals cmddef) arglist (procenv cmddef)))
        (set! envstk (cons result envstk))
        (set! result (evseq-stk '***undefined*** (procbody cmddef)))
        (set! envstk (cdr envstk))
      )
    )

Unifying eval and apply

Sunday, August 10th, 2008

The eval and apply procedures currently call two mutually recursive procedures, ev-stk and evapply.

Recasting the apply arguments to forms usable by eval requires quoting the procedure object and the arguments, to prevent re-evaluation. The quoted procedure and arguments would be combined and submitted to eval (or its evaluator) as a procedure call expression.

Recasting the eval arguments to a form usable by apply requires creating a procedure (with no arguments) containing the S-expression. The procedure would then be submitted to apply (or its evaluator) with no arguments. In effect, we convert an expression like

(car x)

into the following procedure call…

( (lambda () (car x)) )

The second option seemed easier, and the new code has eval calling the procedure constructor directly, and then passing the constructed procedure to apply.

    (eval
      (lambda (x env)
        (apply
          (make-proc
            (list 'lambda '() x)
            env
          )
          '()
        )
      )
    )

    (apply
      (lambda (p args)
        (evapply p args)   ; procedure contains its own environment
      )
    )

Explicit argument stacking

Tuesday, August 5th, 2008

The argument stacking can also be made explicit. The evapply procedure call is wrapped in another procedure, evcall. The procedure evcall receives the entire procedure call form. A stack level for the argument list is created, and when the argument list is completed, it is taken off the stack. As per Scheme, the “operator” is evaluated as if it were an argument expression. The operator is valid only if it’s a procedure object, primitive or otherwise.

    (ev-stk
      (lambda (x)
        (cond
          ( (constant? x)      x )
          ( (symbol? x)        (bound-value (lookup-stk x)) )
          ( (symbol? (car x))  (evsyntax-stk (bound-value (lookup-stk (car x))) x) )
          ( #t                 (evcall x)
          )
        )
      )
    )

    (evcall
      (lambda (x)
        (set! argstk (cons '() argstk))
        (evargs x)
        (set! result (car argstk))
        (set! argstk (cdr argstk))
        (evapply (car result) (cdr result))
      )
    )

The evargs procedure evaluates each argument and adds it to the current argument list. Unlike the other evaluator procedures, it does not return a useful result. Instead, it uses a side effect (mutation) to perform its function.

    (evargs
      (lambda (x)
        (cond
          ( (null? x)   '() )
          ( #t          (set! result (ev-stk (car x)))
                        (set-car! argstk (append (car argstk) (list result)))
                        (evargs (cdr x))
          )
        )
      )
    )

Explicit environment stack

Saturday, August 2nd, 2008

I’ve modified the code to make the environment stack management explicit. We start with the eval procedure, which is basically a wrapper for the real evaluator. It puts the initial evaluation environment on the environment stack.

    (eval
      (lambda (x env)
        (set! envstk (cons env envstk))   ; push
        (set! result (ev-stk x))
        (set! envstk (cdr envstk))        ; pop
        result
      )
    )

The evaluator uses the current environment (top of environment stack) to look up variable values, evaluate nonprocedural (syntax) expressions, and evaluate procedure arguments. The “operator” of a procedure call is also evaluated with the current environment.

    (ev-stk
      (lambda (x)
        (cond
          ( (constant? x)      x )
          ( (symbol? x)        (bound-value (lookup-stk x)) )
          ( (symbol? (car x))  (evsyntax-stk (bound-value (lookup-stk (car x))) x) )
          ( #t                 (evapply (ev-stk (car x)) (evlist-stk (cdr x))) )
        )
      )
    )

    (lookup-stk
      (lambda (v)
        (assq v (car envstk))
      )
    )

    (evlist-stk
      (lambda (x)
        (cond
          ( (null? x)   '() )
          ( #t          (cons (ev-stk (car x)) (evlist-stk (cdr x))) )
        )
      )
    )

No change was needed to evapply, which calls proc-call. The environment captured by lambda is passed to proc-call. The old environment must be saved, in case another argument or an expression in a sequence is the next to be evaluated after the given procedure call. Together, the captured environment and argument bindings form the environment to be used by the procedure body.

    (proc-call
      (lambda (body env formals args)
        (set! envstk (cons (add-args formals args env) envstk))
        (set! result (evseq-stk '***undefined*** body))
        (set! envstk (cdr envstk))
        result
      )
    )

    (evseq-stk
      (lambda (x seq)
        (cond
          ( (null? seq)   x )
          ( #t            (evseq-stk (ev-stk (car seq)) (cdr seq)) )
        )
      )
    )

Of the syntax expressions currently implemented, only letrec needs stacking of the environment. Argument variables did not need to be defined until after the arguments were evaluated. However, the letrec variables must be defined and added to the environment before the initializing expressions are evaluated.

    (evsyntax-stk
      (lambda (syn-def x)
        (cond
          (
            (eq? (first syn-def) prim-syntax-mark)
            (cond
              ( (eq? (second syn-def) 'quote)     (second x) )
              ( (eq? (second syn-def) 'cond)      (evcond-stk (cdr x)) )
              ( (eq? (second syn-def) 'set!)      (evset
                                                    (lookup-stk (second x))
                                                    (ev-stk (third x))
                                                  )
              )
              ( (eq? (second syn-def) 'letrec)    (set! envstk
                                                    (cons
                                                      (append (evletvar (second x)) (car envstk))
                                                      envstk
                                                    )
                                                  )
                                                  (set! result
                                                    (evletrec-stk
                                                      (second x)
                                                      (cdr(cdr x))
                                                    )
                                                  )
                                                  (set! envstk (cdr envstk))
                                                  result
              )
              ( (eq? (second syn-def) 'lambda)    (make-proc-stk x) )
              ( #t                                '***undefined*** )
            )
          )
          ( #t  (evapply syn-def (evlist-stk (cdr x))) )
        )
      )
    )

    (make-proc-stk
      (lambda (form)
        (cons proc-mark (cons (car envstk) (cdr form)))
      )
    )

    (evcond-stk
      (lambda (x)
        (cond
          ( (null? x)              '***undefined*** )
          ( (ev-stk (car (car x))) (evseq-stk '***undefined*** (cdr (car x))) )
          ( #t                     (evcond-stk (cdr x)) )
        )
      )
    )

    (evletrec-stk
      (lambda (bindings seq)
        (evletbind (car envstk) (evletexp-stk bindings))
        (evseq-stk '***undefined*** seq)
      )
    )

    (evletexp-stk
      (lambda (bindings)
        (cond
          ( (null? bindings)   '() )
          ( #t                 (cons
                                 (ev-stk (second (car bindings)))
                                 (evletexp-stk (cdr bindings))
                               )
          )
        )
      )
    )