A small step for compiling Scheme

March 27th, 2009 by tk

This foray into compiling Scheme uses a stack that is stored in the heap, rather than in a conventional stack array or vector. The compiler will be “naive”, producing inefficient code.

I settled on using register EAX as the top of the evaluation stack, and with the rest of the stack in a linked list. A “fixed” node points to the stack via its CDR field, but does not hold anything in its CAR field. This works well with the implemented convention of calling machine language routines with both the argument list and the result held in EAX.

First, we show a code sequence for setting up code, compiling it, and then displaying the result as a “flattened” list.

  (letrec
    ((displayx
        (lambda (x)
          (cond
            ((symbol? (car x)) (display x) (newline))
            (#t (for-each displayx x)))))
      (scheme-code '()))

    (set! scheme-code '(fn #t #f #f))
    (set! code (compile scheme-code))
    (for-each displayx code))

We eventually want to compile procedures, but for now we will compile simple expressions. Boolean constants are pretty simple – we just combine the unchanging pieces of code and send the combination back as a result.

    (compile
      (lambda (exp)
        (compile-exp exp)))

    (compile-exp
      (lambda (exp)
        (display 'compiling-exp=) (display exp) (newline)
        (cond
          ((constant? exp) (compile-constant exp))
          ((symbol? exp) (compile-symbol exp))
          (#t (compile-proccall exp)))))

    (compile-constant
      (lambda (exp)
        (cond
          ((eq? exp #t) (list push-eval true-eval))
          ((eq? exp #f) (list push-eval false-eval))
          (#t '((error exp =constant-not-implemented=))))))

    (push-eval
      '((mov     edx HeapBase)               ; get heap base
        (mov     edx (edx offsetEVALSTACK offsetCDR))  ; arg2 = stack tail
        (mov     ecx eax)                    ; arg1 = item to push
        (call    m_cons1)
        (mov     edx HeapBase)               ; get heap base
        (mov     (edx offsetEVALSTACK offsetCDR) eax)  ; update (push) stack
      ))

    (true-eval
      '((mov     edx HeapBase)
        (lea     eax (edx offsetTRUE))))

    (false-eval
      '((mov     edx HeapBase)
        (lea     eax (edx offsetFALSE))))

Haven’t yet decided how to handle symbols, but we need to state what kind of code to produce and where it will be.

    (compile-symbol
      (lambda (exp)
        '((loadvar ***unknown***))))

Procedure calls are handled by compiling the subexpressions, followed by a currently undetermined protocol for calling closures.

The subexpression results go on the stack with the leftmost expression being the deepest. We then unstack and CONS the results, effectively reversing the subexpression results and detaching the results from the stack. The EAX register ends up referencing the constructed argument list. The number of arguments that are combined into a single list is controlled by the argument list recursion. The result is compatible with my xeq1 interface protocol.

Since the code we’re constructing is a nested data structure, and we’re parsing a nested data structure (as opposed to sequential text as in the list I/O), argument evaluation order of the compiler procedures is not important.

    (compile-proccall
      (lambda (exp)
        (list
          (compile-arglist exp)
          '(callclosure***))))

    (compile-arglist
      (lambda (exp)
        (cond
          ((null? exp) (list push-eval emptylist-eval))
          (#t (list
              (compile-exp (car exp))
              (compile-arglist (cdr exp))
              cons-eval)))))

Leave a Reply