Archive for October, 2007

Persistent Lisp definitions

Monday, October 29th, 2007

The LSM gives us one model of a state machine in functional form. We will use it to define “global” functions in an applicative (eager evaluation) functional style.

The global environment is part of the m-state. Adding named functions to it is done by creating a binding and making a new global environment, which is passed to the next state to become the new global environment. An LSM level command (def <name> <expression>) is recognized by the generator for the next m-state. The def command binds <name> to the unevaluated <expression>, and adds the binding to the global environment. The result value is the empty list.

next-m-state =
  (lambda (m-state i-state)
    (cond
      ((eq (car i-state) (quote def))  (cons
                                         (quote ())
                                         (add-to-env
                                           (make-binding
                                             (car (cdr i-state))
                                             (car (cdr (cdr i-state)))
                                           )
                                           (cdr m-state)
                                         )
                                       )
      )
      (t                               (cons
                                         (eval i-state (cdr m-state))
                                         (cdr m-state)
                                       )
      )
    )
  )

add-to-env =
  (lambda (binding env)
    (cons binding env)
  )

make-binding =
  (lambda (var value)
    (cons var value)
  )

The following script will test that a globally defined variable can be evaluated as a variable, or as a function if it is bound to a function definition…

(def f1 (lambda (a b) (cons a b)))
f1
(f1 (quote (x y z)) (quote (a b c)))

My base (assembly language) system still requires that nonprimitive function definitions be named in the a-list used as an argument to eval0. When the LSM interpreter is invoked, the remaining input is interpreted under the environment set by the LSM interpreter, rather than the environment of the base system. The LSM interpreter promotes the mixing of evaluation and persistent definition, whereas the base system promotes the separation of evaluation and persistent definition.

Lisp REPL and state machines

Saturday, October 27th, 2007

My original implementation of REPL looks like the following…

repl =
  (lambda (x)
    (repl (print (eval (read) (quote ()))))
  )

We can abstract this a little bit by noting that it represents the creation of an output based on an input.

repl =
  (lambda (x)
    (repl (make-o-state (get-i-state)))
  )
get-i-state =
  (lambda ()
    (read)
  )
make-o-state =
  (lambda (i-state)
    (print (eval i-state (quote ())))
  )

Here, I am introducing the notion of state, so we might as well look at a state machine. A state machine models an environment that has a machine state. At each sampling of the input state, the machine state is combined with the input state to generate the next machine state. The generation of the next machine state Si+1, given the current machine state Si and input state xi, can be specified as a transformation function t

Si+1 = t(Si, xi)

Useful state machines also have an output function based on the machine state.

yi = f(Si)

The complexity of the state changes is the complexity of the transformation function t.

The state machine has an initial state S0 and the first input state is x0.

We can add a next machine state generator and make the o-state a function of the next m-state…

repl =
  (lambda (x)
    (repl (make-o-state (next-m-state m-state (get-i-state))))
  )
next-m-state =
  (lambda (m-state i-state)
    (eval i-state m-state)
  )
make-o-state =
  (lambda (m-state)
    (print m-state)
  )

The above code shows the intent of using eval to create a new m-state. It also shows that the m-state is intended to be the environment for eval. We need to show that the m-state includes both the eval environment and the result of eval. For this purpose, the m-state is implemented as (result . env).

next-m-state =
  (lambda (m-state i-state)
    (cons (eval i-state (cdr m-state)) (cdr m-state))
  )
make-o-state =
  (lambda (m-state)
    (print (car m-state))
  )

To show that the o-state can potentially affect the external world, we make it an argument of the i-state sampler…

repl =
  (lambda (o-state)
    (repl (make-o-state (next-m-state m-state (get-i-state o-state))))
  )
get-i-state =
  (lambda (o-state)
    (read)
  )

Now we just need to show that the o-state is not the next machine state…

repl =
  (lambda (m-state)
    (repl (next-m-state m-state (get-i-state (make-o-state m-state))))
  )

As REPL is no longer the same function, we will rename the new function as the Lisp state machine, or LSM…

lsm =
  (lambda (m-state)
    (lsm (next-m-state m-state (get-i-state (make-o-state m-state))))
  )

Lisp bindings abstracted

Tuesday, October 23rd, 2007

Detection of a non-builtin function and its application was originally coded like this in apply

((eq (car fn) (quote lambda)) (eval
                                (car (cdr (cdr fn)))
                                (pairlis (car (cdr fn)) args env)
                              )
)

We can abstract the argument binding somewhat by rewriting it this way…

((eq (car fn) (quote lambda)) (eval
                                (car (cdr (cdr fn)))
                                (mkenv (car (cdr fn)) args env)
                              )
)

...

mkenv =
  (lambda (vars vals env)
    (pairlis vars vals env)
  )

This still hides the fact that pairlis is actually two functions combined into one – binding values to argument variables in a pairing operation, and adding the pairs to the environment…

mkenv =
  (lambda (vars vals env)
    (extend-env (bind-args vars vals) env)
  )

extend-env =
  (lambda (local-env env)
    (append local-env env)
  )

bind-args =
  (lambda (vars vals)
    (cond
      ((null vars) nil)
      (t           (cons
                     (cons (car vars) (car vals))
                     (bind-args (cdr vars) (cdr vals))
                   )
      )
    )
  )

To further abstract bindings, atom lookup in eval is abstracted from

((atom form)   (cdr (assoc form env)))

to

((atom form)   (lookup form env))

...

lookup =
  (lambda (key env)
    (cdr (assoc key env))
  )

The result is that the structure of the table of bindings is no longer determined by eval. It is now determined by the three cooperating functions, bind-args, extend-env, and lookup. The eval function interface is now somewhat similar to the eval function of Scheme (R5RS, R6RS). No data structure is given for Scheme’s eval environment argument. The argument’s value type is simply called a specifier. It is interesting that Common Lisp (ANSI 1994) has apparently eliminated the environment argument from eval.

Tail call optimization deferred

Friday, October 12th, 2007

If we look at functions like the following version of evlis, tail call optimization appears to be just a simple case of deleting the arguments of the current call and then adding the arguments of the tail call. (The tail call in the following code is the call to cons.)

evlis =
   (lambda (x e)
     (cond
       ((null x) nil)
       (t        (cons (eval (car x) e) (evlis (cdr x) e)))
     )
   )

However, if we look at the revised version of evlis, the delete-and-add strategy would eliminate the desired definitions of free variables x and e. It’s also a reminder that if free variables exist in a function, the delete-and-add strategy would break dynamic scoping.

evlis =
   (lambda (x e)
     (cond
       ((null x) nil)
       (t        (
                   (lambda (val) (cons val (evlis (cdr x) e)))
                   (eval (car x) e);
                 )
       )
     )
   )

In the revised code above, moving the lambda expression out verbatim and giving it a function name won’t prevent breakage by the delete-and-add strategy – because of the free variables. So the tail call optimization is being deferred until I get a better understanding of how free variables should be treated.

Bootstrapping Lisp – a GC version, left-to-right evaluation, REPL

Monday, October 8th, 2007

With a working nonGC Lisp available, I went and coded a Lisp version of eval.

I also wrote some prototypical compiler code.  The clean code of the generated prototype code was a reminder of the value of simplicity.

So now it was “one more time” to get that GC version going again. The stack handling of the last series of GC builds was very erratic, and it was simplified. All saved values went to the same place on the stack, with no attempts to optimize the order of evaluation. Just the naive just-like-it’s-written evaluation order.

Great! Now I have a working Lisp with GC. The eval interpreter I wrote in Lisp for the non-GC Lisp works properly with the revamped GC Lisp.

The arguments of a function call are evaluated by a function called evlis. Now the obvious version of the evlis function, where x is the list of unevaluated arguments and e is the environment, is

evlis =
   (lambda (x e)
     (cond
       ((null x) nil)
       (t        (cons (eval (car x) e) (evlis (cdr x) e)))
     )
   )

Now it would appear that this definition enforces left-to-right evaluation, but it doesn’t! If the underlying machine evaluates right-to-left, then (evlis (cdr x) e) is evaluated before (eval (car x) e), meaning the tail is evaluated before the head.

We would really like to enforce the left-to-right order without relying on internals. We could try some kind of sequencing, something like the following code…

evlis =
   (lambda (x e)
     (cond
       ((null x) nil)
       (t        val := (eval (car x) e); (cons val (evlis (cdr x) e)))
     )
   )

It turns out that argument binding is similar to assignment, and the eager evaluation rule means all argument expressions are evaluated before any part of the function body is evaluated. Because of this, we can turn the temporary value val into an argument for an anonymous function, like so…

evlis =
   (lambda (x e)
     (cond
       ((null x) nil)
       (t        (
                   (lambda (val) (cons val (evlis (cdr x) e)))
                   (eval (car x) e);
                 )
       )
     )
   )

I don’t know how many texts have this version of evlis. I got this version from J. Allen’s Anatomy of Lisp.

Eager evaluation also allows us to create a REPL in the following manner…

repl =
   (lambda (x)
     (repl (print (eval (read) (quote ()))))
   )

With lazy evaluation, the argument x would never be evaluated because it is never used. But with eager evaluation, it performs as the expected REPL.

And what about the space cost of REPL recursion? After printing each result, it’s basically the cost of holding all the printed results (which are used as arguments to the REPL function) plus the overhead of storing the stack of arguments in heap space as a list. Because the assembly language interpreter is coded recursively, there is also the cost of storing a stack of return addresses that will never be used.

The current pieces of code do not perform very large computations nor provide long sessions of interactivity. Although memory consumption is not yet a concern, I did want to see if it was too early to implement tail-call-optimization (TCO).