Not-so-new-anymore backquote implementation for CLISP

In the spring of 2003, I developed a new implementation of the backquote syntax for CLISP. This was originally available available as a patch against CLISP 2.30 and against the CLISP CVS trunk (2003-04-26). Not long afterward, the patch has now been integrated into the CLISP CVS, and further hacked on by Bruno Haible and others.

This backquote implementation eliminates a number of defects that are present in the original, and also provides a list-based target language, allowing CLISP programs to create data structures that represent backquote source code.

Macro-based expander

The big difference is that the backquote expansion is no longer done by the backquote reader, but by the backquote macro. As a result, the backquote macro no longer has a second argument which gives it the expansion; it computes the expansion. This means that expansions can be obtained by generating the macro syntax. In the unpatched CLISP, if you generate a backquote macro call without providing your own expansion as a second argument which that macro can just regurgitate, it won't work!

Unpatched CLISP:

  (let ((subst "foo")) 
    (system::backquote (system::unquote subst)))

  ==> NIL
What? Okay, now let's fool it into giving the value 42 by specifying the missing second argument:
  (let ((subst "foo")) 
    (system::backquote (system::unquote subst) 42))

  ==> 42
As you can see, the system::backquote macro does nothing other than just spit out the second argument. The first argument exists for pretty-printing, the second for evaluation. The new implementation does not work this way: the backquote macro has just one argument, and it performs expansion on it:
  (let ((subst "foo")) 
    (system::backquote (system::unquote subst)))

  ==> "foo"
Aha! Technically, this isn't a bugfix because ANSI CL doesn't require backquotes to be backed by a sane target language, but it's nice to have. There are real bugs fixed by this patch too.

CLISP backquote bugs addressed

CLISP sometimes generates generates a system::backquote form in which the second argument is incorrect (so the pretty-printing of the form back to the backquote read syntax is wrong) but a correct second argument (so the evaluation is correct). For instance:
  ``,,`,3  ==> `,NIL
  (eval ``,,`,3) ==> 3
Here is a case which reproduces the other type of bug: correct first argument (good pretty printing) but incorrect second argument (bad evaluation):
  ;;; expected result:  (FOO `(BAR (BAZ 'A A) (BAZ 'B B) (BAZ 'C C)))
  ;;; CLISP result:     (FOO `(BAR NIL NIL NIL NIL))

  (defun breaks-on-clisp ()
    (let ((list '(a b c d)))
      `(foo `(bar ,@',(mapcar #'(lambda (sym)
				  `(baz ',sym ,sym))
			      list)))))
The innermost backquote `(baz ',sym ,sym) is mis-expanded somehow into code that just returns NIL. The new implementation handles this case perfectly.

The old implementation can cause multiple evaluation of the unquoted forms. For example, evaluating the expression

  ````,',',',(print 42)
causes 42 to be printed four times! This behavior is gone.

Implementation differences

One noteable difference in the new implementation is that the splicing unquotes have different syntax, quite simply ,@FORM corresponds to (SPLICE FORM) rather than (UNQUOTE (SPLICE FORM)). Code which depends on this detail will break, but then again code which generates backquote syntax for CLISP cannot work without my patch anyway.

Another difference is that the expansion is done recursively. Because a backquote macro performs the expansion, if the expanded source is another invocation of the same macro, the expansion is repeated. The upshot is that for instance ````,,,,x macro-expands (or should we say macro-reduces?) to the form x. This is actually correct; applying four evaluations to x produces the right result. Users who are used to expansions that are interleaved with evaluations may be confused, because they might expect the first evaluation of the form to produce ```,,,[value-of x], and then the next round ``,,[value-of [value-of x]] and so forth. Only one comma and backquote cancel out on each evaluation round, in addition to the re-evaluation of the form that started out as x. This kind of implementation agrees with the intuition which says that the unquote commas ``perform'' the substitution, as a backquote is ``stripped away''. This is in fact not necessarily how it works; the backquote form can rather generate code which performs that substitution when it is evaluated, and all nested backquotes may also be reduced to code, rather than being preserved as backquotes that require additional expansion at each evaluation round.

Lastly, the new implementation has an optimizer that can be turned off by assigning the value NIL to system::*backquote-optimize*. This will cause the implementation to spit out naive APPEND-based code, closely based on the Common Lisp HyperSpec's semantic descriptions. The implementation works by generating the naive code according to the spec in order to get the semantics right, and then by optimizing that code through semantics-preserving transformations (meaning that the optimized code generates a list that is EQUAL to what the unoptimized code would produce). This time-honored pipelined approach to compiler construction allows for better bug isolation; if there appears to be a bug, and it goes away when the optimizer is turned off, then it's almost certainly a bug in the optimizer. If the same buggy behavior persists with or without optimization, then it's in the intermediate code generation.