Chapter 2. The Compiler

Table of Contents
Error Messages
The Compiler's Handling of Types
Compiler Policy
Open Coding and Inline Expansion

This chapter will discuss most compiler issues other than efficiency, including compiler error messages, the SBCL compiler's unusual approach to type safety in the presence of type declarations, the effects of various compiler optimization policies, and the way that inlining and open coding may cause optimized code to differ from a naive translation. Efficiency issues are sufficiently varied and separate that they have their own chapter.

Error Messages

The compiler supplies a large amount of source location information in error messages. The error messages contain a lot of detail in a terse format, so they may be confusing at first. Error messages will be illustrated using this example program:

(defmacro zoq (x)
  `(roq (ploq (+ ,x 3))))

(defun foo (y)
  (declare (symbol y))
  (zoq y))
The main problem with this program is that it is trying to add 3 to a symbol. Note also that the functions roq and ploq aren't defined anywhere.
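
To see where the calls to roq, ploq and + come from, it can help to expand the macro by hand. The following is a minimal sketch that uses only the standard macroexpand-1 function and the zoq definition from the example above; nothing is evaluated, so the undefined functions cause no trouble here.

```lisp
;; The macro from the example program, repeated so the sketch is self-contained.
(defmacro zoq (x)
  `(roq (ploq (+ ,x 3))))

;; Expand the call once, without evaluating the result.
(macroexpand-1 '(zoq y))
;; => (ROQ (PLOQ (+ Y 3)))
```

The expansion is the code the compiler actually processes, which is why the messages below refer to forms that never appear in the source file.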

The Parts of the Error Message

When processing this program, the compiler will produce this warning:

file: /tmp/foo.lisp

in: DEFUN FOO
  (ZOQ Y)
--> ROQ PLOQ + 
==>
  Y
caught WARNING:
  Result is a SYMBOL, not a NUMBER.
In this example we see each of the six possible parts of a compiler error message:

  1. file: /tmp/foo.lisp This is the name of the file that the compiler read the relevant code from. The file name is displayed because it may not be immediately obvious when there is an error during compilation of a large system, especially when with-compilation-unit is used to delay undefined warnings.

  2. in: DEFUN FOO This is the definition top level form responsible for the error. It is obtained by taking the first two elements of the enclosing form whose first element is a symbol beginning with "def". If there is no such enclosing "def" form, then the outermost form is used. If there are multiple def forms, then they are all printed from the outside in, separated by =>'s. In this example, the problem was in the defun for foo.

  3. (ZOQ Y) This is the original source form responsible for the error. Original source means that the form directly appeared in the original input to the compiler, i.e. in the lambda passed to compile or in the top level form read from the source file. In this example, the expansion of the zoq macro was responsible for the error.

  4. --> ROQ PLOQ + This is the processing path that the compiler used to produce the erroneous code. The processing path is a representation of the evaluated forms enclosing the actual source that the compiler encountered when processing the original source. The path is the first element of each form, or the form itself if the form is not a list. These forms result from the expansion of macros or source-to-source transformation done by the compiler. In this example, the enclosing evaluated forms are the calls to roq, ploq and +. These calls resulted from the expansion of the zoq macro.

  5. ==> Y This is the actual source responsible for the error. If the actual source appears in the explanation, then we print the next enclosing evaluated form, instead of printing the actual source twice. (This is the form that would otherwise have been the last form of the processing path.) In this example, the problem is with the evaluation of the reference to the variable y.

  6. caught WARNING: Result is a SYMBOL, not a NUMBER. This is the explanation of the problem. In this example, the problem is that y evaluates to a symbol, but is in a context where a number is required (the argument to +).

Note that each part of the error message is distinctively marked:

  • file: and in: mark the file and definition, respectively.

  • The original source is an indented form with no prefix.

  • Each line of the processing path is prefixed with -->

  • The actual source form is indented like the original source, but is marked by a preceding ==> line.

  • The explanation is prefixed with the error severity, which can be caught ERROR:, caught WARNING:, caught STYLE-WARNING:, or note:.
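
The rules for the definition and the processing path can be modelled in a few lines. This is only an illustrative sketch, not SBCL's implementation: the helper names def-form-p, definition-context and processing-path are hypothetical, and the enclosing forms are assumed to be supplied as a list, outermost first.

```lisp
(defun def-form-p (form)
  "True if FORM is a list whose first element is a symbol whose name begins with DEF."
  (and (consp form)
       (symbolp (first form))
       (let ((name (symbol-name (first form))))
         (and (>= (length name) 3)
              (string-equal "DEF" name :end2 3)))))

(defun definition-context (enclosing-forms)
  "Return the first two elements of every enclosing DEF form, outermost first.
If there is no DEF form, fall back to the outermost form."
  (or (loop for form in enclosing-forms
            when (def-form-p form)
              collect (subseq form 0 (min 2 (length form))))
      (list (first enclosing-forms))))

(defun processing-path (enclosing-forms)
  "Return the first element of each form, or the form itself if it is not a list."
  (mapcar (lambda (form) (if (consp form) (first form) form))
          enclosing-forms))

;; The forms enclosing Y after ZOQ has been expanded:
(processing-path '((roq (ploq (+ y 3))) (ploq (+ y 3)) (+ y 3)))
;; => (ROQ PLOQ +)

(definition-context '((defun foo (y) (declare (symbol y)) (zoq y)) (zoq y)))
;; => ((DEFUN FOO))
```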

Each part of the error message is more specific than the preceding one. If consecutive error messages are for nearby locations, then the leading parts of the messages will be the same. In this case, the compiler omits as much of the second message as it has in common with the first. For example:

file: /tmp/foo.lisp

in: DEFUN FOO
  (ZOQ Y)
--> ROQ
==>
  (PLOQ (+ Y 3))
caught STYLE-WARNING:
  undefined function: PLOQ

==>
  (ROQ (PLOQ (+ Y 3)))
caught STYLE-WARNING:
  undefined function: ROQ
In this example, the file, definition and original source are identical for the two messages, so the compiler omits them in the second message. If consecutive messages are entirely identical, then the compiler prints only the first message, followed by: [Last message occurs repeats times] where repeats is the number of times the message was given.

If the source was not from a file, then no file line is printed. If the actual source is the same as the original source, then the processing path and actual source will be omitted. If no forms intervene between the original source and the actual source, then the processing path will also be omitted.
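
The omission of shared leading parts can be pictured as dropping a common prefix. The sketch below is a simplified model under stated assumptions, not the compiler's own code: each message context is represented as a list of the form (file definition original-source processing-path actual-source), and abbreviate-context is a hypothetical name.

```lisp
(defun abbreviate-context (previous current)
  "Return the tail of CURRENT that remains after dropping the leading parts
it shares with PREVIOUS. Returns NIL when the two contexts are identical."
  (let ((shared (or (mismatch previous current :test #'equal)
                    ;; MISMATCH returns NIL when the sequences match entirely.
                    (length current))))
    (nthcdr shared current)))

;; The two STYLE-WARNING messages from the example above:
(abbreviate-context
 '("/tmp/foo.lisp" (defun foo) (zoq y) (roq) (ploq (+ y 3)))
 '("/tmp/foo.lisp" (defun foo) (zoq y) ()    (roq (ploq (+ y 3)))))
;; => (NIL (ROQ (PLOQ (+ Y 3))))
```

In this model the file, definition and original source are dropped from the second message, leaving an empty processing path and the actual source, which matches the abbreviated output shown above.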

The Original and Actual Source

The original source displayed will almost always be a list. If the actual source for an error message is a symbol, the original source will be the immediately enclosing evaluated list form. So even if the offending symbol does appear in the original source, the compiler will print the enclosing list and then print the symbol as the actual source (as though the symbol were introduced by a macro).

When the actual source is displayed (and is not a symbol), it will always be code that resulted from the expansion of a macro or a source-to-source compiler optimization. This is code that did not appear in the original source program; it was introduced by the compiler.

Keep in mind that when the compiler displays a source form in an error message, it always displays the most specific (innermost) responsible form. For example, compiling this function

(defun bar (x)
  (let (a)
    (declare (fixnum a))
    (setq a (foo x))
    a))
gives this error message:
in: DEFUN BAR
  (LET (A) (DECLARE (FIXNUM A)) (SETQ A (FOO X)) A)
caught WARNING: The binding of A is not a FIXNUM:
  NIL
This error message is not saying "there is a problem somewhere in this let"; it is saying that there is a problem with the let itself. In this example, the problem is that a's nil initial value is not a fixnum.
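
One way to resolve this particular warning is to give the variable an initial value that satisfies its declaration. The sketch below assumes that 0 is an acceptable placeholder value; count-items is a hypothetical stand-in for the foo called in the example, and bar-fixed is a hypothetical name for the corrected function.

```lisp
;; Hypothetical stand-in for the FOO called in the example above.
(defun count-items (x)
  (length x))

(defun bar-fixed (x)
  ;; A is initialized with a fixnum, so the binding agrees with the declaration.
  (let ((a 0))
    (declare (fixnum a))
    (setq a (count-items x))
    a))

(bar-fixed '(1 2 3))
;; => 3
```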

The Processing Path

The processing path is mainly useful for debugging macros, so if you don't write macros, you can probably ignore it. Consider this example:

(defun foo (n)
  (dotimes (i n *undefined*)))
Compiling results in this error message:
in: DEFUN FOO
  (DOTIMES (I N *UNDEFINED*))
--> DO BLOCK LET TAGBODY RETURN-FROM
==>
  (PROGN *UNDEFINED*)
caught STYLE-WARNING:
  undefined variable: *UNDEFINED*
Note that do appears in the processing path. This is because dotimes expands into:
(do ((i 0 (1+ i)) (#:g1 n))
    ((>= i #:g1) *undefined*)
  (declare (type unsigned-byte i)))
The rest of the processing path results from the expansion of do:
(block nil
  (let ((i 0) (#:g1 n))
    (declare (type unsigned-byte i))
    (tagbody (go #:g3)
     #:g2    (psetq i (1+ i))
     #:g3    (unless (>= i #:g1) (go #:g2))
             (return-from nil (progn *undefined*)))))
In this example, the compiler descended into the block, let, tagbody and return-from to reach the progn printed as the actual source. This is a place where the "actual source appears in explanation" rule was applied. The innermost actual source form was the symbol *undefined* itself, but that also appeared in the explanation, so the compiler backed out one level.
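
When a processing path is puzzling, expanding the original form one step at a time shows where each element comes from. The following sketch uses the standard macroexpand-1 function; show-expansion is a hypothetical helper name. The exact expansion, including the names of generated symbols, depends on the implementation and version, so it may differ in detail from the expansions printed above.

```lisp
(defun show-expansion (form)
  "Print one level of macroexpansion of FORM.
Returns the expansion and a flag saying whether FORM was a macro call."
  (multiple-value-bind (expansion expanded-p) (macroexpand-1 form)
    (let ((*print-pretty* t))
      (format t "~&~S~%" expansion))
    (values expansion expanded-p)))

;; Expand the DOTIMES form from the example by one step.
(show-expansion '(dotimes (i n *undefined*)))
```

Calling show-expansion again on the result reveals the next layer of the path, and so on until the form is no longer a macro call.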

Error Severity

There are four levels of compiler error severity: error, warning, style warning, and note. The first three levels correspond to condition classes which are defined in the ANSI standard for Common Lisp and which have special significance to the compile and compile-file functions. These levels of compiler error severity occur when the compiler handles conditions of these classes. The fourth level of compiler error severity, note, is used for problems which are too mild for the standard condition classes, typically hints about how efficiency might be improved.
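
The special significance of the standard condition classes shows up in the values returned by compile. As a minimal sketch, the code below classifies conditions into the three standard levels and reports the second and third values of compile (called warnings-p and failure-p in the ANSI standard) together with the warnings seen. The names severity-of and compile-and-classify are hypothetical. Whether a handler established outside compile observes every diagnostic is implementation behavior, so the collected list should be treated as best effort; notes are not warnings and are never collected by this handler.

```lisp
(defun severity-of (condition)
  "Map CONDITION to a keyword naming its standard severity class."
  (typecase condition
    (error :error)
    ;; STYLE-WARNING is a subtype of WARNING, so it must be tested first.
    (style-warning :style-warning)
    (warning :warning)
    (t :other)))

(defun compile-and-classify (lambda-expression)
  "Compile LAMBDA-EXPRESSION. Return four values: the compiled function,
the two flags returned by COMPILE, and the severities of the warnings observed."
  (let ((seen '()))
    (multiple-value-bind (function warnings-p failure-p)
        ;; The handler only records the condition and then declines,
        ;; so the compiler still reports the problem in the usual way.
        (handler-bind ((warning (lambda (c) (push (severity-of c) seen))))
          (compile nil lambda-expression))
      (values function warnings-p failure-p (nreverse seen)))))

;; A well-formed function compiles without a failure flag.
(compile-and-classify '(lambda (x) (1+ x)))
```

Passing a lambda with the problem from the example program, such as (lambda (y) (declare (symbol y)) (+ y 3)), is a convenient way to observe how a full warning affects the returned flags.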

Errors During Macroexpansion

The compiler handles errors that happen during macroexpansion, turning them into compiler errors. If you want to debug the error (to debug a macro), you can set *break-on-signals* to error. For example, this definition:

(defun foo (e l)
  (do ((current l (cdr current))
       ((atom current) nil))
      (when (eq (car current) e) (return current))))
gives this error:
in: DEFUN FOO
  (DO ((CURRENT L #) (# NIL)) (WHEN (EQ # E) (RETURN CURRENT)) )
caught ERROR: 
  (in macroexpansion of (DO # #))
  (hint: For more precise location, try *BREAK-ON-SIGNALS*.)
  DO step variable is not a symbol: (ATOM CURRENT)
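
Binding *break-on-signals* to error around the compilation enters the debugger at the point where the error is signaled, which requires an interactive session. As a non-interactive alternative, the sketch below expands the offending form directly and captures the error text. The helper name macroexpansion-error-message is hypothetical, and the wording of the captured message depends on the implementation and version.

```lisp
(defun macroexpansion-error-message (form)
  "Expand FORM once. Return the error text if expansion signals an error,
otherwise NIL."
  (handler-case (progn (macroexpand-1 form) nil)
    (error (condition) (princ-to-string condition))))

;; The malformed DO from the example: the end test was written inside
;; the list of variable specifications.
(macroexpansion-error-message
 '(do ((current l (cdr current))
       ((atom current) nil))
      (when (eq (car current) e) (return current))))
```

Because the form is quoted, nothing is evaluated; only the macro function for do runs.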

Read Errors

SBCL's compiler (unlike CMU CL's) does not attempt to recover from read errors when reading a source file, but instead just reports the offending character position and gives up on the entire source file.
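
Since a read error aborts compilation of the whole file, it can be worth checking that source text reads cleanly before compiling it. The following is a minimal sketch using only standard reader functions; first-read-error is a hypothetical helper, and it reads from a string rather than a file to keep the example self-contained. Symbols in the text are interned in the current package as a side effect.

```lisp
(defun first-read-error (text)
  "Read every form in TEXT. Return NIL if all forms read cleanly,
otherwise return the condition signaled by the reader."
  (with-input-from-string (in text)
    (handler-case
        ;; Disable #. so that reading cannot evaluate arbitrary code.
        (let ((*read-eval* nil))
          (loop with eof = (gensym)
                for form = (read in nil eof)
                until (eq form eof)
                finally (return nil)))
      (error (condition) condition))))

;; Balanced text reads cleanly.
(first-read-error "(defun foo (y) y)")
;; => NIL

;; A missing close parenthesis ends the text in the middle of an object,
;; which signals an error of type END-OF-FILE.
(first-read-error "(defun foo (y) y")
```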