Subject: Re: Implementation of CL data structures
From: rpw3@rpw3.org (Rob Warnock)
Date: Fri, 21 Jul 2006 22:03:30 -0500
Newsgroups: comp.lang.lisp
Message-ID: <3LSdnQW_V-UfCFzZnZ2dnUVZ_t2dnZ2d@speakeasy.net>

funkyj <funkyj@gmail.com> wrote:
+---------------
| Pascal Bourguignon wrote:
| > Lars <no@spam.please> writes:
| > > Here are some very crude figures, the number of Lisp and C source
| > > files in three OS CL implementations:
| > >        CLISP  CMUCL  SBCL
| > > .[ch]    429    130   139
| > > .lisp    189    855   786
| 
| > And I've got the impression that it's the exception to implement Lisp
| > in C.  Of course, there's always a few C files, for the interface with
| > the POSIX or UNIX VM, but most implementations of lisp are implemented
| > in lisp.
| 
| Is that the case with CLisp, CMUCL and/or SBCL?  Are the C files in
| those CLs only for interfacing with external C libraries?
+---------------

Since it only provides a byte-compiler, CLISP uses much more C than
the other two -- most of the control & data "primitives". Still, quite
a lot of CLISP is written in CL [e.g., the compiler, FORMAT, etc.].

But almost *all* of CMUCL & SBCL -- *including* control & data
"primitives" -- is written in Common Lisp[1], which is hard-compiled
directly to machine code which is then included in the startup
heap image -- ~24 MB, for CMUCL-19c. The C files are used only to
build the *much* smaller -- ~260 KB, for CMUCL-19c -- executable
file which "loads" (mmap's, really) the heap image and then jumps
to it. The C code also contains a relatively small number of machine
and/or OS architecture-dependent routines[2] (e.g., how to enter
and return from an interrupt and the related "setsigcontext" stuff,
how to install/remove a machine-language breakpoint, or do a system
call), and -- last but not least! -- the garbage collector.
All in all, barely 1% of the bits in a freshly loaded copy of CMUCL
were coded in C. [SBCL should be similar.]
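
As a rough check of that figure, here is a minimal sketch of the
arithmetic. It assumes the ~260 KB executable stands in for "the bits
coded in C" and the ~24 MB heap image for everything else; the function
name c_share_percent is made up for illustration, and both sizes are
just the approximate numbers quoted above.

```shell
# c_share_percent EXE_KB IMAGE_MB
# Print the executable's share of (executable + heap image), in percent.
# Assumption: the executable size approximates the C-coded portion.
c_share_percent() {
    awk -v e="$1" -v m="$2" \
        'BEGIN { printf "%.2f\n", e * 100 / (m * 1024 + e) }'
}

# Approximate CMUCL-19c sizes quoted above: ~260 KB and ~24 MB.
c_share_percent 260 24
```

With those inputs the result comes out just over 1%.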


-Rob

[1] Including the compiler itself, which is why bootstrapping
    or rebuilding one of these is "interesting".  ;-}  Though
    at least SBCL can now use some *other* CL implementation
    besides itself for such bootstrapping. [CMUCL badly needs
    that feature too, IMHO.]

[2] This is why "number of C files" is a very bad metric for
    CMUCL & SBCL, since there are many copies of the "same"
    files in the distribution, with only one of each particular kind
    needed for a given OS and/or machine architecture, e.g.:

       $ cd cmucl-19c/src/lisp
       $ ls *arch*
       alpha-arch.c  arch.h       mips-arch.c  search.c  sparc-arch.c
       amd64-arch.c  hppa-arch.c  ppc-arch.c   search.h  x86-arch.c
       $ ls *os*
       Config.alpha_osf1  Linux-os.h    hpux-os.h    os.h
       Darwin-os.c        NetBSD-os.c   irix-os.c    osf1-os.c
       Darwin-os.h        NetBSD-os.h   irix-os.h    osf1-os.h
       FreeBSD-os.c       OpenBSD-os.c  mach-os.c    solaris-os.c
       FreeBSD-os.h       OpenBSD-os.h  mach-os.h    sunos-os.c
       Linux-os.c         hpux-os.c     os-common.c  sunos-os.h
       $ ls *lispregs*
       alpha-lispregs.h  hppa-lispregs.h  mips-lispregs.h  sparc-lispregs.h
       amd64-lispregs.h  lispregs.h       ppc-lispregs.h   x86-lispregs.h
       $ 

    Therefore, to get a reasonable "LOC" comparison between CL & C
    source would require counting only the files needed for *one*
    complete machine/OS pair (e.g., x86/Linux).
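
    A minimal sketch of such a filter follows. It assumes, purely
    from the listing above, that per-platform files are named
    ARCH-arch.*, ARCH-lispregs.* or OS-os.*, and that anything else
    is shared. The function name files_for_pair is hypothetical, and
    the real build may select files differently (e.g., via the
    Config.* files).

```shell
# files_for_pair ARCH OS
# Read file names on stdin (one per line) and print only those that
# apply to the given machine/OS pair under the naming assumption above.
files_for_pair() {
    arch=$1
    os=$2
    while IFS= read -r f; do
        case $f in
            "$arch"-arch.*|"$arch"-lispregs.*|"$os"-os.*)
                echo "$f" ;;      # belongs to the chosen pair
            *-arch.*|*-lispregs.*|*-os.*)
                ;;                # belongs to some other platform: skip
            *)
                echo "$f" ;;      # no platform prefix: treated as shared
        esac
    done
}
```

    Something like "ls *.[ch] | files_for_pair x86 Linux | xargs wc -l"
    in src/lisp would then give a line count for that one pair.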

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607