Subject: Re: Is Greenspun enough?
From: rpw3@rpw3.org (Rob Warnock)
Date: Sat, 10 Dec 2005 20:15:48 -0600
Newsgroups: comp.lang.lisp
Message-ID: <eLadnfJhwdxJFgbenZ2dnUVZ_sKdnZ2d@speakeasy.net>

Duane Rettig  <duane@franz.com> wrote:
+---------------
| Ulrich Hobelmann <u.hobelmann@web.de> writes:
| > George Neuner wrote:
| >>>>> mmap()ed files are also cached, no?
| >>>> No.  Mapped files are handled by the virtual memory system and all
| >>>> modern systems DMA pages directly to/from disk with no buffering.
| >>> Seriously?  I thought every common OS would buffer/cache most pages...
| >> Seriously, you need to read up on MMUs, virtual memory and demand
| >> paging systems.
| >
| > What I (think I) know about modern OSes is that they cache everything
| > from disk, I suppose by associating file blocks with in-memory-blocks.
| > Surely I could be wrong...
+---------------

Ulrich isn't wrong, George. The "global page cache" has pretty much
completely replaced the "file block cache" in recent versions of the
VM systems of most decent operating systems [e.g., Irix, *BSD, Linux].
Pretty much *everything* is cached using the same mechanism [which
does have its downsides sometimes, to be sure].

+---------------
| > like to know where you got your information that mmap()ed data isn't
| > cached at all, because I don't believe it.
| 
| The caching sometimes happens at first access, and not at mmap time.
+---------------

True. In particular, these days executable files are simply mmap'd
into memory, and pages only fault in from the ELF (say) file as they
are first accessed.
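
The fault-on-first-access behavior can be seen from user space with any mapped file, not just executables. Below is a minimal Python sketch, assuming a throwaway temporary file; the helper names `make_demo_file` and `read_byte_via_mmap` are made up for illustration. Creating the mapping typically only sets up address space, and the kernel supplies a page (from the page cache, or from disk on a miss) when that page is first touched.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # page size of the running system


def make_demo_file(n_pages=64):
    """Create a temp file in which page i is filled with the byte value i % 256."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(n_pages):
            f.write(bytes([i % 256]) * PAGE)
    return path


def read_byte_via_mmap(path, offset):
    """Map the whole file read-only and return the single byte at `offset`."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file. This normally just reserves
        # address space; no file data has to be read at this point.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Touching one byte makes the kernel supply the page that
            # holds it; pages that are never touched need never be read.
            return m[offset]


if __name__ == "__main__":
    path = make_demo_file()
    try:
        print(read_byte_via_mmap(path, 10 * PAGE))
    finally:
        os.unlink(path)
```

The sketch only shows the programming model; how much the kernel reads ahead around the touched page is system-dependent and not visible from this code.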

+---------------
| It depends on whether there is backing for the mmap, or if it is
| "MAP_NORESERVE", which means that it is using "virtual swap" (which is
| different than virtual memory).  There are also copy-on-write options
| which sometimes create new pages but which don't write them back out -
| this allows many processes to map the same file without disturbing the
| original, and yet without incurring the extra virtual page cost of a
| private mapping; any pages which have not been written to can use the
| same actual page across process boundaries.
+---------------
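
The copy-on-write case described in the quoted text can be tried out in a few lines. This is a minimal Python sketch, assuming a temporary file and a made-up helper name `private_write_demo`; it relies on `mmap.ACCESS_COPY`, which Python documents as copy-on-write memory whose assignments do not update the underlying file (on Unix this corresponds to a `MAP_PRIVATE` mapping).

```python
import mmap
import os
import tempfile


def private_write_demo(original=b"hello world!\n"):
    """Write through a private mapping; return (bytes seen in mapping, bytes on disk)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "r+b") as f:
            f.write(original)
            f.flush()
            # ACCESS_COPY requests a private, copy-on-write mapping:
            # pages are shared with the file until they are written to.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
                m[0:5] = b"HELLO"          # modifies only this mapping's copy
                seen_in_mapping = bytes(m[:])
        with open(path, "rb") as f:
            on_disk = f.read()             # the file itself is unchanged
        return seen_in_mapping, on_disk
    finally:
        os.unlink(path)


if __name__ == "__main__":
    mapped, disk = private_write_demo()
    print(mapped)
    print(disk)
```

The write is visible through the mapping but never reaches the file, which is what lets many processes map the same file without disturbing the original.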

Quite true! And in fact, the page caching in both FreeBSD & Linux is
so good that even though CMUCL's typical image file is *huge* compared
to CLISP's, on the second (and subsequent) executions, CMUCL starts up
slightly *faster* than CLISP!!  [Aside: Sam, I haven't tested this on
the latest version of CLISP, so my apologies if it's no longer true.]
The following was done with a laptop running FreeBSD-4.10 on a 1.8 GHz
Athlon with 1 GiB of RAM [but a *slow* disk]:

    $ cat test_clisp.lisp
    #!/usr/local/bin/clisp -q
    (format t "hello world!~%")
    $ time-hist ./test_clisp.lisp
    Timing 100 runs of: ./test_clisp.lisp
       3 0.019
      32 0.020
      65 0.021
    $ cat test_cmucl.lisp
    #!/usr/local/bin/cmucl -script
    (format t "hello world!~%")
    $ time-hist ./test_cmucl.lisp
    Timing 100 runs of: ./test_cmucl.lisp
      59 0.016
      41 0.017
    $ 

The point is *not* that one is a few milliseconds faster than the
other, but that on such a system "bigger" is not necessarily "slower". 
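
The `time-hist` script used above is not shown in this post. For anyone wanting to repeat the measurement, here is a minimal Python sketch of something similar. It is an assumption about what such a tool does, not the original script: it measures wall-clock time per run, rounds to the nearest millisecond, and prints one "count time" line per bucket. The function names `time_hist` and `format_hist` are made up for illustration.

```python
import subprocess
import sys
import time
from collections import Counter


def time_hist(argv, runs=100):
    """Run `argv` `runs` times; return a Counter of wall time (rounded to 1 ms) -> count."""
    hist = Counter()
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output so only the timing lines are printed.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        elapsed = time.perf_counter() - start
        hist[f"{elapsed:.3f}"] += 1
    return hist


def format_hist(hist):
    """Return one 'count time' line per bucket, sorted by time."""
    ordered = sorted(hist.items(), key=lambda kv: float(kv[0]))
    return [f"{count:4d} {bucket}" for bucket, count in ordered]


if __name__ == "__main__":
    # With no arguments, time a trivial Python "hello world" instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "print('hello world!')"]
    print(f"Timing 100 runs of: {' '.join(cmd)}")
    for line in format_hist(time_hist(cmd)):
        print(line)
```

Run it twice in a row on the same script: the first pass may include runs that had to fetch pages from disk, while later passes should be served almost entirely from the page cache.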

[A secondary point is that CMUCL is perfectly acceptable for simple
"scripting", *including* low-traffic CGI scripting! For high-traffic
sites, of course, one would use compiled code in a persistent Lisp
application server.]


-Rob

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607