Subject: Re: Dynamic function generation, HOW?
From: rpw3@rpw3.org (Rob Warnock)
Date: Sun, 08 Oct 2006 22:22:03 -0500
Newsgroups: comp.lang.lisp
Message-ID: <aL2dnXFq_KfGXbTYnZ2dnUVZ_ridnZ2d@speakeasy.net>
<sankymoron@gmail.com> wrote:
+---------------
| When programming in C/C++, atleast I have a conceptual
| idea about how function calls and stacks and heap etc. work.
+---------------

There's an old in-joke that goes: "C programmers know the cost
of everything and the value of nothing; Lisp programmers know
the value of everything and the cost of nothing."[1]

In the immortal words of Dennis Ritchie, "You are not expected
to understand this."[2]  At least, not quite yet. ;-}  ;-}
But if you keep going with Lisp, you will, eventually.

+---------------
| Its just that I would like to know how things like passing and
| returning functions as data, returning multiple values etc happen
| at the low level.
+---------------

As reasonable-sounding as that question may seem to you, the
reason you aren't going to get a simple answer to it is that
almost every Lisp system does it slightly (or *completely*!!)
differently, often (but not always) for completely valid reasons
having to do with tradeoffs the implementation has made between
performance, cross-platform portability, need (or not) for ease
of FFI with other languages, style of GC, and many, many other
design dimensions.

As others have advised, books such as Queinnec's "Lisp In Small
Pieces", or SICP, or Norvig's PAIP -- especially the chapters
in each on compiling and virtual machine design -- are well
worth the effort of studying if you're really serious about
the subject.

Object representations are one of the areas where various Lisp
implementations vary most widely. David Gudeman's 1993 (but still
very useful) survey of object representations & tagging is well
worth reading:

    ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/typeinfo.ps.gz
    "Representing Type Information in Dynamically Typed Languages"

But if you just want a rough mental performance model, you won't go
very wrong if you consider that each of these actions [at least, in
compiled code] has a roughly the same (small) cost:

- A function call.
- A dynamic (special) binding.
- Allocating a heap object ("consing"). [Number of allocations
  is usually much more important than object size. Usually.]
- One iteration of a loop.
- The READ'ing of one token [if READ is used at run-time].

Of course, evaluating in-lined accessor functions [when the
compiler has been given enough information to do so] is cheaper
than full-calling a generic version, but you can/should ignore
that for now.

In fact, an even simpler approximation [still surprisingly
accurate] is:

- A function call has fixed cost.
- *Everything* is a function call.  ;-}

And at this stage, you're probably better off using the built-in
TIME macro than DISASSEMBLE to figure out how long things take.
Probably the very first thing you're going to discover is that
in most implementations there's a *LOT* of difference between
running code interpreted and compiled:

    > (defun delay (n)
	(dotimes (i n)))

    DELAY
    > (time (delay 1000000))
    ; Compiling LAMBDA NIL: 
    ; Compiling Top-Level Form: 

    ; Evaluation took:
    ;   3.97f0 seconds of real time
    ;   3.917419f0 seconds of user run time
    ;   0.00228f0 seconds of system run time
    ;   7,364,217,098 CPU cycles
    ;   [Run times include 0.16f0 seconds GC run time]
    ;   0 page faults and
    ;   48,027,344 bytes consed.
    ; 
    NIL
    > 

Yikes! 7364 CPU clock cycles per iteration. Of course, compiling
makes things a *lot* better:

    > (compile 'delay) 
    ; Compiling LAMBDA (N): 
    ; Compiling Top-Level Form: 

    DELAY
    NIL
    NIL
    > (time (delay 1000000))
    ; Compiling LAMBDA NIL: 
    ; Compiling Top-Level Form: 

    ; Evaluation took:
    ;   0.01f0 seconds of real time
    ;   0.012837f0 seconds of user run time
    ;   0.0f0 seconds of system run time
    ;   25,088,483 CPU cycles
    ;   0 page faults and
    ;   0 bytes consed.
    ; 
    NIL
    > 

Hmmm... That time's too small to be reliable, so crank up
the number of iterations:

    > (time (delay 100000000))
    ; Compiling LAMBDA NIL: 
    ; Compiling Top-Level Form: 

    ; Evaluation took:
    ;   1.22f0 seconds of real time
    ;   1.203355f0 seconds of user run time
    ;   0.0f0 seconds of system run time
    ;   2,258,783,996 CPU cycles
    ;   0 page faults and
    ;   0 bytes consed.
    ; 
    NIL
    > 

Just over 22 CPU cycles/step. Even this can be *substantially*
improved if the compiler can be told when it's safe to use
trickier/simpler code:

    > (defun delay (n)
	(if (fixnump n)
	  (locally
	    (declare (fixnum n) (optimize (speed 3) (safety 0)))
	    (dotimes (i n)))
	  (dotimes (i n))))
    > (compile 'delay) 
    ; Compiling LAMBDA (N): 
    ; Compiling Top-Level Form: 

    DELAY
    NIL
    NIL
    > (time (delay 500000000))
    ; Compiling LAMBDA NIL: 
    ; Compiling Top-Level Form: 

    ; Evaluation took:
    ;   0.56f0 seconds of real time
    ;   0.533203f0 seconds of user run time
    ;   0.006685f0 seconds of system run time
    ;   1,026,612,414 CPU cycles
    ;   0 page faults and
    ;   0 bytes consed.
    ; 
    NIL
    > 

Only a hair over 2 CPU cycles/step. Depending on the CPU,
even C might not be able to do better.


-Rob

[1] The second part is one of the famous Alan Perlis epigrams,
    #55 at <http://www.cs.yale.edu/quotes.html>; the complementary
    first half seems to have arisen spontaneously among the
    collective consciousness of The Internet:

      http://ars.userfriendly.org/users/read.cgi?id=23837&tid=110389
      http://lambda-the-ultimate.org/node/663#comment-6205
      http://groups.google.com/group/comp.lang.lisp/msg/8f0854e6cf820d33
      http://developers.slashdot.org/comments.pl?sid=50632&cid=5082205
      http://www.j-bradford-delong.net/movable_type/2004_archives/000941.html
      http://ll2.ai.mit.edu/talks/proebsting.ppt
      http://www.info.ucl.ac.be/~pvr/potpourri.html
      http://www.c2i.ntu.edu.sg/AI+CI/Humor/aiquotes_L.html

    A related popular quote states:

      C is the high-level language that combines the power
      of assembler with the ease of use of assembler.

[2] http://cm.bell-labs.com/cm/cs/who/dmr/odd.html

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607