Subject: Re: Could CDR-coding be on the way back?
From: Erik Naggum <erik@naggum.net>
Date: 13 Dec 2000 03:31:55 +0000
Newsgroups: comp.lang.lisp,comp.arch
Message-ID: <3185667115752812@naggum.net>

* Per Bothner <per@bothner.com>
| >   If your data structures have unpredictable structure, use lists.
| 
| If your data structures have unpredictable structure, you may be
| in trouble ...

  Really?  Reading Lisp data from input streams is well-defined today,
  despite an absence of predictability of the objects you read in.  It
  works for both lists and arrays, but arrays are very inefficient
  because you don't know their size a priori, and copying elements like
  crazy as the list grows is much less efficient than grabbing cons
  cells from a pool of them.  The way Lisp readers usually deal with
  arrays is in fact to build a list from the input and then convert the
  list to an array.  You could process the data twice, once to count
  elements and once to actually store them in memory.  You could rely on
  someone else not lying about the number of elements in the array,
  which would make the data format impossible to edit by hand.  Etc.

  I recommend programmers in non-list-supporting languages to implement
  the lists in the protocol syntaxes I have defined as arrays for one
  good reason: Such languages seldom have automatic memory management,
  either, so abandoning a reference causes memory leaks.  If they keep
  track of the array, they usually avoid the issue of keeping track of
  all the individual elements.  I have been kind to them by specifying
  that no list will be longer than 32 elements, so allocating all arrays
  that big remains a possibility.  Much of the implementor feedback has
  been the difficulty in reading lists of objects of unpredictable types
  and my recommendation not to convert to an internal type until you
  need it, but instead keep the string form with a prefix character (the
  equivalent of a type tag) have resulted in some pretty weird code.
  
| Erik, you know that I am older than you, and have probably been
| programming at least as long as you.  Don't you imagine I might have
| quite a bit of experience with both arrays *and* lists?  (I have less
| experience with Common Lisp than you, I'm sure though.)  I think my
| share of "painfully idiotic programming mistakes" is modest, yet I
| have not found that my "implementation really must use two elements
| per "array"".

  You recommend arrays to other people, not just to yourself, unless I'm
  _gravely_ mistaken about the purpose of your article.  Other people
  who have much less experience than you and rely on your experience in
  order to form their own, with the least possible pain.  My experience
  with people who try to implement things with arrays is that they end
  up with very cons-like concepts once they outgrow the C "memory is
  really an array of bytes" idea.  The reason is simple: Arrays with the
  properties we discuss are very, very hard to implement correctly.

  If we _are_ talking about what a compiler designer would do for a
  language that natively supports these things in such a way that
  programmers would never _see_ the array, the kind of reasoning you
  have provided is completely misplaced.  I have therefore assumed that
  you do _not_ argue about primarily what a compiler writer would do,
  although CDR-coding _is_ at that level.

| Assume a modest one word of overhead per object.  Then an extensible
| array with three elements takes 3 words for the array header (header +
| current length + pointer to data array) plus 4 words for the data
| array (1 word of header plus three words of data) or 7 words total.  A
| non-extensible array would only need 4 words.  If you need a separate
| header word for the array allocated length, add one more word yielding
| 8 or 5 words, respectively.  A list of three cons cells take 9 words.
| Already for length three, the array wins.

  I think you need to explain how you ended up with 9 words for the
  list.  There is of course no header word for cons cells if you are
  going to implement a very basic idea in a language with this.  Type
  information is stored much more efficiently if you care about these
  things, or you allocate them in blocks that only hold cons cells.
  That is, arrays of cons cells, if you want.  Likewise, your arrays
  would have type information in a similarly more efficient way.

  Here's how I count.  First, the data array.  It has one word per
  element, but it is aligned at and always allocate in some useful
  chunk, like 4 words.  This is to make allocation significantly
  cheaper.  Since the data array may need to be moved when grown, there
  is a meta-object that contains the allocation size, the displacement
  offset, the active size, and a pointer to the data array, or 4 words,
  one allocation unit.  The pointer to an allocation unit would of
  course have four spare bits that could be used for type information,
  just as pointers to cons cells have three spare bits for type
  information, avoiding the header word overhead completely for the most
  important builtin types.  I need 4 cons cells to match an array of 4
  elements.

| If you use a memory allocation scheme not requiring any header words,
| then the extensible 3-element array takes 6 words, the non-extensible
| array takes 4 words, and the list version takes 6 words.  The array
| is equal to the list in space.

  Yes, but only at enormous expense in allocation costs with such a fine
  granularity.  Also, you really do not want your array to move around
  if it grows, or you lose the idea of identity of lists.  Holding the
  size without a pointer to data that can move means you always have to
  copy the array if you do anything at all to it.  This means that you
  cannot perform any operations on the list implemented this way, only
  on the elements of the fixed-size list.  One of the charming issues
  about lists is that they can grow so easily.  If we are to give them a
  _more_ efficient implementation, you can't take that away from them.

| Unless you use cdr-coding, you will need 2 pointers plus any
| per-object overhead (which may be zero) for each list element.  This
| is twice as much space as using an array.  This only wins for short
| lists, or for extensible size arrays whose size are just past the
| doubling threshhold.

  The overhead is twice as much _after_ the overhead of arrays have been
  dealt with, including that copying-when-doubling scheme, which out of
  necessity must allocate memory for the same element many times over as
  the list grows.  By cleverly allocating allocation units for the data
  array from a single pool that is not used for anything else, the same
  way cons cells are allocated, you can grow your list painlessly and
  linearly without copying until you allocate another array.  You would
  typically inform the allocator that you are allocating for a growing
  list if you did this, howeer, to arrange for a large section of that
  space to be temporarily reserved for the growing array.  It would
  revert to normal allocation once you inform the allocator that you
  hare through growing that particular list.  This cuts down on copying,
  on reallocating, and speeds up just about everything while growing the
  lists linearly, at the cost of a little bit more explicit code and
  knowledge sharing between allocator and user.  In a language with
  support for these things you would want that, even if it would mostly
  be used by system functions that build lists as return values.

  If you get the impression that I have implemented this scheme for
  real, that's exactly right.

| Because it is convenient, not necessarily because it is efficient.

  When processing data that produces values of unknown lengths, and I
  find myself doing that with an alarmingly high frequency compared to
  values of known lengths, the growing pain of the array approach has
  costs that do not win in the long run.  If you usually process data
  that produces values of known lengths with a similarly much higher
  frequency than unknown lengths, you still have much higher overhad
  than if you avoided the cleverness in the array implementation that
  would take care of the growing.  Hence, if you actually implement this
  and compare approaches and test efficiency at a product-quality level,
  lists made up of cons cells actually do win the efficiency contest,
  especialy with a really fast cons cell allocator, and this not the
  least because of the internal proximity of the data, which beats that
  of the array that has been moved around several times while the data
  it pointed to stayed where it was first stored (unless you implement
  different memory pools for array allocation units and data, in which
  case you don't get that added cost).

  The interesting thing is, however, what the languages look like after
  you decide to represent lists as arrays that way.  E.g., cdr-ing down
  a list can easily be emulated with displaced arrays, i.e., arrays that
  really use another array's data for its elements, but starting from an
  offset for elemen t0.  (Hence the importance of storing type bits in
  the pointer to the allocation unit for meta-object and data array, but
  keeping them of the same size and from the same allocation pool.)  You
  may find that cdr-ing down an array allocates meta-objects, while
  cdring down a list does not, because the _rest_ of an array is no
  longer just a pointer to a cons cell, but a new displaced array.  If
  you want to push and pop elements on lists, you find that you want
  lists that grow an shrink on different ends, leading to different
  coding of many algorithsm, or you can be really clever and build
  the list using displaced arrays that fill the allocation unit from
  right to left and move them around instead.  Many cool things that
  affect what is efficient and convenient in surprising ways once you
  start using it.

  I did most of this work for a large automotive distributor back in the
  late 80's when I had one of my many "pity I can't find a Lisp I can
  use" periods and had major use for things Lisp makes easy.  The system
  was in use until they went from a dial-in terminal-based system over
  both X.25 and regular phone lines to a web-based thing.  I was greatly
  amused when the new system cost 50 times more money to build (adjusted
  for inflation and computer geek costs), required 30 times more memory,
  processing power, and bandwidth to handle the same load, and almost 20
  times more code to get the same work done.  My system paid for itself
  in 6 months of operation.  I doubt this new system ever will pay for
  itself before it is replaced or run into maintenance costs that will
  keep it in the red for the rest of its life, but it's hot and flashy
  and the users are very happy with the nicely laid-out web pages.  The
  users are also less efficient using it, though.  Modern technology!
  But I digress nostalgically.

#:Erik
-- 
  "When you are having a bad day and it seems like everybody is trying
   to piss you off, remember that it takes 42 muscles to produce a
   frown, but only 4 muscles to work the trigger of a good sniper rifle."
								-- Unknown