From ...
From: Erik Naggum <clerik@naggum.no>
Subject: Re: Vector help
Date: 1998/02/19
Message-ID: <3096859997279323@naggum.no>
X-Deja-AN: 326500088
References: <MPG.f551f0c75b5b6d3989684@newsreader.wustl.edu>
mail-copies-to: never
Organization: Naggum Software; +47 8800 8879; http://www.naggum.no
Newsgroups: comp.lang.lisp


* Kevin Goodier
| I have a list of length 32, and I'm going to be altering a couple of
| elements in it multiple times (meaning I'll be making many many copies of
| it).  I'm thinking that a simple vector would be the most efficient (i'm
| looking for FASTEST) way to go, but I can't for the life of me find a
| built-in function to convert a list to a vector.  Is there an easy and
| efficient way to do this??

  I think

(make-array 32 :initial-contents <list>)

  is the fastest and easiest way to convert a list of known length to a
  vector of known size.  I also have reason to expect

(make-array (length <list>) :initial-contents <list>)

  to be faster than the other suggestions in the unknown length case.

| The list is being given to me, so I can't just create a vector instead 
| (unfortunately).  Is a simple vector going to be significantly faster?

  well, there's faster and there's faster.  accessing element n in a list
  is an O(n) operation in the general case, whereas accessing element n in
  a vector is an O(1) operation.  the special case of accessing element n+1
  when you last accessed element n is O(1) in both cases if you code well.

  however, there are a few minor issues that affect the constants of these
  upper limits to execution time.  stepping over a list with DOLIST will
  need to keep track of two values, your variable and an (opaque) variable
  that holds the tail, and you will need to make two memory references, but
  all the parameters of these operations are known, so you will not make
  any function calls to obtain the CAR and CDR of the successive CONS cells
  -- these accessors will be open-coded, the test for NIL is trivial, and
  the two values typically go in registers.  stepping over a vector of
  unknown size with DOTIMES will need to keep track of the index into the
  vector, test it against the size (with a function call), and will have to
  do generic (as opposed to type-specialized) addition every time the index
  is incremented, which is likely a function call.  on top of this, you are
  likely to force a function call to AREF to access the element, which will
  make another test for the size of the array, forcing several more memory
  references in this "faster" case.  it may turn out that you could have
  accesses two or three successive elements in a list by the time you have
  accessed the first slot in a vector, at least as far as the traversal
  skeleton is concerned.

  if you want the most out of a vector traversal, you need to declare the
  index to be a fixnum, declare the type and your best effort at the
  dimension of the vector, and use the more specialized SVREF instead of
  the more generic AREF if you can.  then you need to compile the code with
  a high SPEED and probably a low SAFETY setting to force open-coding of
  all of these functions.  in this case it will be as fast as the hardware
  can bear, possibly even taking advantage of instructions that decrement a
  counter register and branch on zero instead of incrementing, comparing,
  and branch on equal.  trust me, all of this stuff adds up real quick.

  e.g., Allegro CL 4.3.1 expand (dotimes (i 100) (aref <vector> i)) into

(do ((i 0 (1+ i)))
    ((>= i 100))
  (declare (type (integer 0 100) i))
  (aref <vector> i))

  but (dotimes (i (length <vector>)) (aref <vector> i)) is expanded into

(do ((#:g140 (length <vector>))
     (i 0 (1+ i)))
    ((>= i #:g140))
  (aref <vector> i))

  so you would need to add your declaration to get equal code.

  speed in Lisp is a difficult issue.  it's hard to predict when the
  compiler knows or doesn't know enough to generation the fastest available
  code.  sometimes, what works in the default cause is very fast, such as
  in DOLIST, which knows all it needs to know, and sometimes it is not at
  all fast, such as in the DOTIMES/AREF case, where the compiler needs to
  know far more about your intentions than it is possible to communicate
  with these simple forms.

  however, if you're lucky, LOOP is heavily optimized in your compiler and
  may do the right thing in both cases.  LOOP/IN traverses a list and
  LOOP/ACROSS traverses a vector.  however, I don't know of a way to make
  LOOP/ACROSS use SVREF instead of AREF when I know that the vector is
  simple.  any takers?

  anyway, if you are concerned with speed, you have to consider your
  hardware and your compiler, which to some people seem "un-Lispy", but
  that is only because it is more natural to write correct code first than
  to optimize the living daylight out of incorrect code.  the compiler
  cannot know your intentions unless you tell it, but it is free to assume
  a lot worse than you might expect.  (and strong typing is _not_ the
  answer -- it's much worse to pretend to illuminate the compiler when
  you're still in the dark yourself than to keep the compiler in the dark
  with you.)  most Lisp compilers come with profilers, call counters, and
  other tools to help you decide when, where, and how to optimize.  this
  way you can make an informed estimate of the expected performance gains
  instead of flying by the seat of your pants.  that said, we've had a few
  very interesting exchanges in this newsgroup in the past that highlighted
  the vastly different approaches to speeding up Lisp code, so don't assume
  that there's a "last word" on this.

  hope this helps.

#:Erik
-- 
  God grant me serenity to accept the code I cannot change,
  courage to change the code I can, and wisdom to know the difference.