Subject: Re: LISP vs HASKELL vs PROLOG
From: rpw3@rpw3.org (Rob Warnock)
Date: Mon, 13 Jul 2009 19:41:45 -0500
Newsgroups: comp.lang.lisp
Message-ID: <z-ydnR_NeaXUScbXnZ2dnUVZ_hWdnZ2d@speakeasy.net>
[Cross-posting trimmed to groups I read...]

Larry D'Anna  <larry@elder-gods.org> wrote:
+---------------
| Jon Harrop <jon@ffconsultancy.com> wrote:
| > Larry D'Anna wrote:
| >> Johannes Laire <johannes.laire@gmail.com> wrote:
| >>>> That is O(n) with GHC.
| >>>
| >>> It is O(1) for unboxed arrays.
| >> 
| >> I thought it was O(1) for boxed arrays too, but I tested it and defiantly
| >> is not.  Any idea why boxed vs. unboxed should make a difference?
| >
| > The write barrier incurred when mutating a boxed element in an array dirties
| > the whole array and that causes the GC to traverse the whole array at every
| > minor/gen0 GC, which is periodic (separated by constant amounts of time for
| > the purposes of this description). That dirtying is not necessary for
| > unboxed elements so you do not see the asymptotic performance degradation
| > with them, e.g. the STUArray of Int in your example.
| 
| Why in the world does it have to dirty the whole array?  Is there any sane
| reason for that or is this just an unfortunate deficiency of GHC?  I just
| can't believe that an array update is O(n).  But but my tests updating a
| STArray really is O(n) in GHC.  I'm a bit shocked.
+---------------

If GHC were using a generational GC with a software card-marking
write barrier, then updating a single element in an array of boxed
elements should at most dirty a card's worth of elements. According
papers I've read on this, good "card" sizes these days are ~256 bytes
[or at least somewhere in the 128-1024 byte range], which on a 32-bit
machine would mean an extra 64 elements to scan during the first GC
following the update [but possibly no extra work thereafter, if the
updated value gets promoted into the same generation as the array].
This would therefore be an O(1) cost w.r.t. to the size of the array,
though O(c) for "c" the number of distinct cards updated. [That is,
doing a zillion updates to all of the elements of one card costs
no more GC overhead than doing a single update to that same card.]

So all you're saying is that GHC isn't using that kind of GC.  ;-}


-Rob

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607