Subject: Re: stack allocation of environment
From: (Rob Warnock)
Date: 2 May 2001 08:49:04 GMT
Newsgroups: comp.lang.scheme
Message-ID: <9cohm0$4t496$>
Ji-Yong D. Chung <> wrote:
|     A minor questions on your approach
|     L2 cache is 256k for my Pentium III.  I'd assume that the
| nursery has to be somewhat smaller than that...

Depends on the organization of the cache and the other access patterns
(e.g., is it 1-way, [direct-mapped], 2-way or 4-way set-associative, is
it a unified I- and D-cache or separate, etc.), but a good size to try
might be the same as or (perhaps) half the size of the cache. In any case,
I suggest experimentation with various sizes rather than trying to predict
it up-front.

| I don't know, since L2 cache maybe shared by other processes on my system).

That, too.

| This means lots of copying for my scheme objects, because the collector
| will be triggered quite often as the nursey repeatedly fills up.
| Doesn't this incur extra penalty?

Aha! That's exactly what it *doesn't* do! The whole point of generational
collectors (of which this is the simplest case) is the assumption that
most objects "die" (become inaccessible) very soon after they're created
(e.g., temporary intermediate values, perhaps continuations), so that when
the nursery fills up most of the objects that have been allocated in it
are already dead, so that when you do a minor collection only the very
few "live" (accessible) objects in the nursery need to be copied out to
the current "to" space.

[Now when "to" space finally fills up and you do a full collection, then
*all* live data will need to be copied, but that happens much less often.]

| I have tried to decrease my heap size to be less than L2 cache size.
| From what I can remember,  this degraded the performance.  I am guessing
| that the cost of copying is more than offsetting the gains made by
| reducing the number of cache misses.

Very likely. Plus, you can't just arbitrarily limit the maximum heap.
What happens if you run an inherently large problem?

| Of course, just reducing the heap size is different from your...

(Not "mine". It's a well-known strategy. See any survey of GC techniques.
Especially look for the name Ungar.)

| ... 3 heap-space design.  My test does indicate that increased copying
| does cost extra CPU cycles.

Actually, when it works as intended, generational collection (of which
the 3-space version is an instance) significantly *decreases* copying.


p.s. If the main heap semispaces ("to" & "from") typically contain a *lot*
of live data (much larger than the size of the nursery), then you might
want to consider maintaining an explicit "remembered set" (that is, the
set of "to"-space objects that have had pointers to nursery objects stored
into them) to treat as roots instead of treating the entire used to-space
as roots.  But try the latter first; it's simpler.

Rob Warnock, 31-2-510
SGI Network Engineering		<URL:>
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA