Subject: Re: reducing number consing
From: rpw3@rpw3.org (Rob Warnock)
Date: Mon, 22 Dec 2003 05:21:17 -0600
Newsgroups: comp.lang.lisp
Message-ID: <1oydnS6dGuAwTnuiRVn-tA@speakeasy.net>
Joe Marshall  <prunesquallor@comcast.net> wrote:
+---------------
| Luke Gorrie <luke@bluetail.com> writes:
| > It also appears that on some systems (i.e. my laptop) the garbage
| > collector causes a great deal of kernel work. Smarter use of system
| > calls may fix this, e.g. coalescing mprotect()s of adjacent pages. 
| 
| Some experiments have shown that checking the write-barrier in
| software by doing a conditional branch on every write can outperform
| hardware checking based on mprotect. 
+---------------

I remember reading some papers about that a while back. One in particular,
<URL:http://citeseer.nj.nec.com/hosking92comparative.html>, examined
various kinds of write barriers and concluded that a software write
barrier using card marking (with fairly small cards, 256-512 bytes)
out-performed kernel/VM page-based hardware write barriers:

	The page trapping scheme performed poorly in comparison to
	card marking. Interestingly, this does not appear to be due to
	the overhead of fielding page traps, since that is included in
	running time, which was not significantly higher (and often lower)
	than in the card marking schemes. Rather, it is because pages
	are too large a granule so they miss the optimum card size.

That is, a single write means that the whole page gets "dirtied"; you have
to scan the entire page at the next GC. Whereas with a card-marking system
you only have to scan the (presumably much smaller) card that was dirtied.

	We can summarize the conclusions as follows: a card size of 256
	or 512 bytes appears optimal for card marking on this hardware;
	page trapping was surprisingly effective, but is not the best
	scheme because its granularity is too large; and remembered sets[1]
	are best overall.


-Rob

[1] "Remembered sets" are yet another way of doing software write barriers,
    providing an even more precise record of the significant stores:

	...associates a remembered set with each generation, recording
	the objects or locations in older generations that may contain
	pointers into that generation. Any pointer store that creates
	a reference from an older generation to a younger generation
	is recorded in the remembered set for the younger generation.
	At scavenge time the remembered sets for the generations being
	scavenged include the heap root set for the scavenge.
	...
	Our remembered sets are implemented as circular hash tables
	using linear hashing. ...

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607