From: Erik Naggum
Newsgroups: comp.lang.lisp
Subject: Re: Back to character set implementation thinking
Date: Tue, 26 Mar 2002 06:06:55 GMT
Message-ID: <3226111629531727@naggum.net>
References: <1016831590.163240@haldjas.folklore.ee> <3226061533844203@naggum.net> <87ofhczdat.fsf_-_@becket.becket.net> <3226095271716329@naggum.net>
Organization: Naggum Software, Oslo, Norway

* cr88192
| sorry, I don't really know of byte sizes other than 8...
| am I missing something?

Yes. A "byte" is only a contiguous sequence of bits in a machine word,
and has been used that way by most vendors, for us notably DEC, which
contributed the machine instructions we know as LDB and DPB and the
notion of a byte specifier, which has bit position in word and length in
bits. Failure to support LDB and DPB in hardware is very costly for a
large number of useful operations, but in a byte-addressable world with
8-bit bytes, using anything smaller than bytes that might cross byte
boundaries has serious penalties. In a word-addressable world,
arbitrary-size bytes save a lot of memory, even relative to the
byte-addressable machines. C has bit fields because it was intended to
run on the Honeywell 6000, which had 36-bit words, so its "char" was
9 bits wide. (See page 34 of Kernighan & Ritchie, 1st ed.)
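
To make the byte specifier concrete, here is a rough sketch in Python of what LDB (load byte) and DPB (deposit byte) do. It only mimics the Common Lisp argument order; the names ByteSpec, pack_chars and unpack_chars are made up for this illustration, the position is assumed to count from the low-order end of the word, and packing the leftmost character into the high-order bits is likewise an assumption, not a description of any particular machine.

```python
from typing import NamedTuple


class ByteSpec(NamedTuple):
    """Hypothetical stand-in for a byte specifier: length and position."""
    size: int      # length of the byte in bits
    position: int  # number of bits to the right of the byte in the word


def ldb(spec: ByteSpec, word: int) -> int:
    """Load byte: extract the field described by spec from word."""
    mask = (1 << spec.size) - 1
    return (word >> spec.position) & mask


def dpb(value: int, spec: ByteSpec, word: int) -> int:
    """Deposit byte: return a copy of word with the field set to value."""
    mask = (1 << spec.size) - 1
    cleared = word & ~(mask << spec.position)
    return cleared | ((value & mask) << spec.position)


WORD_BITS = 36  # word size used for the illustration
CHAR_BITS = 9   # four 9-bit bytes fill one 36-bit word


def pack_chars(codes):
    """Pack up to four 9-bit codes into one word, leftmost code highest."""
    word = 0
    for i, code in enumerate(codes):
        spec = ByteSpec(CHAR_BITS, WORD_BITS - CHAR_BITS * (i + 1))
        word = dpb(code, spec, word)
    return word


def unpack_chars(word):
    """Extract the four 9-bit bytes of a word, leftmost first."""
    return [
        ldb(ByteSpec(CHAR_BITS, WORD_BITS - CHAR_BITS * (i + 1)), word)
        for i in range(WORD_BITS // CHAR_BITS)
    ]


word = pack_chars([ord(c) for c in "LISP"])
print(unpack_chars(word))
```

Note that ByteSpec(3, 7) describes a 3-bit field occupying bits 7 through 9, which straddles what would be an 8-bit boundary. In software that costs a shift and a mask per access, which is the kind of work a hardware LDB or DPB does in one instruction.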

IBM chose a more specific terminology: 4-bit nybbles (the same spelling
deviation as "byte" from "bite"), 8-bit bytes, 16-bit half-words, 32-bit
words, and 64-bit double-words. On the PDP-10, we had 36-bit words and
18-bit half-words (and halfword instructions), but bytes were all over
the place. I know several people who think this is a much better design
than the stupid 8-bit design we have today. Sadly, only several, not the
millions and millions who think Intel's designs are better just because
they can buy them.

///
-- 
In a fight against something, the fight has value, victory has none.
In a fight for something, the fight is a loss, victory merely relief.