Subject: Re: Back to character set implementation thinking
From: Erik Naggum <erik@naggum.net>
Date: Tue, 26 Mar 2002 06:06:55 GMT
Newsgroups: comp.lang.lisp
Message-ID: <3226111629531727@naggum.net>

* cr88192 <cr88192@hotmail.com>
| sorry, I don't really know of byte sizes other than 8...
| am I missing something?

  Yes.  A "byte" is only a contiguous sequence of bits in a machine word,
  and has been used that way by most vendors, for us notably DEC, which
  contributed the machine instructions we know as LDB and DPB and the
  notion of a byte specifier, which has bit position in word and length in
  bits.  Failure to support LDB and DPB in hardware is very costly for a
  large number of useful operations, but in a byte-addressable world
  with 8-bit bytes, using anything smaller than bytes that might cross byte
  boundaries has serious penalties.  In a word-addressable world, packing
  arbitrary-size bytes into words saves a lot of memory, even relative to
  the byte-addressable machines.  C
  has bit fields because it was intended to run on the Honeywell 6000, which
  had 36-bit words, so its "char" was 9 bits wide.  (See page 34 of
  Kernighan & Ritchie, 1st ed.)
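
  To make the byte specifier concrete, here is a minimal sketch in Python
  of what LDB and DPB compute when done in software.  The names byte_spec,
  ldb, dpb, and pack_chars are illustrative helpers invented for this
  example; only the argument order (size, then position) follows the
  Common Lisp BYTE, LDB, and DPB functions.  The 36-bit word and the 9-bit
  character width are taken from the text above.

```python
WORD_BITS = 36  # word size assumed here, as on the PDP-10 and Honeywell 6000

def byte_spec(size, position):
    """Hypothetical byte specifier: field width in bits and the bit
    offset of its low-order bit, counted from the low end of the word."""
    return (size, position)

def ldb(spec, word):
    """Load byte: extract the field described by spec from word."""
    size, position = spec
    return (word >> position) & ((1 << size) - 1)

def dpb(value, spec, word):
    """Deposit byte: return word with the field described by spec
    replaced by the low-order bits of value."""
    size, position = spec
    mask = ((1 << size) - 1) << position
    return (word & ~mask) | ((value << position) & mask)

def pack_chars(text, size=9):
    """Pack characters left to right into one word, size bits each.
    Characters that do not fit in the word are ignored."""
    per_word = WORD_BITS // size
    word = 0
    for i, ch in enumerate(text[:per_word]):
        position = WORD_BITS - size * (i + 1)
        word = dpb(ord(ch), byte_spec(size, position), word)
    return word

word = pack_chars("LISP")                  # four 9-bit bytes fill the word
third = chr(ldb(byte_spec(9, 9), word))    # third character from the left
```

  With size=7, the same sketch packs five characters per word and leaves
  the low-order bit unused, which is the memory saving a word-addressable
  machine gets from bytes that need not be 8 bits wide.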

  IBM chose a more specific terminology: 4-bit nybbles (the same spelling
  deviation as "byte" from "bite"), 8-bit bytes, 16-bit half-words, 32-bit
  words, and 64-bit double-words.  On the PDP-10, we had 36-bit words,
  18-bit half-words (and halfword instructions), but bytes were all over
  the place.  I know several people who think this is a much better design
  than the stupid 8-bit design we have today.  Sadly, only several, not
  millions and millions who think Intel's designs are better just because
  they can buy them.

///
-- 
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.