From: Erik Naggum
Subject: Re: 8-bit input (or, "Perl attacks on non-English language communities!")
Date: 1999/02/11
Message-ID: <3127701651609043@naggum.no>
X-Deja-AN: 443109655
References: <79p88l$8h2$1@news.u-bordeaux.fr> <873e4fbawi.fsf@2xtreme.net>
mail-copies-to: never
Organization: Naggum Software; +47 8800 8879; http://www.naggum.no
Newsgroups: comp.lang.lisp

* cbarry@2xtreme.net (Christopher R. Barry)
| Hmmm... MULL (MUlti Lingual Lisp)?

  give me a break.  Common Lisp has all it needs to move to a smart wide
  character set such as Unicode.  we even support external character set
  codings in the :EXTERNAL-FORMAT argument to stream functions.  it's all
  there.  all the stuff that is needed to handle input and output should
  also be properly handled by the environment -- if not, there's no use for
  such a feature since you can neither enter nor display nor print Unicode
  text.

| This isn't going to make users of 8-bit character sets experience
| increased storage overhead for the exact same string objects and a
| performance hit in string bashing functions, now is it?

  there are performance reasons to use 16 bits per character over 8 bits in
  modern hardware already, but if you need only 8 bits, use BASE-STRING
  instead of STRING.  it's only a vector, anyway, and Common Lisp can
  already handle specialized vectors of various size elements.  if it is
  important to separate between STRING and BASE-STRING, I'm sure a smart
  implementation would do the same for strings as the standard does for
  floats: *READ-DEFAULT-FLOAT-FORMAT*.

| On the upside, unicode support could give an additional excuse for Lisp's
| apparent "slowness" in certain situations. In my Java class the
| instructor seems to always bring up unicode support as part of the excuse
| for Java's lousy performance (hmm... this isn't really comforting for
| some reason though...).

  criminy.  can teachers be sued for malpractice?  if so, go for it.
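
  a minimal sketch of the BASE-STRING and :EXTERNAL-FORMAT points, using
  only standard Common Lisp operators.  the names DESCRIBE-STRING,
  READ-FIRST-LINE, *NARROW* and *WIDE* are made up for illustration.
  whether BASE-CHAR and CHARACTER are actually distinct, and how many bits
  each element occupies, is implementation-dependent, so no particular
  output is assumed here.

```lisp
;; Hypothetical helper: report whether S is a BASE-STRING and what
;; element type the implementation actually chose for its storage.
(defun describe-string (s)
  (list (typep s 'base-string)
        (array-element-type s)))

;; A string specialized to hold only BASE-CHAR elements.
(defparameter *narrow*
  (make-string 5 :element-type 'base-char :initial-element #\a))

;; A string able to hold any CHARACTER the implementation supports.
(defparameter *wide*
  (make-string 5 :element-type 'character :initial-element #\a))

;; Hypothetical helper: read one line through an external format.
;; :DEFAULT is the only value the standard names; anything else
;; (for example a UTF-8 designator) is an implementation extension.
(defun read-first-line (path &optional (external-format :default))
  (with-open-file (in path :external-format external-format)
    (read-line in nil nil)))

(print (describe-string *narrow*))
(print (describe-string *wide*))
```

  in an implementation where the two element types differ, the first call
  should report a narrower element type than the second; where they
  coincide, both strings are BASE-STRINGs and nothing is lost.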
#:Erik
-- 
  Y2K conversion simplified: Januark, Februark, March, April, Mak, June,
  Julk, August, September, October, November, December.