From: Erik Naggum
Subject: Re: Reviews for lisp implementations
Date: 1999/04/18
Message-ID: <3133417387581097@naggum.no>#1/1
X-Deja-AN: 467890546
References: <3714671D.136215D2@singnet.com.sg>
    <3715A6F9.51E830D0@simplex.nl> <3133160386747264@naggum.no>
    <37164EF3.F15982CB@simplex.nl> <7fa9eo$631$1@nnrp1.dejanews.com>
    <3133358604132917@naggum.no> <7fbt07$ffn$1@nnrp1.dejanews.com>
mail-copies-to: never
Organization: Naggum Software; +47 8800 8879; http://www.naggum.no
Newsgroups: comp.lang.lisp

* Vassil Nikolov
| Correct me if I am wrong, but the above (quoted) paragraph does not
| contradict a statement that using 15/15 for a printable character is
| inappropriate.  Or did I miss anything?

yes.  10/0 and 15/15 are characters when the right-hand side of an 8-bit
character set (GR) is filled with a 96-character set.  (the other 32 are
control characters (C1).)  if you had filled it with a 94-character set,
it would have been inappropriate to use 15/15 at all.  the reason for
this is that 10/0 and 15/15 are characters in their own right and must
be coded with 8 bits, but if you use a shifting coding with only 7 bits
and codes to swap between G0 and G1 (both now in GL) with the codes SO
and SI, then it's important that 2/0 and 7/15 remain their usual
semi-control characters even when G1 is invoked.

| I don't understand your point here.

seems I was mistaken about the up/downcasing of I with/without dots.
(shoot, gotta check and go back and fix those files for Emacs.)

| I wondered (as an academic exercise) what should CHAR-UPCASE and
| NSTRING-UPCASE do about LATIN SMALL LETTER Y WITH DIAERESIS (assuming
| STRING-UPCASE is allowed to return a longer string which isn't especially
| nice either).  Signal an error?  Or the implementation would state that
| the character sets it uses do not include this letter?  (Making
| CHAR-UPCASE return two values, like #\I and #\J in this case, appears
| more than perverse, though who knows.)
I have come to think that people who use sick writing systems should pay
for their own mistakes so they will have reason to fix them.  forcing
everybody else to pay for them only causes software not to be available.
e.g., the Spanish purportedly undid the silly sorting requirements of ll
(treated as a separate "letter" between l and m, I think it was) due to
the force of simplicity and logic of computers (or was it marketing :).
a German spelling reform (which people seem to hate rather strongly)
does away with the sharp s and spells it "ss" in lowercase, too.  the
Norwegian and Danish sillitude of sorting "aa" as equivalent to "å"
(a ring), and the hysterical requirement that German spelled out with
"ue" instead of "ü" should be sorted as if it wasn't spelled out are
examples of morons who got into standards bodies.  (now, the right way
to do this is to store a sort key and a print string, but since people
don't use tools easily extendible that way, forcing stupid people to do
this causes a lot of grief and problems when they try to print the sort
key or vice versa.)

anyway, let's just ignore the issue and ask them to spell it out as ij,
like the Dutch correctly do.  (the ÿ is Belgian, _from_ Dutch ij.)  (I'm
not sure upcasing "ij" to "IJ" is all that great an idea, although it is
obvious if you look at fonts designed in or for The Netherlands: they
sport "ij" and "IJ" ligatures, just as fonts designed for Norway have a
ligature for "fj" just like "fi", because of "fjord" and "fjell".)

anyway.  8 bits would have been enough if we had been using floating
diacritics, and upcasing and downcasing would have needed to worry about
A-Z only.  ISO tried that, too (ISO 6937), but computer people were not
able to appreciate it, because they were thinking fonts, not character
sets.  sigh.  if there's reincarnation, I hope I won't remember any of
this the next time around.

#:Erik
--
environmentalists are much too concerned with planet earth.  their
geocentric attitude prevents them from seeing the greater picture --
lots of planets are much worse off than earth is.