Subject: Re: struggling with READ-CHAR and LOOP (accumulating digits)
From: Erik Naggum <erik@naggum.net>
Date: 30 Oct 2000 20:50:25 +0000
Newsgroups: comp.lang.lisp
Message-ID: <3181927825349367@naggum.net>

* Kent M Pitman <pitman@world.std.com>
| As a style thing, I don't.  I regard PEEK-CHAR to be definitionally at the
| mercy of the language designers, and I never use it for anything that is
| not synchronized to likewise be at the mercy of the language designers.

  Well, I consider the definition of "whitespace" to be a language
  issue, so I agree with your reasoning, but the conclusion is the
  opposite because you apparently believe that what's whitespace is an
  environmental issue, in which case it is definitely _wrong_ to write
  your own code that decides what is what.  In other words, it is
  unlikely that graphic-char-p will be any less at the mercy of the
  language designers than what's considered whitespace.

  In particular, I can modify what is considered whitespace through
  the reader tables, but I cannot easily modify what is considered a
  graphic character through any tables.  (I think this is really,
  really bad, however.  graphic-char-p should have a setf method.)

| Suppose the language designers observed that the character "|" was
| underused and decided to make it be whitespace so that people could
| use it as a visible but textually-ignored form of whitespace in
| programs.

  I would counter that the environment in which the data was written
  and the one in which it is read are more likely to differ in what
  is considered whitespace if each is left to decide what it is than
  if both defer the decision to an authority like the language.

| graphic-char-p would not be changed because its definition is not
| dependent on Lisp-meaning, but rather on fontology, so my program
| wouldn't be broken.

  Yes, it would, if the "fontology" changed without your knowing
  between the writing and the reading of that self-same stream of
  characters.  The only way to resolve this issue is to let the data
  itself contain a description of what it considers whitespace, in a
  form that is defined by a language authority.  I took a short-cut on
  that route and decided to defer to the language definition directly.

| But your program would be broken because you have violated the
| abstraction and used peek-char in a case it was never intended for.
| Right now, what you say is "true", but in the terminology of Saul
| Kripke, it would not be "necessarily true".

  But you're even worse off with graphic-char-p.

| Understand: The language designers aren't likely to change anything
| at all at this point, but they could.

  Well, what can I say?  Character sets and encoding is one of my
  specialties, and I distrust the programming population's ability to
  think clearly about the meaning of "character" as opposed to "byte",
  with good reason, I might add: Few fundamental areas of computer
  science have been screwed worse than the most basic: The meaning of
  our information.  Implicit representation of the century was a minor
  thing compared to the implicit representation of the character sets.

| My feeling is analogous to what you see in people who recommend not
| parsing comma-lists by doing
|  (read-from-string (format nil "(~A)" (substitute #\Space #\, comma-list)))

  But that's clearly stupid!  :)  _Perl_ people do that kind of thing.

| It's all about the very nature of abstraction.

  I fully agree.  I want an abstraction of "whitespace" that is
  consistent with the language I use.  If the input does not agree,
  then we sit down and define the input format, but this thing about
  the whitespace arose because the _Lisp_ environment decided what to
  consume and what not to consume from the input stream.  Clearly that
  is _not_ something that comes from the input language, anymore.

| Abstraction is about separating the true from the necessarily true.

  It is necessarily true that if the Lisp environment decided not to
  consume what it considers whitespace, then there is whitespace left
  whose definition is at the discretion of the Lisp environment, and
  you would be in error to assume that you knew better, assuming that
  the Lisp environment had not consumed something that was non-space,
  non-graphic-char-p.
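
  For contrast, a rough sketch of the hand-rolled alternative, which
  skips anything that is a space or fails GRAPHIC-CHAR-P.  The function
  name and the exact rule are assumptions made for illustration, not a
  quotation of anybody's code.

```lisp
;; Hypothetical hand-rolled skipper: consume leading characters that are
;; #\Space or not GRAPHIC-CHAR-P.  Returns the next character without
;; consuming it, or NIL at end of input.
(defun skip-space-and-non-graphic (stream)
  (loop for char = (peek-char nil stream nil nil)
        while (and char
                   (or (char= char #\Space)
                       (not (graphic-char-p char))))
        do (read-char stream)
        finally (return char)))

;; #\Space is itself a graphic character, so it needs its own clause;
;; #\Newline is not graphic and is caught by the GRAPHIC-CHAR-P test.
(with-input-from-string (in (format nil "  ~%  x"))
  (skip-space-and-non-graphic in))
```

  The rule is fixed in the code, so it cannot follow the readtable that
  the Lisp reader used on the same stream a moment earlier.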

| The C language is about blurring those differences, and relying on
| accidental implementational alignments in order to gain a few
| percent in speed or space.  Lisp is about avoiding that.

  Yes!  That's why I rely on the Lisp environment to know which
  characters _it_ did not consume.

| But for whatever reason, that's why I don't use the T argument to
| PEEK-CHAR unless I'm implementing Lisp reader functionality.

  I consider this a case of Lisp reader functionality, because we're
  reading input with the Lisp reader.  That's how the whole need for
  this initial whitespace-gobbling came up, remember?  We're not
  talking about an input stream under total programmer control, here,
  but one which might contain some "whitespace" after the expression
  that caused the function to be called on the same input stream that
  the Lisp reader was, or was not, gobbling from.

| Every time I use any character-related functionality in CL, my very
| first thought is "am I talking about the tools for manipulating
| reader functionality or the tools for manipulating
| fonts/letters/etc. whose properties are maintained separately".

  Yep, me too.  In this case, the Lisp reader decided not to gobble
  the whitespace that was left in the input buffer between the end of
  the expression and the user input.
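
  A small sketch of that situation, with a made-up function name and
  made-up input: READ takes one expression, (peek-char t ...) skips
  whatever the reader left behind as whitespace, and a LOOP with
  READ-CHAR then accumulates the digits that follow.

```lisp
;; Hypothetical example: read one form, then collect the run of digits
;; that follows it on the same stream.
(defun read-form-then-digits (stream)
  (let ((form (read stream)))
    ;; Let the Lisp environment decide what whitespace it did not consume.
    (peek-char t stream nil nil)
    (values form
            (coerce (loop for char = (peek-char nil stream nil nil)
                          while (and char (digit-char-p char))
                          collect (read-char stream))
                    'string))))

;; The closing parenthesis ends the form, so the three spaces after it
;; are still in the stream when READ returns.
(with-input-from-string (in "(setq n 10)   2000 rest")
  (read-form-then-digits in))
```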

| I try to keep these universes very separate, since different
| political regimes control them and I expect those regimes never to
| coordinate.

  Agreed.  That's why it would be wrong to second-guess the whitespace
  gobbler in the Lisp reader by consuming your particular understanding
  of what constitutes whitespace in the input _after_ the Lisp reader
  has been satisfied.

| As long as I work in only one realm or the other, I can weakly hope
| for those regimes to behave in internally consistent ways.

  But in this case, you work in _both_ realms.

| Most people would call me silly or anal for worrying that CL is ever
| going to change on this point, but there ya go...  I beat them to it.

  :)

#:Erik
-- 
  Does anyone remember where I parked Air Force One?
                                   -- George W. Bush