Subject: Re: Pause for keystroke
From: Erik Naggum <erik@naggum.net>
Date: Sat, 06 Oct 2001 13:19:20 GMT
Newsgroups: comp.lang.lisp
Message-ID: <3211363159251860@naggum.net>

* Andras Simon
| And what if you want to read (as opposed to read-line) something, but
| newline is OK, too? I'm asking this because I'm slightly irritated by
| having to type something to CMUCL's toplevel in order to get back the
| prompt if the previous output wasn't terminated by a newline.

  This is not a trivial problem, unfortunately.  The most appropriate way
  to read Common Lisp input from an untrusted source is to obey the rules
  of the source, and if that is lines instead of expressions, you need to
  work with prompts and various forms of input editing.  E.g., it would be
  nice if the prompt could contain an indication of unclosed delimiters, as
  well as some way to discard the unfinished form.  The only time it is
  appropriate to call the function read directly is when you "know" that
  the source will contain a Common Lisp expression, or, in other words,
  when you are prepared to deal with errors coming from violating that
  knowledge.  (This is why it is a very good idea to use Emacs interfaces
  to Common Lisp environments.)

  E.g., if you want to read interactive user input, I think the appropriate
  way to do this is to collect a syntactically valid form first, _then_
  process it.  Many input processors tend to be built around the assumption
  that it is easier to backtrack than to validate before processing.  Much
  of the parsing literature that exists unquestioningly _assumes_ that the
  _only_ way to get any match at all is to confuse the validation and
  parsing processes.  This is in part caused by the largely unfounded
  belief in context-free grammars, which have many strongly appealing
  theoretical aspects, but also a large number of negative human factors
  that detract from readability and processability.  (The influence of
  these bad theories on the retarded notion of "ambiguity" in SGML/XML/etc
  has caused a large increase in the cost of designing document types and
  applications, not the least that of educating/training/hurting developers
  and users alike so they stop wanting something completely reasonable.)
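
  A minimal sketch of the collect-then-process idea, assuming a
  line-oriented terminal.  All three function names are hypothetical.  The
  scanner is deliberately simplified: it knows only parentheses,
  double-quoted strings, backslash escapes and ; comments, and it ignores
  #| |# block comments and |quoted| symbols.  The prompt shows the number
  of unclosed parentheses, and an empty line discards an unfinished form.

```lisp
(defun scan-depth (text)
  "Return the number of unclosed parentheses in TEXT and, as a second
value, true if TEXT ends inside a string.  A negative depth means a
stray closing parenthesis."
  (let ((depth 0) (in-string nil) (escaped nil) (in-comment nil))
    (loop for ch across text
          do (cond (escaped (setf escaped nil))
                   (in-comment
                    (when (char= ch #\Newline) (setf in-comment nil)))
                   ((char= ch #\\) (setf escaped t))
                   (in-string
                    (when (char= ch #\") (setf in-string nil)))
                   ((char= ch #\") (setf in-string t))
                   ((char= ch #\;) (setf in-comment t))
                   ((char= ch #\() (incf depth))
                   ((char= ch #\)) (decf depth))))
    (values depth in-string)))

(defun collect-form (&optional (in *standard-input*) (out *standard-output*))
  "Gather lines from IN until the delimiters balance.  Return the text
and a status keyword, or NIL and the reason no text was collected."
  (let ((text ""))
    (loop
      (multiple-value-bind (depth in-string) (scan-depth text)
        (cond ((minusp depth)
               (return (values nil :unbalanced)))
              ((and (zerop depth) (not in-string) (plusp (length text)))
               (return (values text :complete))))
        ;; The prompt carries the count of unclosed parentheses.
        (format out "~&~D> " depth)
        (force-output out)
        (let ((line (read-line in nil nil)))
          (cond ((null line) (return (values nil :eof)))
                ;; A bare newline is acceptable: it yields no form.
                ((string= line "") (return (values nil :discarded)))
                (t (setf text (concatenate 'string text line
                                           (string #\Newline))))))))))

(defun read-interactive-form (&optional (in *standard-input*)
                                        (out *standard-output*))
  "Validate first with COLLECT-FORM, then let the reader process the text."
  (let ((text (collect-form in out)))
    (if text
        (let ((*read-eval* nil))        ; the input is untrusted
          (read-from-string text nil :no-form))
        :no-form)))
```

  Only the first form on a collected line is read here; a fuller version
  would keep the remaining text for the next call.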

  Part of the problem of this mode of thinking is that most streams are
  very naively implemented one-pass structures.  E.g., if you want to
  collect a line of input, you copy characters from the stream (buffer) to
  some (other) buffer while looking for a line terminator character.  If
  you could instead push a "mark" on the stream (buffer), tell the stream
  to skip characters until a line terminator was seen, and return with it,
  you could extract the portion of the buffer from the mark until the
  current position, if you needed to: it should also be possible to refer
  to the characters in the stream buffer via a displaced array.  Naturally,
  the simple-minded single-buffer approach to buffering input and output is
  also at fault.  As long as you refill the same buffer with new data, you
  cannot work with buffer marks.  (Neither can you ask the operating system
  to kindly pre-fill the next buffer while you are doing something else, so
  you end up with a _guessing_ operating system and completely unnecessary
  delays at the buffer edges.)  Designing an SGML entity manager and parser
  around these ideas (in 1993-1994) caused a dramatic 5-fold speed increase
  over the naive C stream implementation.
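
  The mark idea can be sketched as follows, assuming for simplicity that
  all the data already sits in one string that is never refilled.  The
  structure and function names are hypothetical and belong to no stream
  implementation.  The line is never copied: the region is a displaced
  array that shares storage with the buffer.

```lisp
(defstruct (marked-buffer (:constructor make-marked-buffer (data)))
  (data "" :type string)     ; the buffer contents, never refilled here
  (pos 0 :type fixnum)       ; current read position
  (marks '() :type list))    ; stack of saved positions

(defun buffer-push-mark (buf)
  "Remember the current position of BUF on its mark stack."
  (push (marked-buffer-pos buf) (marked-buffer-marks buf)))

(defun skip-to-line-end (buf)
  "Advance BUF past the next line terminator, or to the end of the data,
without copying any characters."
  (let* ((data (marked-buffer-data buf))
         (nl (position #\Newline data :start (marked-buffer-pos buf))))
    (setf (marked-buffer-pos buf) (if nl (1+ nl) (length data)))))

(defun region-since-mark (buf)
  "Pop the latest mark and return the characters between it and the
current position as an array displaced into the buffer."
  (let ((start (pop (marked-buffer-marks buf)))
        (data (marked-buffer-data buf)))
    (make-array (- (marked-buffer-pos buf) start)
                :element-type (array-element-type data)
                :displaced-to data
                :displaced-index-offset start)))

;; Usage: skip the first line, mark, skip the second, extract it.
(let ((buf (make-marked-buffer (format nil "first line~%second line~%"))))
  (skip-to-line-end buf)
  (buffer-push-mark buf)
  (skip-to-line-end buf)
  (region-since-mark buf))   ; a string holding the second line and its newline
```

  Refilling the same string with new data would invalidate every
  outstanding mark and region, which is why this approach needs more than
  one buffer.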

  So much interesting work remains in the Common Lisp reader if one wants
  to support a better interactive environment, and that includes _much_
  better error recovery when reading Common Lisp files that have not been
  produced by a competent expression-oriented environment like Emacs with
  an intelligent user.  The common way of de-coupling the input processing
  from the "terminal" also leaves much to be desired.  Those who remember
  TOPS-20's command line processor will know what I mean and miss, but
  others may need to have several layers of blinders removed after only
  having been exposed to the ultra-primitive Microsoft command line and
  only somewhat better Unix command line, especially if they think that GNU
  readline is an improvement.  A typewriter remains a typewriter no matter
  how much chrome you add to it, and Unix even has an error message to tell
  you that you have violated its assumptions: "ENOTTY -- Not a typewriter".
  Unfortunately, nigh the whole world is now duped into thinking that silly
  fill-in forms on web pages are the way to do user interfaces.  It is not
  unlikely that this is a sort of improvement over the typewriter, but that
  is about all there is to it.

///