Subject: Re: Name for the set of characters legal in identifiers
From: Erik Naggum <erik@naggum.no>
Date: 14 Jan 2004 08:22:42 +0000
Newsgroups: comp.lang.lisp
Message-ID: <3283057362064279KL2065E@naggum.no>

* Russell Wallace
| Thanks for the explanation - okay, so basically any character _can_
| be part of a symbol... fair enough... my question is really about
| the English terminology, though.

  The terminology is really pretty simple, but you have to look at it
  from the right angle.  In languages that require identifiers to be
  made up of particular characters, there is obviously a name for the
  character set, but in a language that goes out of its way to make it
  possible to use absolutely any character you want, there are only
  names for those characters that need special treatment to become
  part of a symbol name because their "normal" function is not to.

| Whereas if you write...
| 
|  (defun )(')( ...)
| 
| That won't work; (, ) and ' are "punctuation" (?) and normally
| recognized by the reader as special characters.

  Well, they are known as "macro characters".  The important thing is
  that the set of macro characters is not defined by the language, but
  by the readtable in effect when the Common Lisp reader processes
  your source.  There is a standard readtable, however, and one would
  have to say "unescaped terminating macro characters in the standard
  readtable" or another phrasing that tries to hide the obvious anal
  retentiveness to really speak about the characters that will not be
  part of a symbol name unless you have changed the rules.  There is
  nothing particularly special about any of these macro characters.
  There are some restrictions on what the readtable can do and how the
  reader collects characters into symbol names.  If you really insist,
  calling them "constituent characters" will help, but realize that
  this property is a result of falling through every other test --
  unless it is escaped, in which case it wins its constituency right
  away.  (There's an awful pun waiting to happen here, about Iowa, but
  I'll ignore the temptation.)
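
  As a minimal sketch of the point that the readtable, not the
  language, decides what is a macro character, one can ask the
  readtable directly.  MACRO-CHAR-KIND and STANDARD-KINDS are
  hypothetical helper names made up for this illustration; only
  GET-MACRO-CHARACTER and WITH-STANDARD-IO-SYNTAX are standard.

```lisp
;; Hypothetical helper: classify CHAR according to READTABLE.
;; GET-MACRO-CHARACTER returns the reader macro function (or NIL)
;; and, as a second value, whether the character is non-terminating.
(defun macro-char-kind (char &optional (readtable *readtable*))
  (multiple-value-bind (fn non-terminating-p)
      (get-macro-character char readtable)
    (cond ((null fn) :not-a-macro-character)
          (non-terminating-p :non-terminating)
          (t :terminating))))

;; Hypothetical helper: classify every character of STRING with
;; *READTABLE* bound to the standard readtable.
(defun standard-kinds (string)
  (with-standard-io-syntax
    (map 'list #'macro-char-kind string)))

(standard-kinds "()'#a")
;; => (:TERMINATING :TERMINATING :TERMINATING
;;     :NON-TERMINATING :NOT-A-MACRO-CHARACTER)
```

  With a different readtable in effect, the same calls may well give
  different answers, which is the whole point.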

| (I'm talking about the normal case, not what you can persuade the
| reader, interner or whatever to do if you try hard enough :))

  While this may seem reasonable from the angle you chose to look at
  this problem, it is the a priori reasonability of the position that
  has produced your problem.  It is in fact unreasonable to approach
  Common Lisp from this angle.  The problem does not exist.  This

  (defun |)(')(| ...)

  is in fact fully valid Common Lisp code.  You cannot define away the
  solution to the problem and insist that you still have a problem in
  need of an answer.
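
  A small sketch of what the reader does with that name, assuming the
  standard readtable is in effect.  The variable *ODD-NAME* and the
  function body are made up for the illustration.

```lisp
;; Read the bar-quoted token with the standard readtable in effect.
;; The vertical bars escape everything between them.
(defparameter *odd-name*
  (with-standard-io-syntax
    (read-from-string "|)(')(|")))

(symbol-name *odd-name*)           ; => ")(')("
(length (symbol-name *odd-name*))  ; => 5

;; A symbol with that name is an ordinary function name.
(defun |)(')(| (x)
  (* 2 x))

(|)(')(| 21)                       ; => 42
```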

| So there's "whitespace", "punctuation" and... what's the third
| category called? Not "alphanumeric"... "constituent characters"?

  I have to zoom out and ask you what you would do with the elusive
  name for this category.  If I guess correctly at your intentions, I
  would perhaps have said that "any character can be part of a symbol
  name, but most macro characters need to be escaped to prevent them
  from having their macro function".  (The important exception is #,
  the only non-terminating macro character in the standard readtable,
  meaning that #xF will be interpreted as a hexadecimal number, but
  F#x is a three-character-long symbol name with a # in it.)
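
  The difference is easy to see by reading both tokens, again
  assuming the standard readtable.  The variable names are made up
  for the illustration, and the symbol name comes back in upper case
  because the standard readtable folds unescaped characters to upper
  case.

```lisp
;; A leading # triggers the dispatching macro character, so the
;; token is read as a number in radix 16.
(defparameter *hash-first*
  (with-standard-io-syntax (read-from-string "#xF")))

;; Inside a token, # is non-terminating and simply becomes part of
;; the symbol name.
(defparameter *hash-inside*
  (with-standard-io-syntax (read-from-string "F#x")))

*hash-first*                          ; => 15
(symbol-name *hash-inside*)           ; => "F#X"
(length (symbol-name *hash-inside*))  ; => 3
```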

  Unless you have a simple need that can be resolved by a nice, vague
  explanation that only informs your reader that Common Lisp is a lot
  different from languages that require particular characters in the
  names of identifiers/symbols, I think Chapter 23 in the standard, on
  the Common Lisp Reader, would be a really good suggestion right now.

  Yeah, I'm back all right, with undesirably high levels of precision,
  scaring away frail newbies from day one.  Maybe I'll go hibernate.

-- 
Erik Naggum | Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.