From: Erik Naggum <erik@naggum.net>
Newsgroups: comp.lang.lisp
Subject: Re: chain of transformations
Date: 25 Jul 2002 03:49:44 +0000
Organization: Naggum Software, Oslo, Norway
Lines: 174
Message-ID: <3236557784646741@naggum.net>
References: <3D3EE9D5.CBB83510@juno.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: oslo-nntp.eunet.no 1027568985 14503 193.71.199.50 (25 Jul 2002 03:49:45 GMT)
X-Complaints-To: abuse@KPNQwest.no
NNTP-Posting-Date: 25 Jul 2002 03:49:45 GMT
Mail-Copies-To: never
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
Xref: archiver1.google.com comp.lang.lisp:36788

* Jeff Sandys
| I often implement data transformations that are modeled 
| on sequenced steps of processes: a -> b -> c -> d -> e ...
| 
| It is easy to create and debug each transformation step,
| as in (defun a2b (in) ...,  but then I end up with the 
| final combined transformation as:
| 	(e2f (d2e (c2d (b2c (a2b in)))))
| that looks kind of weird.
| 
| Is there a better lisp idiom for a chain of transformations?

  This may be a case for Common Lisp's unique features, for which it seems it
  gets harder to argue in the presence of those who appear to believe that
  syntax and convenience are entirely irrelevant, but more on that below.

  The standard idiom, functional composition inside-out read from right to
  left, may be hard to read and follow for a series of transformations, so you
  may want to express it entirely differently, like you initially wrote it:

[ in -> a2b -> b2c -> c2d -> d2e -> e2f ]

  You could do this with an ordinary macro, too:

(transform in -> a2b -> b2c -> c2d -> d2e -> e2f)

  where the symbol -> is a mere noise word or marker.  If you find yourself
  programming with such transformations, both order and notation may be shaped
  to your convenience with reader macros and supporting code, to make it easier
  for you to see effortlessly that you have written the code correctly.  But if
  you do not have enough of these forms in your code, the cost of an unusual
  form will outweigh the convenience, and the burden of understanding the small
  changes you have made to the syntax will cause a maintainer to remove it.
  (Much like someone who uses the term "hapax legomenon" will probably use it
  only once -- and immediately regret having used up that joke for good.)
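
  As a rough sketch of how both notations might be implemented (none of this
  code is from the original exchange): the names ARROW-P, EXPAND-CHAIN,
  READ-CHAIN and *CHAIN-READTABLE* are made up for illustration, and each step
  is assumed to be the name of a one-argument function.

```lisp
;; Hypothetical helpers, wrapped in EVAL-WHEN so the macro can call them
;; when this file is compiled.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun arrow-p (x)
    "True if X is a symbol named \"->\", whatever its package."
    (and (symbolp x) (string= (symbol-name x) "->")))

  (defun expand-chain (items)
    "Turn the list (in -> f -> g) into the nested call (g (f in))."
    (let ((form (first items)))
      (loop for (arrow step) on (rest items) by #'cddr
            do (unless (and (arrow-p arrow) step)
                 (error "Malformed transformation chain: ~S" items))
               (setf form (list step form)))
      form)))

;; The ordinary macro: (transform in -> a2b -> b2c) => (b2c (a2b in))
(defmacro transform (&rest chain)
  (expand-chain chain))

;; The reader-macro variant: [ in -> a2b -> b2c ]
(defun read-chain (stream char)
  (declare (ignore char))
  (expand-chain (read-delimited-list #\] stream t)))

;; Install the syntax in a private readtable rather than the global one,
;; so that code which does not ask for it never sees it.
(defvar *chain-readtable*
  (let ((readtable (copy-readtable nil)))
    (set-macro-character #\[ #'read-chain nil readtable)
    ;; Make ] terminate tokens the same way ) does.
    (set-macro-character #\] (get-macro-character #\) nil) nil readtable)
    readtable))

;; (transform 3 -> 1+ -> isqrt) expands to (isqrt (1+ 3)) and returns 2.
```

  Keeping the bracket syntax in its own readtable is one way to limit the
  surprise for a maintainer; binding *READTABLE* to *CHAIN-READTABLE* around
  the files that use it is left to the reader.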

  If you look for a "Lisp idiom", I think you have preordained your solution
  and constricted your options prematurely.  If you look for a way to use Lisp
  to do precisely what you have in mind, in the way you already think, and you
  are reasonably certain that you "think Lisp", I see no problems with creating
  a mini-language more suited for the job than more long-hand notations.
  However, the fear that you may not "think Lisp" well enough just because you
  want to use an (evil?) infix syntax for a clearly defined purpose may be
  psychologically stultifying.  The key is to maintain a high and consistent
  level of aesthetics and not engage in rabid excesses or purposeless changes.
  For instance, when I had to deal with C++ a long time ago, the desire to
  redesign the language was overpowering and actually got in the way of using
  the language.  Similarly, extremely tasteless stunts in Common Lisp would
  cause other programmers to receive a constant stream of SIGWTFs while reading
  your code and that would get in the way of business.  (Much like someone who
  decides to write his own translation of the Bible, say, and quotes verses in
  an unusual form that causes those who thought they knew them to get upset
  instead of nodding knowingly to reams of archaic words ending in "eth".)

  Having read the meandering thread on Lisp's unique features, I am loth to
  conclude that some people consider syntax and convenience entirely irrelevant
  and therefore simply do not get the point: What we are more likely to do is
  closely related to the effort required to do it -- some things are simply
  not done because the complexity or effort exceeds a threshold (which should
  not be interpreted as laziness but as economy).  For instance, if you had to
  go through a checklist of steps to be executed with precise timing to yield
  the desired results every time you needed something essential to your life,
  how many times would you repeat it before you went and invented something to
  help you achieve the results with less effort -- like, say, a microwave oven
  and TV dinners?  We know that some people actually repeat complex chains of
  steps tens, if not hundreds, of thousands of times and that this has been
  going on for hundreds, if not thousands, of generations with no change, so
  there is some element of the elusive "human nature" that clearly lets some
  people feel comfortable with repeating repetitive tasks endlessly and regard
  change as anathema to comfort.  I would argue that Common Lisp is for people
  who are _not_ like that.  Java man evidently enjoyed hard labor with the same
  primitive tools for periods of time indistinguishable from eons, but Homo
  sapiens invented Leatherman tools and Common Lisp.

  There is, however, little point in a "programmable programming language" if
  you never program it (but even less if you feel you have to make some local
  changes just to be a member in good standing of the elite users thereof).
  The value of programmability becomes visible when your needs are in flux,
  such as when the demands are actually unknown at the outset.  What mortals
  call "applications" are usually just one step away from the programming
  language -- the entire application is written in the same language, albeit
  with non-linguistic "abstractions".  To a Common Lisp programmer inspired to
  build the language bottom-up while he solves the problem top-down, the
  application his users see is more like a meta-application -- an application
  of the application of the programmable programming language.  Such things are
  sometimes called "domain-specific" or "fourth-generation" languages by people
  who fail to understand that you do not have to build an entire development
  environment just because you want a new way to write "struct".

  Put another way, if you have only one statue in mind and you know exactly
  what it should look like, feel free to carve it out of granite right away
  (and to make that more efficient), but if you have to experiment until you
  get it right, you would probably find Play-Doh or clay or even a plate of
  mashed potatoes more convenient than throwing away 95%-finished granite
  statues every decade.  As you find experimenting more forgiving of errors,
  you would also automate the granite carving process, just as we have done
  with compilers and development environments over the years.

  There is nonetheless something to be said for stability.  (Much like someone
  who experiments with different keyboard layouts is expected to become more
  efficient rather than spend all his time changing it to become more efficient
  in some vaguely defined "future", such as after all the boring tasks have
  been completed by others.)  The key to successful use of Common Lisp's unique
  features is not to be led astray by the plethora of options, but to shape the
  language according to actual needs.  There is probably no substitute for long
  experience and painfully acquired wisdom in this area, but one has to be
  aware of the inherent dangers in using a new and improved syntax -- if you
  change your mind, leaving behind relics of the past is unacceptable, and you
  therefore need a mechanism to update uses of such features -- just like those
  who think that SGML or XML are good ideas for long-term document storage will
  find that changing your mind becomes exponentially more expensive as the
  inherent deficiencies of those languages make it nigh intractable to produce
  chained transformations from documents conforming to one DTD to documents
  conforming to each of its later revisions.  Compilers may well deal with
  source files that use different versions of the customized syntax, but humans
  are likely to want to rely
  on their memory, lest they never advance beyond the hunt-and-peck mode of
  typing on keyboards, either.  This, incidentally, is another useful feature
  of Common Lisp: internally, the source code ends up as manipulable data in
  the language itself.  A program that reads and updates source files as you
  get one bright syntactic idea after another is eminently doable, lending
  itself to achieving stability through malleability.
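
  A minimal sketch of such an updating program, under loud assumptions: the
  abandoned syntax is taken to be a hypothetical (transform in -> f -> g) form,
  the replacement is the plain nested call, and the names ARROW-P, UNCHAIN,
  UPDATE-FORM and UPDATE-SOURCE are invented here.  It walks quoted data as if
  it were code, expects proper lists, and loses comments and layout, since it
  goes through READ and PPRINT.

```lisp
(defun arrow-p (x)
  "True if X is a symbol named \"->\", whatever its package."
  (and (symbolp x) (string= (symbol-name x) "->")))

(defun unchain (items)
  "Turn the list (in -> f -> g) into the nested call (g (f in))."
  (let ((form (first items)))
    (loop for (arrow step) on (rest items) by #'cddr
          do (unless (and (arrow-p arrow) step)
               (error "Malformed transformation chain: ~S" items))
             (setf form (list step form)))
    form))

(defun update-form (form)
  "Return FORM with every (TRANSFORM ...) subform rewritten as nested calls."
  (cond ((atom form) form)
        ((and (symbolp (first form))
              (string= (symbol-name (first form)) "TRANSFORM"))
         ;; Update the arguments first, then unfold the chain itself.
         (unchain (mapcar #'update-form (rest form))))
        (t (mapcar #'update-form form))))

(defun update-source (in out)
  "Read forms from the stream IN and print their updated versions to OUT."
  (let ((eof (list nil)))               ; unique end-of-file marker
    (loop for form = (read in nil eof)
          until (eq form eof)
          do (pprint (update-form form) out)
             (terpri out))))
```

  Because the source is ordinary list structure once read, the rewriting step
  is a dozen lines; the hard part in practice is preserving what READ discards.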

  But to get back to your question -- other than the possibly fancy option, I
  might write the steps out thus:

(let* ((in (a2b in))
       (in (b2c in))
       (in (c2d in))
       (in (d2e in))
       (in (e2f in)))
  (whatever in))

  Note that _apparently_ reusing the same variable in let* may be misleading,
  as each clause does not reuse it, but creates a new binding.  This would be a
  good idea if the abstract "type" of the object handed from transformation to
  transformation does not change (which is not the same as the representational
  type in the language, naturally).  I tend to use this form when there are
  multiple arguments to each call (and the chained input is not spottable from
  afar) and it may be hard to understand precisely what the returned value is.
  Adding a name to it may help in reading and maintaining the code.  In that
  case, you would not reuse the variable name any more than you would call your
  functions "x2y".  I prefer to use a variable named after the type when there
  are separate lookup-functions, for instance.  A fairly unique feature of
  Common Lisp lets me get away with a variable named the same as a class --
  instead of having to use articles like "a" or "the" in front of them or some
  other shenanigans -- reusing a name comes very naturally in human languages and
  will, when properly used, work to reduce the cognitive load when reading.
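
  To illustrate the naming style described above, here is a small sketch in
  which every intermediate value gets its own name and one variable shares its
  name with a class.  The CUSTOMER class, the *CUSTOMERS* table and the
  functions FIND-CUSTOMER and SHOUT are all hypothetical.

```lisp
(defclass customer ()
  ((name :initarg :name :reader customer-name)))

(defvar *customers* (make-hash-table)
  "Hypothetical lookup table from id to CUSTOMER instance.")

(defun find-customer (id)
  (gethash id *customers*))

(defun shout (id)
  ;; Variables and classes live in separate namespaces, so CUSTOMER can
  ;; name both the class and the object just looked up.
  (let* ((customer (find-customer id))
         (name (customer-name customer))
         (greeting (format nil "Hello, ~A!" name)))
    (string-upcase greeting)))

;; (setf (gethash 42 *customers*) (make-instance 'customer :name "Jeff"))
;; (shout 42) then returns "HELLO, JEFF!"
```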

  Tangentially speaking of "foo2bar"-functions, I have come to hate that way of
  naming functions.  It is reminiscent of the stupid redundancy in Java where you
  have to write SomeComplexTypeName foo = new SomeComplexTypeName (bar); -- and
  thank St. GNUcius for dynamic abbreviation in Emacs -- you have to keep the
  function call in sync with types of _two_ variables and probably have to
  write the type names out in full several times.  I prefer naming functions
  based on _one_ aspect and then to use generic functions or designators to
  ensure that it works reasonably for reasonable input types.
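
  One possible reading of that preference, as a sketch: instead of separate
  STRING2OCTETS and SYMBOL2OCTETS functions, a single generic function is named
  after the result alone and dispatches on whatever it is handed.  The name
  OCTETS and its methods are invented for this example, and the string method
  assumes every character code fits in eight bits.

```lisp
(defgeneric octets (thing)
  (:documentation "Return THING as a vector of (unsigned-byte 8)."))

(defmethod octets ((thing vector))
  ;; Assumes the elements already are integers in the range 0-255.
  (coerce thing '(vector (unsigned-byte 8))))

(defmethod octets ((thing string))
  ;; STRING is more specific than VECTOR, so this method wins for strings.
  ;; Assumption: each character code fits in one octet.
  (map '(vector (unsigned-byte 8)) #'char-code thing))

(defmethod octets ((thing symbol))
  ;; Treat a symbol as a designator for its name.
  (octets (symbol-name thing)))
```

  The caller writes (octets x) regardless of what X is, so only one name has to
  stay in sync with the code around it.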

  To sum up, I think the notation/idiom you end up choosing should be dictated
  by your expected need to type, verify, and read the code written using it.
  There is nothing wrong, in my view, with a thorny mess warning readers not to
  trespass if modifying it would actually endanger the code.  (This is not to
  be mistaken for the "it was hard to write, it should be hard to read" school
  of writing, though.)  I prefer a sort of Huffman code for syntactic features
  -- the most frequently used gets the shortest and most compact syntax.  This
  is naturally an empirical issue, and random beliefs in frequency of use are
  usually off by several orders of magnitude because our cognitive apparatus
  appears to be wired to associate importance with cognitive strain and emotions
  like pain, even though they are usually inversely related.

-- 
Erik Naggum, Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.