Subject: Re: data structure for markup text
From: Erik Naggum <>
Date: 1999/06/24
Newsgroups: comp.lang.scheme,comp.text.sgml,comp.text.xml,comp.lang.lisp
Message-ID: <>

* William F. Hammond
| Many of the "problems" with SGML arise from the fact that so few good
| examples of SGML processors ... are easily accessible.

  the problems with SGML are conceptual.  practical problems have practical
  solutions and are uninteresting from a language design point of view.
  some conceptual problems have practical solutions, and while interesting
  for an implementor, are also uninteresting insofar as they don't have
  unreasonably high costs.  the rest of the conceptual problems must have
  conceptual solutions, too -- and _those_ are the interesting ones from a
  language design point of view.

  my rule of thumb: if "more money" is the answer, you have an
  uninteresting practical problem -- conversely, the measure of success of
  a conceptually good solution is that solving the problem consumes fewer
  resources from then on.

| It captures *structural content*.

  it's unclear what you're trying to tell me that you think I don't know.

| SGML is only a framework.

  indeed it was intended as such, but it is its capacity as a framework
  that I have been lamenting.

| There is almost always more than one way to proceed at the design stage.
| The absence of a canonical way to proceed can create the appearance of
| complexity.  Such a human perception is only psychological.

  while I dismiss practical problems out of hand -- smart people will solve
  them sooner or later -- respecting human nature is what good design is
  all about, and the more you respect it, the better the design is.

| If one only wants a single presentation, forget SGML.

  this is the most dangerous statement you can make if you want to destroy
  SGML completely.  it's like saying a programming language is only good
  for the really complex problems -- it will lose the competition for the
  people who believe their problems are simple, or who need simple steps
  from problem to solution.  (hint: such people abound.)

| The lisp-like structure of the earlier posting is very interesting and
| worthwhile.  In fact, I prefer that markup style to SGML tagging, which
| my eyes do not like to look at.

  I'm glad to hear it was so intuitively appealing.

| (Hence, my GELLMU project that involves still another markup style that
| is LaTeX-like.)

  one of the problems that come with making text the primary syntactic
  element is that you have to invent so much black magic to keep markup
  distinct from text.  I prefer a much simpler way to deal with this, that
  used in programming languages: delimit the data, not the code.  in
  particular, Common Lisp's very simple syntax: a string is delimited by
  double quotes; a backslash precedes a literal character inside a string.
  no black magic like C, and absolute predictability both reading and
  writing the strings.  (note that one of the uninteresting practical
  problems of SGML is that SGML's syntax differs according to the SGML
  declaration, which makes some character sequences magic and others not --
  the only way you can _really_ be safe is using character entities for
  every character.)
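
  a minimal sketch of the string rule described above, written in Python
  rather than Common Lisp so it can be run as is.  the function names
  write_string and read_string are hypothetical, and this is not the Lisp
  reader itself, only the two rules stated: double quotes delimit the
  data, and a backslash makes the next character literal.

```python
def write_string(text):
    """Render text as a double-quoted literal, escaping only " and \\."""
    out = ['"']
    for ch in text:
        if ch in ('"', '\\'):
            out.append('\\')      # a backslash precedes a literal character
        out.append(ch)
    out.append('"')
    return ''.join(out)


def read_string(source, start=0):
    """Read one literal starting at source[start].

    Returns (text, index just past the closing quote).
    """
    if source[start] != '"':
        raise ValueError("string must start with a double quote")
    chars = []
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch == '\\':
            i += 1                # whatever follows is taken literally
            if i >= len(source):
                break
            chars.append(source[i])
        elif ch == '"':
            return ''.join(chars), i + 1
        else:
            chars.append(ch)
        i += 1
    raise ValueError("unterminated string")
```

  under these two rules, characters such as < and & need no treatment at
  all inside the data, and reading back what was written returns the
  original text unchanged.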

* Erik Naggum
| SGML is just as bad as any other static structure in that latter regard.

* William F. Hammond
| But also just as good.

  hello?  the whole point of my article was that static structures are
  insufficient for any publishing problem worth solving.

| As the years go by, the test of a markup language created today will be
| its amenability to the automatic processing of legacy documents into the
| formats of the future.

  I actually agree.  SGML is uniquely slated to flunk this test.  if you
  don't agree, I expect to see your solution to the problem of updating a
  document automatically when its DTD changes.  if this is "impractical",
  take a look at SQL and the tools created for it: the unique strength of
  that language is that you can dynamically improve your database without
  having to dump and reload it, which is what people had to do prior to SQL
  and its quiet revolution.  (yes, that phrase was first used about SQL.)
  the more complex structures become, the more people need to be able to
  change them as they learn more about them.  SGML is the worst possible
  language in which to do just that.
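
  to illustrate the SQL point, a small sketch using Python's sqlite3
  module with an in-memory database.  the table and column names are made
  up for the example, and SQLite stands in for any SQL system: the schema
  is changed while the existing rows stay where they are.

```python
import sqlite3

# Hypothetical table; the names are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "INSERT INTO document (title) VALUES ('data structure for markup text')"
)

# The structure changes in place: no dump, no reload.
conn.execute("ALTER TABLE document ADD COLUMN language TEXT DEFAULT 'en'")

# The row inserted before the change is readable under the new schema.
rows = conn.execute("SELECT id, title, language FROM document").fetchall()
conn.close()
```

  the row written under the old schema comes back with the new column
  filled in from its default, which is the kind of automatic update a
  changed DTD does not give an existing SGML document.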

  because SGML/*ML does not support structure rewriting, it cannot survive
  any serious amount of change.  all "macro languages" have such rewriting
  -- it's what "macro" is all about -- but SGML decided to discard this
  aspect of being a programming language.  without it, people will have to
  write tons and tons of code to deal with special cases, build front-ends
  that deal with stuff SGML doesn't, etc, etc.  it is no coincidence that
  there are lots of "scripting languages" that produce HTML out there, just
  as it is no coincidence that tools that process SGML come with their very
  own languages, way more arcane than anything programming language people
  could dream up.
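
  what structure rewriting could look like, as a rough sketch in Python.
  the representation (a nested list whose first item is the element name)
  and the names rewrite and rules are assumptions made for the example,
  not part of SGML or of any existing tool.

```python
def rewrite(node, rules):
    """Rewrite a tree bottom-up; text leaves (str) pass through unchanged."""
    if isinstance(node, str):
        return node
    tag, *children = node
    children = [rewrite(child, rules) for child in children]
    rule = rules.get(tag)
    return rule(tag, children) if rule else [tag, *children]


# Hypothetical structure change: "para" is renamed to "p", and every
# "note" must now sit inside an "aside".
rules = {
    "para": lambda tag, kids: ["p", *kids],
    "note": lambda tag, kids: ["aside", ["note", *kids]],
}

old = ["doc", ["para", "text with a ", ["note", "remark"]]]
new = rewrite(old, rules)
# new == ["doc", ["p", "text with a ", ["aside", ["note", "remark"]]]]
```

  the rules are ordinary data that can be stored and shipped with the
  documents they apply to, so a change of structure can carry its own
  migration instead of requiring a separate front-end.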

  wouldn't you just _love_ to have a Turing complete markup language, with
  a nice syntax that both humans and machines could read with ease, which
  allowed you to do structure rewriting _in_ the language?  the only way
  you can do that is to fully realize that data and code are the _same_.
  in so doing, you realize that creating short-term convenience barriers
  between parts of an inextricably linked whole is counterproductive in
  non-immediate terms: the net effect can only be to force people on both
  sides of each barrier to reinvent the rest of the whole on their own,
  which is a phenomenal waste of time, regardless of what they manage to do
  that is productive and useful, _unless_ your only concern is the short
  term, in which case such waste has no bearing on the evaluation.

@1999-07-22T00:37:33Z -- pi billion seconds since the turn of the century