Subject: Re: Core ideas behind SGML and XML
From: Erik Naggum <>
Date: 30 Sep 2002 15:40:43 +0000
Newsgroups: comp.lang.lisp
Message-ID: <>

* Rob Warnock
| I'm very interested in learning what these "core ideas" are, since I
| suspect I am missing something significant (and/or obvious!)

  To someone familiar with (Common) Lisp, these ideas are much easier to grasp,
  and may indeed appear obvious or trivial.  These concepts are central:

· declarative elements with contents (not commands).
· hierarchical element structure.
· element structure validation (to a content model).
· element structure normalization (default values).
· element identity and referencing.
· declared entities with uniform references.
· declared notations.
· (abstract) public (entity and notation) identifiers.
· a syntactic layer between entity and application.
· an access layer between file systems (networked, etc.) and entity.
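
  As a rough illustration of two of these items, validation to a content
  model and normalization through default values, here is a minimal Python
  sketch.  Nothing in it comes from SGML itself: the Element class, the
  CONTENT_MODEL and DEFAULTS tables and the memo vocabulary are all invented
  for the example, content models are approximated by regular expressions
  over child names, and defaults are assumed to be attribute defaults.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Element instances or str


# Hypothetical declarations standing in for a DTD.  Each content model is a
# regular expression over the space-terminated names of the children.
CONTENT_MODEL = {
    "memo": r"(to )+from body ",  # one or more <to>, then <from>, then <body>
    "to": r"(#text )?",
    "from": r"(#text )?",
    "body": r"(#text )?",
}
DEFAULTS = {"memo": {"status": "draft"}}  # assumed attribute defaults


def validate(el):
    """Return True if el and all its descendants match their content models."""
    seq = "".join(
        (c.name if isinstance(c, Element) else "#text") + " " for c in el.children
    )
    model = CONTENT_MODEL.get(el.name)
    if model is None or re.fullmatch(model, seq) is None:
        return False
    return all(validate(c) for c in el.children if isinstance(c, Element))


def normalize(el):
    """Return a copy of el with declared default values filled in."""
    attrs = {**DEFAULTS.get(el.name, {}), **el.attrs}  # explicit values win
    kids = [normalize(c) if isinstance(c, Element) else c for c in el.children]
    return Element(el.name, attrs, kids)
```

  The point of the split is that the application sees only documents that
  have passed validate and been through normalize, so it never has to check
  structure or supply defaults itself.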

  Less central, but still important with respect to the processing model:

· separate specification of position-in-hierarchy-dependent information.
· possible specification of parent, child and left and right sibling relations.
· dynamic inheritance of processing information.
· complete mapping between document and processing element hierarchies.
· marked sections with conditional inclusion.
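
  The first and third items can be sketched together: processing
  information is specified apart from the document, keyed on position in
  the hierarchy, and inherited from enclosing elements.  The Python below
  is only one way to read those items; the SPEC table, its property names
  and the properties function are assumptions made for the example.

```python
# Hypothetical processing specification, kept separate from the document.
# Each key is a context: a run of element names ending at the element the
# rule applies to.  Longer contexts are more specific.
SPEC = {
    ("chapter",): {"font": "serif", "size": 10},
    ("chapter", "title"): {"size": 18},
    ("note", "title"): {"size": 12},
}


def properties(path):
    """Compute the processing properties for the element at `path`.

    `path` is a tuple of element names from the document root down to the
    element in question.  Properties set on ancestors are inherited, and
    rules with longer contexts override shorter ones at the same element.
    """
    props = {}
    for depth in range(1, len(path) + 1):  # outermost ancestor first
        prefix = path[:depth]
        matching = [ctx for ctx in SPEC if prefix[-len(ctx):] == ctx]
        for ctx in sorted(matching, key=len):  # most specific applied last
            props.update(SPEC[ctx])
    return props
```

  With this table, a title directly inside a chapter and a title inside a
  note within that chapter get different sizes while both inherit the
  chapter's font, without the document saying anything about either.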

  Things that the SGML community think are important, but which only detract
  from the real issues and confuse people tremendously:

· attributes to elements, notations, entities.
· attribute types.
· character entities.
· parameter entities.
· marked sections with special syntax for contents.
· elements with special syntax for contents.
· tag minimization and omission.
· the actual syntax.

  Neither list should be deemed exhaustive.

| I assume you're *not* talking about trivial stuff like "<tag>foo</tag>"
| syntax [which to me is just verbose way of externally representing
| (serializing) trees (though not DAGs or more complex structures), and which
| S-exprs do a much better job of (especially if you allow "#n=" & "#n#" for
| the more complex graphs)]

  The rank incompetence of SGML and XML aficionados has led people to believe
  that the syntax cannot represent anything more complex than simple trees.
  This is just as wrong as claiming that Common Lisp cannot express circular
  lists and other forms of reused objects to preserve identity throughout the
  printed form of a structure.  Using ID and IDREF is exactly analogous to
  using #n= and #n#.  (And just as they have special syntax in Common Lisp,
  they should have special syntax in an attribute-free SGML.  It is the only
  thing that /needs/ attributes in the SGML syntax.)
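
  To make the analogy concrete, here is a small Python sketch that reads
  the markup equivalent of #1=(a b . #1#) and rebuilds the circular
  structure.  The cons/car/cdr vocabulary, the attribute names id and ref,
  and the Cons class and build function are invented for the example.  A
  real parser would know which attributes are ID and IDREF from the
  declarations; here the names are simply assumed.  Like the Lisp reader,
  this sketch resolves only references to labels it has already seen.

```python
import xml.etree.ElementTree as ET

# Markup equivalent of  #1=(a b . #1#) : the last cdr refers back to the
# first cell, so the result is a circular list, not a tree.
DOC = (
    '<cons id="n1"><car>a</car><cdr>'
    '<cons><car>b</car><cdr ref="n1"/></cons>'
    "</cdr></cons>"
)


class Cons:
    def __init__(self):
        self.car = None
        self.cdr = None


def build(node, table):
    """Turn a <cons> element into a Cons, resolving ref= through `table`."""
    cell = Cons()
    if "id" in node.attrib:
        table[node.attrib["id"]] = cell  # plays the role of #n=
    cell.car = node.find("car").text
    cdr = node.find("cdr")
    if "ref" in cdr.attrib:
        cell.cdr = table[cdr.attrib["ref"]]  # plays the role of #n#
    elif cdr.find("cons") is not None:
        cell.cdr = build(cdr.find("cons"), table)
    return cell


root = build(ET.fromstring(DOC), {})
# The reference preserves identity: following two cdrs leads back to root.
assert root.cdr.cdr is root
```

  The cell is entered in the table before its contents are processed, which
  is what lets a reference inside an element point back at that element.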

Erik Naggum, Oslo, Norway                 Today, the sum total of the money I
                                          would retain from the offers in the
more than 7500 Nigerian 419 scam letters received in the past 33 months would
have exceeded USD 100,000,000,000.  You can stop sending me more offers, now.