Subject: Re: XML and lisp
From: Erik Naggum <erik@naggum.net>
Date: Sat, 25 Aug 2001 12:08:48 GMT
Newsgroups: comp.lang.lisp
Message-ID: <3207730126705119@naggum.net>

* Barry Fishman <barry_fishman@acm.org>
> I looked again, and your incantations did not work.  Attributes still
> seem to be in the language.

  Sigh.

> I agree that when XML is used as a data definition they are "completely
> arbitrary" and make a syntactic separation which is destructive.  I,
> personally, just avoid using them when I have control of the XML I use to
> define data.  But I can't ignore them or re-format them, when I need to
> generate XML which someone or some standard defined to use them.  That
> battle belongs in the XML standards committees, and I am afraid it's a bit
> late to change their minds.

  How you work with XML is not defined by those standards bodies.  What
  your _internal_ representation of XML looks like is not defined by those
  standards bodies.  One of the fundamental properties of Lisp is that we
  have a very nice and well-defined mapping between external and internal
  representation for most of our object types.  There is no well-defined
  mapping between XML syntax and internal representation.  Lots of ways are
  equally valid.  Insisting on only some of them is counter-productive.
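
  As an illustration of one such mapping, here is a minimal sketch in
  Python, with nested lists standing in for Lisp lists.  The function name
  to_sexp and the choice to place attributes first among the children are
  assumptions made for this example, not a prescribed representation.

```python
import xml.etree.ElementTree as ET

def to_sexp(element):
    """Convert an ElementTree element to a nested list: [name, child, ...].

    Attributes are folded in as leading [name, value] children, so the
    internal form carries no attribute/sub-element distinction.
    """
    node = [element.tag]
    # Attributes become ordinary children.
    for name, value in element.attrib.items():
        node.append([name, value])
    # Text that appears before the first child element.
    if element.text and element.text.strip():
        node.append(element.text)
    for child in element:
        node.append(to_sexp(child))
        # Text that follows this child, still inside the parent.
        if child.tail and child.tail.strip():
            node.append(child.tail)
    return node

as_attribute = to_sexp(ET.fromstring(
    '<header1 italic="Wow"> this is difficult</header1>'))
as_element = to_sexp(ET.fromstring(
    '<header1><italic>Wow</italic> this is difficult</header1>'))
```

  Under these assumptions both inputs yield the same internal list, which
  is one of many equally valid choices.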

> If I just treat attributes as subordinate elements, I lose the ability to
> simply translate from lisp into XML.

  You have made up your mind about this, so I shall not try to convince you
  of the errors of your ways.  People who are dead set on their ways should
  be left alone, mostly because they get cranky when faced with alternatives.

> In other news articles you seem to suggest that you use information
> outside the lisp representation to make that determination.

  No, you do not understand, and that is because you do not even try.

> This means that my tools would require a priori knowledge, which I feel a
> simple lisp->XML (non-interpreting) translator should not need.

  I see that you have to be very hard and fast on how you represent your
  information.  This is your choice.  I wish you would recognize it as a
  choice, and not try to impose a very specific view on the reality that is
  far more flexible and adaptable than you appear to believe it to be.

> I don't think lisp->XML translators should have constraints that XML
> parsers don't have.

  Well, that is another choice you have made.  Other people, other choices.

> In code which interprets the lispified XML, I know what the grammar is,
> so can't I (at that time) bury any abstraction issues in the access
> methods?

  What does it matter to your access whether something is an attribute or a
  sub-element?  Why do you need to retain the distinction internally?

> I admit I don't fully understand the abstraction benefits with which you
> are concerned.

  I appreciate that you state this, because you certainly have not.

> I've been overwhelmed in tracking all the XML languages which are being
> defined.

  Yes, overwhelmed by bad design, most people's brains shut down and they
  refuse to deal with a massive simplification because it threatens to be
  as painful as dealing with the complexity they have barely survived.

> I was hoping that being able to map them into lisp syntax would help
> avoid being buried in XML's confusing syntax.

  That is my idea.  I am sorry for you that you have to define away the
  solution to your problem by insisting on a trivial one-to-one mapping of
  conceptual elements that effectively blocks your own conceptualization.

> When looking at them in a lisp syntax, things can become clearer (and seem
> less innovative).

  How very true.

> I don't agree that the distinction between attributes and entities is
> always arbitrary.

  Attributes and entities are very different concepts and the distinction
  between them is of fundamental importance.  I fail to see how you think I
  have made any claims about their relationship, however.  I am talking
  about _elements_.

> SGML does stand for Simple Graphical *Markup* Language,

  It stands for Standard Generalized Markup Language, actually.  The key
  to understanding the name is that "generalized markup" is something more
  than mere markup.  SGML has aspirations beyond simply marking up text.

> and in a markup language, I think it is important to distinguish the text
> of a document from its markup.

  I think I already said that.

> Multiple translators may be used, and they should not need to be kept up
> to date on what attributes are used in the other translators.

  Your value judgments are your choice.  I happen to disagree with them.
  If you try to deny me this, please realize that I do not care at all.

> In an expression like:
> 
> <header1><italic>Wow</italic>, this is difficult.</header1> 
> 
> or as lisp (which I think is more readable):
> (header1 (italic "Wow") " this is difficult")
> 
> it isn't clear whether "Wow" is text or the value of an attribute
> unless you have prior knowledge of whether `italic` is an attribute in
> the context of a header1 directive.

  Well, first off: You _have_ that prior knowledge.  Your application will
  actually need to know what to do with it whether it is an attribute or a
  sub-element.  If your application does not know what to do with it, I
  fail to see how whether it is an attribute or an element can matter to
  you.  If you _do_ know what to do with it, how does it matter to you
  whether it came from an attribute value or a sub-element?
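
  The point can be sketched in Python, again with nested lists standing in
  for Lisp lists.  The accessor name child_value is hypothetical; it shows
  that a lookup by name does not need to know whether the value was
  written as an attribute or as a sub-element in the XML.

```python
def child_value(node, name):
    """Return the content of the first child called `name`, or None.

    `node` is a nested list of the form [name, child, ...], where each
    child is either a string or another such list.
    """
    for item in node[1:]:
        if isinstance(item, list) and item and item[0] == name:
            rest = item[1:]
            # A single content item is returned bare, otherwise as a list.
            return rest[0] if len(rest) == 1 else rest
    return None

# The same internal form, whichever XML syntax it was read from.
doc = ["header1", ["italic", "Wow"], " this is difficult"]
emphasized = child_value(doc, "italic")
missing = child_value(doc, "bold")
```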

> So here the distinction is simple, clear, and useful.

  It is arbitrary.

> This is still important for things like xhtml -- and probably docbook,
> whose standard I have not yet assimilated.

  No, it is fundamentally unimportant.  Please try to accept this premise
  for the sake of discussion, and see if something you believe falls out
  and shows itself to you as more important than your simple protestations.

> In my previous message I suggested that:
>    <header1 italic="Wow"> this is difficult</header1>
> 
> become:
>    (header1 ((italic "Wow")) " this is difficult")
> 
> With minimal (but I admit real) damage to the syntax.

  Keeping the distinction between attributes and content is keeping you
  from realizing how simply and efficiently you can deal with XML data.
  But that is your choice.  I fully expect that loads of people who have
  fused their brains shut and have fully "integrated" the false dichotomy
  of attributes and contents will never be able to unfuse it and open up to
  a very simple realization that it has absolutely no bearing on anything
  _other_ than the specific syntax in SGML/XML whether something is an
  attribute or an element.
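
  A minimal sketch of where the distinction does matter, namely at the
  point of writing the external syntax.  It is in Python with nested lists
  standing in for Lisp lists; the function to_xml and its `attributes`
  table are assumptions made for this example.  The table is the prior
  knowledge of the target format: it says which children of which element
  are written as attributes.

```python
from xml.sax.saxutils import escape, quoteattr

def to_xml(node, attributes):
    """Serialize [name, child, ...] as XML text.

    `attributes` maps an element name to the set of child names that the
    target format writes as attributes; everything else is written as
    element content.
    """
    if isinstance(node, str):
        return escape(node)
    name, children = node[0], node[1:]
    as_attrs = attributes.get(name, set())
    attr_text, body = "", ""
    for child in children:
        # Only a [name, "string"] child can be written as an attribute.
        if (isinstance(child, list) and child[0] in as_attrs
                and len(child) == 2 and isinstance(child[1], str)):
            attr_text += " %s=%s" % (child[0], quoteattr(child[1]))
        else:
            body += to_xml(child, attributes)
    return "<%s%s>%s</%s>" % (name, attr_text, body, name)

doc = ["header1", ["italic", "Wow"], " this is difficult"]
with_attribute = to_xml(doc, {"header1": {"italic"}})
with_element = to_xml(doc, {})
```

  The internal list is the same in both calls; only the table handed to
  the writer decides which syntax comes out.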

  Those who grasp the concepts involved, will see that attributes are just
  another form of contents.  Those who do not grasp the concepts involved,
  will think that attributes are different from contents because they have
  been given syntactically different expression.  But it is always the
  syntax that follows the function.  Someone believed that meta-information
  should be fundamentally different from information.  Someone believed
  that the contents of elements should be text that wound up in the final
  document on the printed page and the values of attributes should not, but
  should only influence the processing of the information.  This worked
  only as long as SGML was used as a markup language for documents and had
  no aspirations towards being an abstract structuring syntax.  When it
  comes to using it as a more abstract syntax, there _is_ no inherent quality
  that determines whether some value ends up displayed or not.  That has to
  be supplied by the software that processes the information, which is
  precisely prior knowledge of the structure and its meaning.

///