Re: ObjectWeb context

David McCusker
Thu, 20 Nov 1997 19:00:15 -0700

Mark Baker wrote:
> [ responding to Ron Resnick's request for "DOM for dummies" ]
> Okey doke. I apologize for not giving a little context on my ObjectWeb
> opinions. [ snip ]

I've learned slightly more about DOM lately in the context of Netscape
internal design rhetoric describing how it applies to creating interfaces
for the underlying data models that might be involved.

> The XML/Beans "proposal" I mentioned is entirely encapsulated within some
> emails I've sent to Javasoft and Javasoft mailing lists. I won't bother
> with the full text of the notes, but the gist is simply this; "Use XML
> instead of the current Java serialization format".

That's a good idea. Likely Javasoft will only resist due to inertia and
some understandable degree of NIH (not invented here) which is absolutely
universal, no matter how much any organization claims to be free of it.

> I didn't propose any specific way to use XML, though I did give an
> example (the Netscape JavaScript Beans stuff I've mentioned here).

Proposing a specific usage might provoke useless railing against the
example even though the basic idea is quite good.

> Figuring out *how* to use XML is certainly the hard part, and though
> I've got some ideas (which I'll get into), I doubt I'm the best person
> for the job. Besides, this stuff isn't in my job description. 8-(

I think it's hard because it's effectively a user interface design issue
for the folks who will actively use any specific XML serialization. And
user interface design involves vast amounts of struggle and argument. :-)

> So, what's all this DOM, XML, RDF stuff about?
> As Ron says, XML is some sort of IDL superset. That's true. But why?
> Because an IDL file is a structured, self-describing, serialized version
> of an interface - a document. We all know the IDL structure; modules
> contain interfaces, and interfaces contain methods (just to oversimplify
> for a moment). We can parse an IDL file knowing that structure.

Yes, XML is good for more than replacing IDL. XML provides
a uniform syntax for describing metainformation that can easily replace
the comparatively hodgepodge approach of more specific kinds of language
like IDL, which use a fixed set of keywords and more complex syntax.

(Recently I noticed the syntactic parallel between XML begin and end tags
and Lisp begin and end parentheses. It is more redundant and therefore
less complex than a more sparse syntax that just barely manages to remain
unambiguous. It enables simple parsers; it feels like tagged Lisp. More
importantly, redundancy is necessary to enable self-healing systems.)
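
To make the Lisp parallel concrete, here is a minimal sketch, assuming
Python and its standard xml.etree.ElementTree module (neither comes from
this thread). The helper name to_sexpr is made up for illustration, and
it ignores attributes and trailing text to stay short.

```python
import xml.etree.ElementTree as ET


def to_sexpr(element):
    """Render an element as a Lisp-style list: (tag text child ...)."""
    parts = [element.tag]
    text = (element.text or "").strip()
    if text:
        parts.append(text)
    # Each child element becomes a nested parenthesized list.
    for child in element:
        parts.append(to_sexpr(child))
    return "(" + " ".join(parts) + ")"


if __name__ == "__main__":
    doc = ET.fromstring(
        "<module>MarkStuff<interface>IMark</interface></module>")
    print(to_sexpr(doc))
```

The begin tag plays the role of the open parenthesis plus a head symbol,
and the end tag repeats that symbol where Lisp writes a bare close
parenthesis.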

> Of course, IDL structure is implicit; it's not part of the document
> itself. It's only by being told that it's an IDL file that we can then
> look at it and say, "Hey, that's valid IDL" or "Nope, that's invalid
> IDL" based on its documented structure in the CORBA/ISO specs. IDL itself
> isn't SGML based (no tags, elements, etc.), but it is self-describing
> (mostly, see below), so one could easily construct a DTD to capture all
> the information already there.

Yes, IDL is not intended to be self descriptive, so systems like XML
that are intended to make reflection and metainformation more first class
have the advantage of clarity at the typical cost of relative conciseness.

> As an XML instance, IDL might look something like this;
> <module>MarkStuff
> <interface>IMark
> <exception>badInput</exception>
> <method throws="badInput">giveMeOutput
> <argument mod="in" type="string">inputVar</argument></method>
> </interface>
> </module>

That seems like a fine example. Compared to IDL, it clearly suggests what
code might do in response to parsing this information, because the purpose
is more explicit. This should be much easier for less technical folks to
use and understand, because it requires less implicit knowledge. The need
for deep implicit knowledge puts off folks first learning traditional code.
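
As a rough illustration of what code might do with Mark's example, here
is a minimal sketch assuming Python and its standard
xml.etree.ElementTree module. The helper names name_of and describe are
hypothetical, and the sketch relies on each name sitting in the text
just before an element's children, as in the example.

```python
import xml.etree.ElementTree as ET

# Mark's example, reproduced as a string.
IDL_XML = """
<module>MarkStuff
  <interface>IMark
    <exception>badInput</exception>
    <method throws="badInput">giveMeOutput
      <argument mod="in" type="string">inputVar</argument></method>
  </interface>
</module>
"""


def name_of(element):
    """Return the element's leading text, which holds its name here."""
    return (element.text or "").strip()


def describe(xml_text):
    """Return one readable line per declaration found in the document."""
    module = ET.fromstring(xml_text.strip())
    lines = ["module " + name_of(module)]
    for interface in module.findall("interface"):
        lines.append("  interface " + name_of(interface))
        for exc in interface.findall("exception"):
            lines.append("    exception " + name_of(exc))
        for method in interface.findall("method"):
            args = ", ".join(
                "{} {} {}".format(a.get("mod"), a.get("type"), name_of(a))
                for a in method.findall("argument"))
            lines.append("    method {}({}) throws {}".format(
                name_of(method), args, method.get("throws")))
    return lines


if __name__ == "__main__":
    print("\n".join(describe(IDL_XML)))
```

Nothing in the walk depends on knowing IDL grammar; the tags say what
each piece is, which is the explicitness I mean.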

> Now, I'm still figuring out the nuances of DTD design, so no flames
> please! 8-) This DTD isn't very good since some of the content of the
> interface is hidden in the tags, and the rule of thumb is supposed to
> be that anything that is content should be what is being marked up, not
> hidden in the tags. Just take it as a (bad) example. 8-(

I don't like that rule of thumb because it seems an implicit approach to
dealing with a hidden agenda in a system otherwise more inclined to be
explicit by convention. I think the hidden agenda is how to reasonably
display the content when one does not understand the tags and would like
to hide them. But it seems feasible to have DTDs define this behavior.

The reason I don't like the rule is because it introduces a gratuitous
constraint that makes things harder in the name of a specific goal that
might not be relevant in many contexts. So it is a cost that taxes all
applications in which the goal of raw presentation does not matter.

> But hopefully it makes my point.
> DOM is the "Document Object Model". It specifies an *interface* to a
> serialized structured stream/document/whatever.

Yes, it's an interface to the model, not the raw model itself. (One
should get used to dealing with interfaces anyway, and not expect to have
access to some approximation of a "bare metal" implementation level.)

> So, given *any* structured byte stream (Fe, Bento, Java serialization),
> I can construct a DOM compliant implementation that will allow me to
> parse that stream. Once again, DOM does not require you to physically
> store anything in XML.

Yes, indeed. That's why I expect folks to sometimes use an Fe (IronDoc)
encoding under the covers of DOM, even when sometimes the content source
was originally in XML, because it might be converted to Fe for more
efficient use under complex operations on large data sets. Of course it
can be converted back to XML again without loss when necessary.
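
A minimal sketch of that idea, assuming Python: one small node interface
with two interchangeable backings. The names Node, XmlNode, TupleNode,
to_record and to_xml are invented for illustration and are not the DOM
API. The nested-tuple store is only a hypothetical stand-in for a
compact encoding, not the Fe format. The round trip is lossless only
for tags, attributes and stripped leading text, which is all this
sketch models.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class Node(ABC):
    """The only view applications get: tag, attributes, text, children."""
    @abstractmethod
    def tag(self): ...
    @abstractmethod
    def attributes(self): ...
    @abstractmethod
    def text(self): ...
    @abstractmethod
    def children(self): ...


class XmlNode(Node):
    """Backed by a parsed XML element."""
    def __init__(self, element):
        self._e = element
    def tag(self): return self._e.tag
    def attributes(self): return dict(self._e.attrib)
    def text(self): return (self._e.text or "").strip()
    def children(self): return [XmlNode(c) for c in self._e]


class TupleNode(Node):
    """Backed by nested tuples: (tag, attributes, text, child_records)."""
    def __init__(self, record):
        self._r = record
    def tag(self): return self._r[0]
    def attributes(self): return dict(self._r[1])
    def text(self): return self._r[2]
    def children(self): return [TupleNode(c) for c in self._r[3]]


def to_record(node):
    """Copy any Node into the tuple encoding, using only the interface."""
    return (node.tag(), tuple(sorted(node.attributes().items())),
            node.text(), tuple(to_record(c) for c in node.children()))


def to_xml(node):
    """Copy any Node into an XML element, using only the interface."""
    element = ET.Element(node.tag(), node.attributes())
    element.text = node.text()
    for child in node.children():
        element.append(to_xml(child))
    return element


if __name__ == "__main__":
    source = XmlNode(ET.fromstring(
        '<method throws="badInput">giveMeOutput'
        '<argument mod="in" type="string">inputVar</argument></method>'))
    record = to_record(source)                    # XML -> compact store
    back = XmlNode(to_xml(TupleNode(record)))     # compact store -> XML
    print(to_record(back) == record)
```

Both converters touch only the four interface methods, so neither one
needs to know which storage model sits underneath.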

> [Frank's recent email notes that interfaces are models. I reckon that's
> right, since "models" are just abstractions, and abstractions can happen
> at any level. Where's David McCusker when you need him? 8-) ]

Yes, that's right. This is the kind of abstract problem with the meaning
of terms that my theory of context was intended to clarify. Being a
model is a role in opposition to other roles, like a view, in a specific
context. A model is generally a viewee with respect to one or more
viewers, which the model deals with only anonymously by convention.

The view and model relationship in computing systems tends to generally
correspond to the notion of meta levels in other contexts. A model is a
system of information, and a view knows about the model and can present
or operate on the model. A view1 can be a model with respect to another
view2 which presents metainformation about view1. Just a meta level.

This partly explains the reason why I don't see a problem with using a
"document" as an object that acts like a view on content somewhere else.
But this is hard to reason about for folks who don't like a many-context
world, who also like absolute rather than relative terms. Such a
preference limits reasoning about problems to a painfully narrow scope.

> Now, XML is *one* type of structured stream format. It just so
> happens, because of the size of the Web, that it's going to be *the*
> standard one.

This seems quite likely to me. There needs to be a foundation of low
complexity that is difficult to corrupt upon which to build more complex
systems that feed upon the supporting infrastructure. This suggests an
iceberg metaphor I don't want to make since I recall there is some
commercial product using that term. :-)

> How does this relate to DOM? Well, an XML doc is like Fe, Bento,
> or Java serialization; it's just another storage model. DOM can be
> used to parse it too.

Yes, and the fact that DOM has not committed to a storage model makes
it stronger because it is not dependent on the weaknesses of any given
model being used at a particular moment.

> This even holds for IDL in a text file, since IDL does provide markup
> ("module", "attribute", etc. are markup tags, just without the SGML-ish
> '<' and '>' - the only exception is that there's no "method"
> keyword/tag). One could build a DOM compliant implementation of an
> IDL parser.

That seems a very astute comparison of equivalency where it is present.
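
To show how little is needed to treat IDL keywords as tags, here is a
toy sketch, assuming Python. It handles only a tiny made-up subset
(module, interface and exception blocks) and is nowhere near a real IDL
parser. The IDL text is my guess at an equivalent of Mark's XML example.
Since operations have no keyword, the sketch labels any keywordless
statement with a synthesized "method" tag.

```python
import re

KEYWORDS = {"module", "interface", "exception"}


def tokenize(text):
    """Split into identifiers and the few punctuation marks we care about."""
    return re.findall(r"[A-Za-z_]\w*|[{};(),]", text)


def parse(tokens, pos=0):
    """Parse declarations until '}' or end of input.

    Returns (nodes, next_pos), where each node is (tag, name, children).
    """
    nodes = []
    while pos < len(tokens) and tokens[pos] != "}":
        if tokens[pos] in KEYWORDS:
            # The keyword acts as the tag; the next token is the name.
            tag, name = tokens[pos], tokens[pos + 1]
            pos += 2
            children = []
            if pos < len(tokens) and tokens[pos] == "{":
                children, pos = parse(tokens, pos + 1)
                pos += 1  # skip the closing '}'
            nodes.append((tag, name, children))
        else:
            # No keyword: gather tokens up to ';' as one untagged statement.
            start = pos
            while pos < len(tokens) and tokens[pos] != ";":
                pos += 1
            nodes.append(("method", " ".join(tokens[start:pos]), []))
        if pos < len(tokens) and tokens[pos] == ";":
            pos += 1
    return nodes, pos


IDL = """
module MarkStuff {
  interface IMark {
    exception badInput {};
    string giveMeOutput(in string inputVar) raises (badInput);
  };
};
"""

if __name__ == "__main__":
    tree, _ = parse(tokenize(IDL))
    print(tree)
```

The resulting (tag, name, children) tuples are the same shape of tree
one would get from the XML form, so an interface in the spirit of DOM
could sit over either.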

> RDF is no big deal. It's just an XML DTD that can be used to do trader-like
> functionality on the web. Not that this isn't important and wonderful,
> but for the purposes of this post, its purpose doesn't matter.

Yes, when RDF occurs in XML files, it is effectively a particular
standard DTD that describes specific concepts that RDF would prefer
to reason about. The purpose of RDF is to standardize some forms of
reasoning about content and relationships between content. This purpose
matters less when one is not concerned with specific standards.

I had a very brief and informal talk with R.V.Guha one day about IronDoc,
where I told him what role I saw IronDoc filling in contexts where RDF
might be used. Basically I said it was a storage and encoding mechanism
that does not care about schemas or application interface semantics. So
RDF can still be the main interface an application sees to get content.

This seemed to satisfy Guha because he was less concerned about how data
is stored under the covers, as long as apps can see a standard API.

> [I'm just going to rephrase my previous RDF note in the context of this
> tutorial-like note]
> What does matter about RDF for the purposes of this post is that it's
> another example of the use of markup. Define a DTD that allows you to
> clearly express the structure of the semantics you desire, and presto,
> you've got a new service component you can hang off the web.

Yes, a self-describing world allows pipes to connect to each other
dynamically, and this is about as effective as having a completely
static world based on standard pipes in terms of promoting connections,
but has the advantage of allowing evolution and adaptation.

> Actually, if you read the RDF draft, you'll note the comment that a DTD
> isn't necessary to use RDF. This isn't a big deal, as you'll see below.
> RDF requires that all documents wishing to participate as official "web
> resources", document their semantics with its format. These semantics
> are themselves documents which persist.

Somehow this makes me suspect a separate RDF document wants to describe
the content of one of my IronDoc documents so that RDF could read the
format instead of IronDoc. This is not feasible since it could be
very impractical to describe large btrees this way, to be accessed by
code that does not understand the btrees. I hope RDF is not so strict.

> [ snipped extended Java Trader/ Bean-ified RDF example ]

I had little useful new reasoning to offer about the bean material.

David McCusker, IronDoc weekends, Netscape xp client mail/news weekdays
Values have meaning only against the context of a set of relationships.