ObjectWeb context

Mark Baker (markb@iosphere.net)
Thu, 20 Nov 1997 11:03:37 -0500 (EST)


On Thu, 20 Nov 1997, Ron Resnick wrote:
> Mark- can we *please* have a tutorial (on-list) about DOM, XML, RDF
> etc.?
> I, for one, am not afraid to admit you've gone way past me on this, and
> I'm totally
> lost. I can get far enough to see how XML is a sort of superset to IDL,
> but I admit
> to bafflement on your discussions with Frank Manola. How about DOM for
> dummies?

Okey doke. I apologize for not giving a little context on my ObjectWeb
opinions.

I'm sending this to dist-obj *and* FoRK. I'm hoping some of the FoRK
web-heads can comment (I really wish Adam would resurrect).

For those of you not familiar with these mailing lists, check out;

http://www.infospheres.caltech.edu/mailing_lists/dist-obj/distobjgroup.html
http://xent.ics.uci.edu/FoRK-archive

The XML/Beans "proposal" I mentioned is entirely encapsulated within some
emails I've sent to Javasoft and Javasoft mailing lists. I won't bother
with the full text of the notes, but the gist is simply this; "Use XML
instead of the current Java serialization format". I didn't propose any
specific way to use XML, though I did give an example (the Netscape
JavaScript Beans stuff I've mentioned here). Figuring out *how* to use
XML is certainly the hard part, and though I've got some ideas (which
I'll get into), I doubt I'm the best person for the job. Besides, this
stuff isn't in my job description. 8-(

So, what's all this DOM, XML, RDF stuff about?

As Ron says, XML is some sort of IDL superset. That's true. But why?
Because an IDL file is a structured, self-describing, serialized version
of an interface - a document. We all know the IDL structure; modules
contain interfaces, and interfaces contain methods (just to oversimplify
for a moment). We can parse an IDL file knowing that structure.

Of course, IDL structure is implicit; it's not part of the document
itself. It's only by being told that it's an IDL file that we can then
look at it and say, "Hey, that's valid IDL" or "Nope, that's invalid
IDL" based on its documented structure in the CORBA/ISO specs. IDL itself
isn't SGML based (no tags, elements, etc.), but it is self-describing
(mostly, see below), so one could easily construct a DTD to capture all
the information already there.

As an XML instance, IDL might look something like this;

<module>MarkStuff
<interface>IMark
<exception>badInput</exception>
<method throws="badInput">giveMeOutput
<argument mod="in" type="string">inputVar</argument></method>
</interface>
</module>

Now, I'm still figuring out the nuances of DTD design, so no flames please!
8-) The DTD this implies isn't very good, since some of the content of
the interface is hidden in tag attributes, and the rule of thumb is
supposed to be that anything that is content should be what is being
marked up, not hidden in the tags. Just take it as a (bad) example. 8-(

But hopefully it makes my point.
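
To make the "valid IDL / invalid IDL" idea concrete, here is a minimal
sketch in Python (chosen only for brevity) that writes the containment
rules down explicitly and checks a document against them. The
ALLOWED_CHILDREN table and the structure_errors helper are hypothetical
names invented for this illustration; a real DTD and a validating parser
would do considerably more.

```python
from xml.dom import minidom

# Hypothetical containment rules, standing in for a DTD's content models:
# modules contain interfaces, interfaces contain exceptions and methods.
ALLOWED_CHILDREN = {
    "module": {"interface"},
    "interface": {"exception", "method"},
    "exception": set(),
    "method": {"argument"},
    "argument": set(),
}


def structure_errors(element):
    """Collect a message for every child element its parent may not contain."""
    errors = []
    allowed = ALLOWED_CHILDREN.get(element.tagName, set())
    for child in element.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue  # skip text such as the names "MarkStuff" or "IMark"
        if child.tagName not in allowed:
            errors.append(
                f"<{child.tagName}> not allowed inside <{element.tagName}>"
            )
        errors.extend(structure_errors(child))
    return errors


valid = minidom.parseString(
    "<module>MarkStuff<interface>IMark"
    "<method>giveMeOutput</method></interface></module>"
).documentElement
invalid = minidom.parseString(
    "<module>MarkStuff<method>giveMeOutput</method></module>"
).documentElement

print(structure_errors(valid))
print(structure_errors(invalid))
```

The first document follows the rules, so no errors are collected; the
second puts a method directly inside a module, which the table rejects.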

DOM is the "Document Object Model". It specifies an *interface* to a
serialized structured stream/document/whatever. So, given *any* structured
byte stream (Fe, Bento, Java serialization), I can construct a DOM
compliant implementation that will allow me to parse that stream. Once
again, DOM does not require you to physically store anything in XML.

[Frank's recent email notes that interfaces are models. I reckon that's
right, since "models" are just abstractions, and abstractions can happen
at any level. Where's David McCusker when you need him? 8-) ]

Now, XML is *one* type of structured stream format. It just so happens,
because of the size of the Web, that it's going to be *the* standard
one. How does this relate to DOM? Well, an XML doc is like Fe, Bento,
or Java serialization; it's just another storage model. DOM can be used
to parse it too.
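
As a rough illustration of DOM as an interface onto a stored document,
the sketch below uses xml.dom.minidom from the Python standard library
(one DOM implementation among many; Python is used only for brevity) to
walk the IDL-as-XML example from earlier. The leading_text helper is a
made-up convenience for this sketch, not part of DOM.

```python
from xml.dom import minidom

IDL_AS_XML = """<module>MarkStuff
<interface>IMark
<exception>badInput</exception>
<method throws="badInput">giveMeOutput
<argument mod="in" type="string">inputVar</argument></method>
</interface>
</module>"""


def leading_text(element):
    """Return the text that directly follows an element's opening tag."""
    first = element.firstChild
    if first is not None and first.nodeType == first.TEXT_NODE:
        return first.data.strip()
    return ""


doc = minidom.parseString(IDL_AS_XML)
module = doc.documentElement
interface = module.getElementsByTagName("interface")[0]
method = interface.getElementsByTagName("method")[0]
argument = method.getElementsByTagName("argument")[0]

# Everything below goes through DOM calls only; nothing here depends on
# the bytes having been stored as XML text.
summary = {
    "module": leading_text(module),
    "interface": leading_text(interface),
    "method": leading_text(method),
    "throws": method.getAttribute("throws"),
    "argument": leading_text(argument),
    "argument_type": argument.getAttribute("type"),
}
print(summary)
```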

This even holds for IDL in a text file, since IDL does provide markup
("module", "attribute", etc. are markup tags, just without the SGML-ish
'<' and '>' - the only exception is that there's no "method"
keyword/tag). One could build a DOM compliant implementation of an IDL
parser.
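
A toy version of that idea, sketched in Python: read a tiny subset of
IDL text and expose it as a DOM tree, so callers use ordinary DOM calls
and never see the IDL syntax. Assumptions to note: the IDL_TEXT sample is
one possible rendering of the XML example above, the idl_to_dom function
and the "name"/"returns" attributes are invented for this sketch, and
only "module", "interface", "exception" and keyword-less method
declarations are handled. It is nowhere near a complete IDL parser.

```python
import re
from xml.dom import minidom

IDL_TEXT = """
module MarkStuff {
  interface IMark {
    exception badInput {};
    string giveMeOutput(in string inputVar) raises (badInput);
  };
};
"""

# IDL keywords that open a braced block in this simplified subset.
BLOCK_KEYWORDS = {"module", "interface", "exception"}


def idl_to_dom(text):
    """Build a DOM document from a very small subset of IDL text."""
    tokens = re.findall(r"[A-Za-z_]\w*|[{}();,]", text)
    doc = minidom.Document()
    pos = 0

    def parse_body(parent):
        nonlocal pos
        while pos < len(tokens) and tokens[pos] != "}":
            if tokens[pos] in BLOCK_KEYWORDS:
                node = doc.createElement(tokens[pos])
                node.setAttribute("name", tokens[pos + 1])
                parent.appendChild(node)
                pos += 3  # skip keyword, name and "{"
                parse_body(node)
                pos += 2  # skip "}" and ";"
            else:
                # No "method" keyword exists in IDL, so any other
                # declaration is treated as a method here.
                end = tokens.index(";", pos)
                declaration = tokens[pos:end]
                node = doc.createElement("method")
                node.setAttribute("returns", declaration[0])
                node.setAttribute("name", declaration[1])
                parent.appendChild(node)
                pos = end + 1

    parse_body(doc)
    return doc


doc = idl_to_dom(IDL_TEXT)
method = doc.getElementsByTagName("method")[0]
print(doc.documentElement.getAttribute("name"), method.getAttribute("name"))
```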

RDF is no big deal. It's just an XML DTD that can be used to do trader-like
functionality on the web. Not that this isn't important and wonderful,
but for the purposes of this post, its purpose doesn't matter.

[I'm just going to rephrase my previous RDF note in the context of this
tutorial-like note]

What does matter about RDF for the purposes of this post is that it's
another example of the use of markup. Define a DTD that allows you to
clearly express the structure of the semantics you desire, and presto,
you've got a new service component you can hang off the web.

Actually, if you read the RDF draft, you'll note the comment that a DTD
isn't necessary to use RDF. This isn't a big deal, as you'll see below.

RDF requires that all documents wishing to participate as official "web
resources" document their semantics in its format. These semantics are
themselves documents which persist.

Contrast this with a Java Trader. Beans would need to describe their
semantics using the trading properties, and then presumably persist those
with serialization. But, because every Java object is a Bean, this
properties file is a Bean. And since it's a Bean, it would be serialized
to XML.

Now, it gets a bit tricky here. I found myself asking the question, "Ok,
but how's the JVM supposed to know that *this* Bean is supposed to go to
RDF" (or whatever other XML DTD you might want to serialize to)? The
answer is another ObjectWeb parallel - and a whopper it is, IMHO.

The structure represented by the DTD should be translated into a set of
structured Java Glasgow Beans. For example, if you check out the current
draft of the RDF spec, http://www.w3.org/TR/WD-rdf-syntax/, it gives
this example for a simple assertion;

<?namespace href="http://docs.r.us.com/bibliography-info" as="bib"?>
<?namespace href="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
<RDF:assertions href="http://www.bar.com/some.doc">
<bib:author>John Smith</bib:author>
</RDF:assertions>
</RDF:serialization>

Here's the Bean-ified version;

An "RDF.serialization" Bean would contain an "RDF.assertions" Bean that
contained a class variable called "href" statically initialized to
that URL. That Bean would contain a "bib.author" Bean that contained
the string "John Smith". The namespaces obviously map directly to Java
packages (though the package naming conventions will need consolidating
with web namespace naming).

Follow? How's that for an ObjectWeb parallel?! When you compose Beans,
you're building structure that can be represented with a DTD. And
vice-versa.
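
The parallel can be sketched in a few lines. This is Python rather than
Java, with dataclasses standing in for Beans purely to keep the example
short; the class names, the TAG class variable and the bean_to_element
helper are all hypothetical. One deliberate simplification: "href" is an
ordinary field here rather than a statically initialized class variable,
so that a single generic walk over the fields can emit it.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import ClassVar
from xml.dom import minidom


@dataclass
class Author:  # stands in for the "bib.author" Bean
    TAG: ClassVar[str] = "bib:author"
    text: str


@dataclass
class Assertions:  # stands in for the "RDF.assertions" Bean
    TAG: ClassVar[str] = "RDF:assertions"
    href: str
    author: Author


@dataclass
class Serialization:  # stands in for the "RDF.serialization" Bean
    TAG: ClassVar[str] = "RDF:serialization"
    assertions: Assertions


def bean_to_element(doc, bean):
    """Mirror the composition of the objects as nested elements."""
    element = doc.createElement(bean.TAG)
    for field in fields(bean):  # ClassVar entries such as TAG are skipped
        value = getattr(bean, field.name)
        if is_dataclass(value):
            element.appendChild(bean_to_element(doc, value))
        elif field.name == "text":
            element.appendChild(doc.createTextNode(value))
        else:
            element.setAttribute(field.name, value)
    return element


beans = Serialization(
    Assertions("http://www.bar.com/some.doc", Author("John Smith"))
)
xml_text = bean_to_element(minidom.Document(), beans).toxml()
print(xml_text)
```

The serializer knows nothing about RDF; the element nesting falls out of
how the objects are composed, which is the point of the parallel.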

The fact that, as I stated above, a DTD for RDF isn't required is no big
deal. So long as a set of composed Beans form a structure, that's what
matters. If that structure is expressible at design time via the Bean
classes, then it's expressible explicitly as a DTD. If the structure is
expressible only at runtime, then that would parallel the non-DTD based
implicit document structure.

MB

--
Mark Baker, Ottawa Ontario CANADA.                Java, CORBA, XML, Beans
http://www.iosphere.net/~markb               distobj@acm.org  ICQ:5100069

Will distribute business objects for food.