Re: Climbing Clueful Mountain

Mark Baker
Wed, 23 Apr 1997 00:01:09 -0400

Excellent post, sorry I didn't reply sooner. I've spent some
time trying to figure out what the TimBL followers were trying
to say with documents and markup. I think I've figured it out.

X-posting to dist-obj since it continues some posts I made
recently. Please follow up to both lists.

At 09:08 PM 4/4/97 -0500, Rohit Khare wrote:
>Compound documents are NOT training wheel technology -- they are deeply
>and fundamentally correct abstractions for global coordination.

You don't have to convince me - just Ron. My issue with them is
simply nomenclature; places *are* documents. No biggie though - I
can live with "document".

>One of the critical insights of the Web (and Gopher, etc, before it to
>varying degrees) is that _everything is a document_. A document can
>capture the state of a computation nearly completely, pickle it for a
>human, and re-present the same memes elsewhere. There is brilliance in
>deconstructing "online services" from an operational,
>draw-some-stuff-on-the-screen, take-input, repeat cycle to one with
>explicit, declarative cutpoints: 'this IS the state of the app at this

I agree, though I'd be happier if it were phrased as "this is
the state of these objects at this point". Just nitpicking.
I hate the word "application". It means nothing to me.

>This technology has been taken all the way to the limit already,
>with WinFrame and Broadway "documents" which gate fully interactive c/s
>apps over the Web. And HTML is one particularly good format for this,
>though there will be others.

HTML is good for this? How so? I can see XML, since it can carry
arbitrary context (via tags) within the document. HTML has no
context, unless you count headings, tables, paragraphs, etc.

I don't count that as a valid context. To me, a valid context is
<person>Mark Baker</person>, not <h1>Mark Baker</h1>. Even then,
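
To make the distinction concrete, here's a minimal sketch using Python's
standard xml.etree.ElementTree. The two sample documents and the people()
helper are my own assumptions for illustration; nothing here comes from an
actual XML or HTML toolchain discussed in this thread. The point is only that
a program can ask "who are the people?" of the first document, but not of the
second.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents: the same text, tagged two different ways.
semantic = "<doc><person>Mark Baker</person><city>Ottawa</city></doc>"
presentational = "<doc><h1>Mark Baker</h1><p>Ottawa</p></doc>"

def people(markup):
    """Return the text of every <person> element found in the markup."""
    return [el.text for el in ET.fromstring(markup).iter("person")]

semantic_people = people(semantic)              # the tag carries the context
presentational_people = people(presentational)  # <h1> says nothing about "person"
```

With the presentational markup the list comes back empty: the name is there,
but nothing says it is a person.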

>Let's talk some more about why documents (presumably, in place of RPC).

Not "in place of", more "in conjunction with". More on this below.

>DanC and I have had long arguments about where the distinction blurs. We
>decided we *could* talk about more ephemeral "artifacts" (e.g. the
>return value "4" for "2+2") and longer-lived "entities" (web pages). As
>far as I can argue, though, these are minor aspects of 'intent' -- where
>do the bits-on-the-wire change? Well, in the RPC scenario, the bits
>finally collapse into a stream of interdependent transactions and the
>state of the system cannot be neatly extracted into a summary document.

I don't follow. Why, using an object based structured storage
mechanism (e.g. Quilt), can't a set of objects be streamed into a
document? Serialization doesn't make an object lose its identity,
nor its ability to have a message chucked its way (though it will
need to be transparently reactivated).
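
Roughly what I have in mind, as a sketch only. This is not Quilt; the Account
class, the "acct:" identifiers, and the stream_out()/send() helpers are all
hypothetical, and JSON merely stands in for whatever the storage format would
be. It shows objects streamed into one document, with identity surviving the
trip and a message delivered after reactivation.

```python
import json

class Account:
    """Hypothetical object with a stable identity and one message it understands."""
    def __init__(self, oid, balance):
        self.oid = oid
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

def stream_out(objects):
    """Serialize a set of objects into a single document (JSON text here)."""
    records = [{"oid": o.oid, "balance": o.balance} for o in objects]
    return json.dumps({"objects": records})

def send(document, oid, message, *args):
    """Reactivate the object named by oid from the document, then deliver the message."""
    for record in json.loads(document)["objects"]:
        if record["oid"] == oid:
            obj = Account(record["oid"], record["balance"])
            return getattr(obj, message)(*args)
    raise KeyError(oid)

document = stream_out([Account("acct:1", 10), Account("acct:2", 50)])
result = send(document, "acct:2", "deposit", 5)
```

The sketch does not write the new state back into the document; a real
structured store would have to do that inside a transaction.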

Documents can be operated on within transactional contexts. Their
contained objects share this context. As such, the state of the
system exists within that document and context.

>RPC bits become defined on the wire by the whims of protocol-stub
>compilers (IDL, RMI, etc). They lose explicit meta-information about
>creator, modification date, cacheability, and most of all, they lose
>their name: artifacts can not be addressed de novo.

Oliver Sims talks about this at great length in his book. His Newi
product/infrastructure relays information-with-context on-the-wire
(semantic data streams - SDS).

The OMG strayed from this approach. Their strategy is to delegate
meta-data to a Meta Object Facility, thereby requiring a client to
use the infrastructure to interpret it. I wonder if the apparent
gap between Web/dist-obj types is due primarily to this difference?
It was certainly the chasm I needed to cross before I figured I
understood where you guys were coming from.

I like the simplicity of contextually tagged markup (XML), but I
appreciate the robustness of the MOF (a central, though federated,
repository of all metadata).

>Entities always have names, even if they're short-lived radiated bits
>emanating from an unknowable oracle (like a CGI script or a webcam).
>Each page I see has a name, cache tags, etc. Most of all, pages aim to
>explain one concept to one human being (we don't take well to "here are
>twenty pieces of your response, you piece it together"). This increases
>the *semantic* grain to being "a quantum of useful information to a

Yep. This is why I like "places". "Here's an office, you'll
find every person and thing you need to do your job in this place".

>So now we have two identifiable differences between artifacts and
>entities... and I think I still argue that artifacts are unnecessary --
>you can choose to view the results of a remote method invocation as a
>really small "document".

It is! I don't think there's any "choose" here. So long as the
result of this method invocation returns self-describing content,
isn't it, by definition, a document?
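
A small sketch of what "self-describing" might look like for the "2+2"
example. The ResultDocument type, its field names, and the "calc:" naming
scheme are assumptions of mine, picked to mirror the meta-information
mentioned above (name, creator, cacheability). I've left out a modification
date only to keep the example deterministic.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ResultDocument:
    """Hypothetical envelope: a method result that carries its own description."""
    name: str          # an address for the result, so it can be referred to later
    creator: str       # who produced it
    cacheable: bool    # whether an intermediary may keep a copy
    content_type: str  # how to interpret the body
    body: str          # the result itself

def add(a, b):
    """Return the sum as a small document rather than a bare value."""
    return ResultDocument(
        name=f"calc:add/{a}/{b}",
        creator="calculator",
        cacheable=True,
        content_type="text/plain",
        body=str(a + b),
    )

doc = add(2, 2)
fields = asdict(doc)  # the same result as plain name/value pairs
```

The bare "4" is still in there as doc.body; everything else is what a raw RPC
return value would normally throw away.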

>What you gain is fractal self-similarity (it's
>documents all the way down) and you gain a limiting bounding factor on
>complexity. Just as humans choose to communicate in sentences (at worst,
>identifiable fragments) rather than in Morse code (a "bit channel"
>instead of a "thought channel"), I think the minimum unit of
>composability should be some recognizable, economically-identifiable
>task ("record Seinfeld") instead of an operational recipe ("turn on the
>tv; use IR remote to change channel to 7; send record() message;")

The recipe is encapsulated within a task object (script or business
process object). The task object is a document, by definition, but
I don't see why "record Seinfeld" is any more recognizable than
"I'm away from the house on Thursday night".

Moreover, "turn on the tv" is a recipe in another context.

You can't mandate granularity. The best you can do is provide the
ability to compose self-similarly.
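
Here's a sketch of composing self-similarly, under my own assumptions: the
Task class is hypothetical, and the steps "find remote", "press power" and
"set thermostat back" are invented just to give the recipes something to
contain. Every level has the same shape, so no one level is privileged as
"the" granularity.

```python
class Task:
    """A task is either a primitive step or a recipe of sub-tasks (hypothetical model)."""
    def __init__(self, name, steps=()):
        self.name = name
        self.steps = list(steps)

    def flatten(self):
        """Expand the task into the primitive steps it ultimately performs."""
        if not self.steps:
            return [self.name]
        return [leaf for step in self.steps for leaf in step.flatten()]

# "turn on the tv" is a step in one context and a recipe in another.
turn_on_tv = Task("turn on the tv", [Task("find remote"), Task("press power")])
record = Task("record Seinfeld", [
    turn_on_tv,
    Task("change channel to 7"),
    Task("send record() message"),
])
away = Task("I'm away from the house on Thursday night", [
    record,
    Task("set thermostat back"),
])

record_steps = record.flatten()
away_steps = away.flatten()
```

Whoever holds "away" never needs to know that "record Seinfeld" is itself
made of parts, and the same goes one level down.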

><Random trivia: there have been 345 billion OREOs manufactured to date>

That's a lot of tongue cramps!

>-- was that sentence a document? It could be, I think it should be. We
>can leverage so much now: I can cache this fact, annotate it with a
>confidence rating, protect it, etc.

Yes, it's definitely a document.

>[at this point, adam will have to kick in to translate - I can't explain
>it better right now. Probably the three glasses of Amaretto di Saronno
>I've had during this edit session...]


>In short, I think documents-representing-task-state ARE how our pagers
>and cars and yes, even diapers, will be composed and coordinated into
>larger systems. (see d-o). Places are just another collection of links
>in the form of a document...

That's another problem I have with markup. Why is a place a collection
of links? Why isn't it a document of objects, "stored" by value, with
their identity intact?

>I would like to tap Ernie's argument, too: that documents make an
>appropriate, self-similar building block for the information universe
>(no TM).


(Transcendental Meditation?)

>My long-term battle is yanking the Internet service model up
>from the minutiae of IP packets to this level precisely because we can
>only then apply document-semantics-enhancements like
>payment-for-information value, security, store-and-forward routing (over
>munchkin nets)...

I agree if we're talking about documents as the results of method


Mark Baker, Ottawa Ontario CANADA.               Java, CORBA, OpenDoc, Beans,   

Will distribute business objects for food.