Generalizing the SGML/XML information model and Releasing MONDO (fwd)

Mark Baker (markb@iosphere.net)
Thu, 20 Nov 1997 14:44:31 -0500 (EST)


It looks like somebody beat us to it. Cool. Caltech alumnus too.

Here's the paragraph I'm referring to:

"MONDO will also have a reference implementation in Java (prototypes were
in Java, Perl, and Smalltalk). The current reference implementation
includes frameworks and tools for the normal document-oriented tasks and
also for some more general or object-oriented capabilities. As an
example of the latter, MONDO can serialize and deserialize Java objects
to human readable (XML or OML) encodings."

I wonder if he posted this in response to my xml-dev post yesterday?
Quirky timing.

Be sure to check out http://www.chimu.com. Interesting stuff.

MB

--
Mark Baker, Ottawa Ontario CANADA.                Java, CORBA, XML, Beans
http://www.iosphere.net/~markb               distobj@acm.org  ICQ:5100069

Will distribute business objects for food.

---------- Forwarded message ----------
Date: Thu, 20 Nov 1997 10:56:11 -0800 (PST)
From: Mark L. Fussell <fussellm@alumni.caltech.edu>
To: xml-dev@ic.ac.uk
Subject: Generalizing the SGML/XML information model and Releasing MONDO

[This is a long email so I will also put it online at "http://www.chimu.com/projects/mondo/" ]

Some recent discussions on xml-dev and c.t.sgml have included query languages, encoding complex information (trees, graphs, etc.), object serialization, and extended metamodeling. I recommend enlarging the scope of these discussions and thinking about aligning SGML/XML with other disciplines that can help accomplish these tasks. This alignment would take advantage of the tools and techniques that are already available in other industries: not just by duplicating their designs but by actually merging with more general capabilities. Although alignment has been done successfully in some areas of SGML/XML, I think it is conspicuously lacking in a crucial place: SGML's information model. If this particular weakness is addressed by drawing on well-established industries, an abundance of other needs becomes much easier to satisfy.

Generalizing the SGML/XML information model
-------------------------------------------

The desired applications of SGML/XML have grown beyond the original focus on documents towards working with much more general information and processing. SGML is a combination of encoding technology and an information modeling language. But that modeling language (DTDs and Groves) is very weak and is constrained by being focused on document-oriented information. It is also esoteric and not equivalent to any of the mainstream information modeling approaches.

I recommend considering modeling separately from encoding technology. For modeling I think object-oriented information models can subsume SGML's document-oriented models and provide the ability to handle much more advanced models. Object-oriented information models can be very general, expressive, and understandable. This allows them to model many types of information equally well: both document-oriented and more general information. The strength of object-oriented information modeling has resulted in an abundance of good analysis, patterns, and specific models being built using it.

This last point is the most important. If SGML/XML aligns with the information modeling industry, many more tools will immediately become available. For describing models you can use the Unified Modeling Language (UML) and tools such as Rational Rose (and several other techniques and tools). Implementing models can be done very easily with most OO languages (with or without generic frameworks), and the resulting implementation can be far more knowledgeable about the semantics of the information it is working with. There are many products that provide persistence and UI presentation that are designed to work with OO DomainModels. There are standard query languages (OQL/SQL) and interface languages (CORBA/IDL). The information modeling industry provides an extensive list of high-quality technologies, standards, and techniques.

There has been a lot of great work done with SGML/XML in both modeling (DTDs) and technologies (e.g. HyTime). If this quality work is integrated into the common environment of OO information modeling and OO technologies then it will be available to a larger audience. It will also frequently become easier to understand and more capable because it can take advantage of the inherent abilities of OO models. For example, much of HyTime addressing is very easily and flexibly described in terms of object associations. HyTime becomes more powerful in the general object context.

This isn't to say everything is easy. There are still the issues of how to work with different information models on different technologies (e.g. how smart the objects are) and what additional technologies need to be provided to reproduce expected SGML functionality (e.g. HyTime, or extending OQL (through object methods) with containment-closure abilities). And some tools would never be generalized, either because the SGML DTD and Grove models are sufficient for the task or because the tool is of too high a quality to risk moving (e.g. Jade).

Overall, I think the benefits will be enormous.

MONDO
-----

I have been working on a project (called MONDO) to prove the benefits of this alignment and to provide an architecture and the frameworks to support it. MONDO is primarily an architecture: it describes the components (e.g. ObjectBuilder, DomainModel, ObjectEncoder), their responsibilities, and the interfaces among those components. It is meant to be open and language neutral.

MONDO will also have a reference implementation in Java (prototypes were in Java, Perl, and Smalltalk). The current reference implementation includes frameworks and tools for the normal document-oriented tasks and also for some more general or object-oriented capabilities. As an example of the latter, MONDO can serialize and deserialize Java objects to human-readable (XML or OML) encodings.
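To make the serialization idea concrete, here is a minimal sketch in Java of round-tripping an object through a human-readable, XML-like encoding. It is not MONDO's API: the names `Author` and `SketchXmlCodec`, and the one-element-per-field layout, are assumptions made purely for illustration. The sketch handles only `String` and `int` fields and would write a null string as the text "null".

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical domain class, not part of MONDO.
class Author {
    String name;
    int papers;

    Author() {}  // no-arg constructor, used when decoding
    Author(String name, int papers) { this.name = name; this.papers = papers; }
}

// Hypothetical codec: one XML element per instance field, named after the field.
class SketchXmlCodec {

    // Fields that are static or compiler-generated are not part of the object's state.
    static boolean skip(Field f) {
        return f.isSynthetic() || Modifier.isStatic(f.getModifiers());
    }

    static String encode(Object obj) throws IllegalAccessException {
        Class<?> type = obj.getClass();
        StringBuilder xml = new StringBuilder("<" + type.getSimpleName() + ">\n");
        for (Field f : type.getDeclaredFields()) {
            if (skip(f)) continue;
            f.setAccessible(true);
            xml.append("  <").append(f.getName()).append(">")
               .append(escape(String.valueOf(f.get(obj))))
               .append("</").append(f.getName()).append(">\n");
        }
        return xml.append("</").append(type.getSimpleName()).append(">\n").toString();
    }

    static <T> T decode(String xml, Class<T> type) throws ReflectiveOperationException {
        T obj = type.getDeclaredConstructor().newInstance();
        for (Field f : type.getDeclaredFields()) {
            if (skip(f)) continue;
            f.setAccessible(true);
            // Find the element that carries this field's value.
            Pattern element = Pattern.compile(
                "<" + f.getName() + ">(.*?)</" + f.getName() + ">", Pattern.DOTALL);
            Matcher m = element.matcher(xml);
            if (!m.find()) continue;  // field absent from the encoding: keep the default
            String text = unescape(m.group(1));
            if (f.getType() == int.class) f.setInt(obj, Integer.parseInt(text));
            else if (f.getType() == String.class) f.set(obj, text);
        }
        return obj;
    }

    // Escape "&" first so the entities produced for "<" and ">" are not re-escaped.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Unescape "&amp;" last, mirroring escape().
    static String unescape(String s) {
        return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }

    public static void main(String[] args) throws Exception {
        String xml = encode(new Author("Fussell & Co", 3));
        System.out.print(xml);
        Author copy = decode(xml, Author.class);
        System.out.println(copy.name + " / " + copy.papers);
    }
}
```

A real framework would presumably also deal with object identity, associations between objects, and type information, none of which this sketch attempts.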

I have been working on MONDO for quite a while and have been producing tangibles (i.e. designs, documentation, and code) off and on for a bit more than a year. This is the first time I am releasing them openly. The WWW site currently has some FAQs, some references (extracted from the design document), and placeholders and timelines for expected additions. The references may be especially useful because they provide a sampling of the integration from these multiple fields. I hope to have the design document (first pass is about 80 pages) up on the web site by early next week and will start putting up the reference code shortly thereafter.

The MONDO WWW site is at: http://www.chimu.com/projects/mondo/

As an example (teaser ;-) of the MONDO design, I have included a couple of (non-sequential but related) paragraphs below.

======
ObjectBuilder

The responsibility of the ObjectBuilder is to build all or part of the Objectbase from an external source. Generally this source will be a human-readable text file, but there are several stages to ObjectBuilding which can each have different approaches (e.g. we could read from a binary file instead). Assuming we have a textual file-based approach, ObjectBuilding would go through three stages:

  1. Read from the text file and produce a stream of text
  2. Parse the text and turn it into a recipe (what objects to build and what ingredients to use)
  3. Build the recipe and construct objects within the DomainModel

-------

Recipes for building objects

A recipe describes how to build a collection of associated objects. All the information that is placed into the DomainModel by MONDO is the result of building recipes. By formalizing recipes we separate the encoding of information (e.g. whether it is human readable and how to parse it) from what information is in the encoding. MONDO uses that information to construct the knowledge in a form we want to work with, the Objectbase.
======
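The three ObjectBuilding stages and the role of the recipe can be sketched in Java as below. This is only an illustration under stated assumptions, not the MONDO design or its interfaces: the class names (`RecipeStep`, `BuiltObject`, `SketchObjectbase`, `SketchObjectBuilder`) and the toy line syntax `Kind key=value ...` are invented here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical recipe step: what kind of object to build and which ingredients to use.
class RecipeStep {
    final String kind;
    final Map<String, String> ingredients = new LinkedHashMap<>();
    RecipeStep(String kind) { this.kind = kind; }
}

// Hypothetical built object: a kind plus named attribute values.
class BuiltObject {
    final String kind;
    final Map<String, String> attributes;
    BuiltObject(String kind, Map<String, String> attributes) {
        this.kind = kind;
        this.attributes = attributes;
    }
}

// Hypothetical stand-in for the Objectbase: simply the list of objects built so far.
class SketchObjectbase {
    final List<BuiltObject> objects = new ArrayList<>();
}

class SketchObjectBuilder {

    // Stage 1: read the source and produce a stream of text (here: trimmed, non-blank lines).
    static List<String> read(String source) {
        List<String> lines = new ArrayList<>();
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) lines.add(trimmed);
        }
        return lines;
    }

    // Stage 2: parse the text into a recipe. Assumed toy syntax: "Kind key=value key=value".
    static List<RecipeStep> parse(List<String> lines) {
        List<RecipeStep> recipe = new ArrayList<>();
        for (String line : lines) {
            String[] tokens = line.split("\\s+");
            RecipeStep step = new RecipeStep(tokens[0]);
            for (int i = 1; i < tokens.length; i++) {
                String[] pair = tokens[i].split("=", 2);
                if (pair.length != 2) {
                    throw new IllegalArgumentException("bad ingredient: " + tokens[i]);
                }
                step.ingredients.put(pair[0], pair[1]);
            }
            recipe.add(step);
        }
        return recipe;
    }

    // Stage 3: build the recipe, constructing objects inside the objectbase.
    static SketchObjectbase build(List<RecipeStep> recipe) {
        SketchObjectbase base = new SketchObjectbase();
        for (RecipeStep step : recipe) {
            base.objects.add(new BuiltObject(step.kind, new LinkedHashMap<>(step.ingredients)));
        }
        return base;
    }

    public static void main(String[] args) {
        String source = "Author name=Fussell\nProject name=MONDO language=Java\n";
        SketchObjectbase base = build(parse(read(source)));
        for (BuiltObject object : base.objects) {
            System.out.println(object.kind + " " + object.attributes);
        }
    }
}
```

Because `build` sees only the recipe, a different `read` and `parse` pair (for a binary file, say) could feed the same building stage unchanged, which is the separation of encoding from information described above.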

Any feedback on MONDO or these concepts is appreciated, and I hope they contribute to some of the topics that have been addressed recently. I will let people know when the main design document is online and when the code to work with is downloadable. If you are interested in MONDO for your application or want to help with the project, let me know.

--Mark
mark.fussell@chimu.com

  i     ChiMu Corporation      Architectures for Information
 h M    info@chimu.com         Object-Oriented Information Systems
C   u   www.chimu.com          Architecture, Frameworks, and Mentoring
