[FoRK] Alternatives to XML config files... a hack

rst at ai.mit.edu
Fri Feb 25 09:07:54 PST 2005

So, I've been playing with a hack that might be of interest to some
people here -- particularly if you're interested in alternate
encodings of XML.

My major personal exposure to XML these days is in the config files of
various Java web application development toolkits.  (I'm using
Hibernate to deal with the database, Tapestry to handle UI-related
chores, and Spring for glue code and transaction management; each of
these has its own XML config files, as do Ant and the servlet
container machinery itself.)  And I find myself feeling the same as a
lot of other folks out there -- it's nice that there's a common basic
syntax for all these things, but couldn't we have chosen one that's
easier for people to actually read?

One response to that is YAML -- a format which is actually *intended*
to be used for config files.  Unfortunately, though, I'm stuck with
the tools I have, which want XML inputs, and YAML can't be converted
unambiguously to XML.  So, what I've done instead is specify a
YAML-like syntax which is general enough to express everything in the
config files of all my tools.  (It's got no PIs, external unparsed
entities, or anything like that, but neither does my code).  While I'm
at it, I've thrown in a little context-sensitive macro processing,
which knocks back the size of Spring bean descriptors quite a bit; I'm
left with bean descriptors that look like this:

  bean "mySessionFactory" 
    "mappingDirectoryLocations": list: "classpath:"
    "dataSource": ref "myDataSource"
        props: "hibernate.dialect":  "${hibernate.dialect}"
               "hibernate.show_sql": "${hibernate.show_sql}"

which I find a whole lot easier to read than the original XML.

Having done this, the question is how to feed the new syntax to Spring
-- and every other toolkit I've got -- without rewriting all the
toolkits.  Here's where I've gotten a bit underhanded.  The Xerces XML
parser is advertised to have a pluggable architecture which allows
you to replace components -- including the document scanner.  So, I've
produced a Xerces configuration which has three content scanners,
instead of the usual two -- XML 1.0, XML 1.1, and a third, for my
mutant syntax, which is used for any document whose first non-blank
character is *not* '<'.  (I still have to be able to process standard
XML correctly, or the world would explode).  Other components of
Xerces -- the DTD scanner and DTD validator, for instance -- are still
there, so the results of the macroexpansion can be validated against
the original spring-beans.dtd, which is kind of nice.
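
The dispatch rule described above (look at the first non-blank
character, and hand anything that doesn't start with '<' to the third
scanner) can be sketched in a few lines of Java.  This is only an
illustration of the rule, not the code from this post and not a Xerces
API: the class name SyntaxSniffer, the Syntax enum, and the choice to
treat empty or all-blank input as XML are assumptions made up for the
example, and a real scanner would peek at a stream rather than take the
whole document as a string.

```java
// Hypothetical helper illustrating the dispatch rule; not part of Xerces.
final class SyntaxSniffer {

    // The two families of input the parser configuration has to tell apart.
    enum Syntax { XML, ALTERNATE }

    // "Blank" here means the four XML whitespace characters (an assumption).
    private static boolean isBlank(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // Inspects the text without consuming it and reports which scanner
    // should handle it: XML if the first non-blank character is '<',
    // the alternate syntax otherwise.
    static Syntax sniff(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!isBlank(c)) {
                return c == '<' ? Syntax.XML : Syntax.ALTERNATE;
            }
        }
        // Assumption: empty or all-blank input goes to the standard XML
        // scanner, so the usual "no root element" error is reported.
        return Syntax.XML;
    }

    public static void main(String[] args) {
        System.out.println(sniff("<?xml version=\"1.0\"?><beans/>"));
        System.out.println(sniff("bean \"mySessionFactory\""));
    }
}
```

Because the check only peeks at the input, ordinary XML documents reach
the standard scanners untouched, which is the property the hack depends
on.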

If any of this sounds fun to play with, the code's here:


Have at it.

