Re: [Earthweb] Simple XML = nothing but angle brackets, baby.


From: Lucas Gonze (lucas@worldos.com)
Date: Wed Aug 16 2000 - 10:58:44 PDT


Yeah man, that 1MB for Xerces is a killer. No matter how much I slim down
my app, Xerces keeps it a bit big. I notice that Magi also bundles
Xerces....

The original WorldOS protocol was on a minimalist track (as far as XML
featureset) so that parsers could be small. But now we're planning to add a
SOAP 1.1 transport handler, which uses a lot more XML features. Anybody
know if MinXML covers everything in SOAP 1.1?

----- Original Message -----
From: Adam Rifkin -4K <adam@XeNT.ics.uci.edu>
To: <FoRK@XeNT.CoM>
Sent: Tuesday, August 15, 2000 11:18 PM
Subject: [Earthweb] Simple XML = nothing but angle brackets, baby.

> Is it true that most developers really only want this?
> > Yet, in practice, few developers use or need all the features of XML.
> > Most developers are comfortable with a small subset and most documents
> > look like Listing 1: a list of data enclosed in tags and angle brackets.
>
> [Insert random nostalgia about YML discussions here.
>
> http://www.cs.caltech.edu/~adam/papers/xml/yml.html ]
>
> Found the following article at:
>
>
> http://gamelan.earthweb.com/earthweb/cda/dlink.resource-jhtml.72.1082.|repository||softwaredev|content|article|2000|08|10|SDmarchalmini|SDmarchalmini~xml.41.jhtml?cda=true
>
>
> > Minimal XML and Java
> >
> > If Java is the programming language of the Internet, XML is the data
> > format. It is a simple markup language, but unlike HTML, it is
> > extensible -- meaning you can use any data model. Here, we look at a
> > handy, small-scale version of XML.
> >
> > Published August 10, 2000
> > By Benoit Marchal
> >
> > With XML, the W3C's goals were to develop a simple markup language to
> > fit a wide range of applications. In many respects, the W3C succeeded.
> > XML is powerful, yet not so complicated that one cannot learn it in a
> > few days.
> >
> > Reality Check
> >
> > Yet, in practice, few developers use or need all the features of XML.
> > Most developers are comfortable with a small subset and most documents
> > look like Listing 1: a list of data enclosed in tags and angle brackets.
> >
> > <Exchange>
> >    <Rate>
> >       <Currency>BEF</Currency>
> >       <Value>41.97</Value>
> >    </Rate>
> >    <Rate>
> >       <Currency>CAD</Currency>
> >       <Value>1.46</Value>
> >    </Rate>
> >    <Rate>
> >       <Currency>GBP</Currency>
> >       <Value>0.66</Value>
> >    </Rate>
> >    <Rate>
> >       <Currency>FRF</Currency>
> >       <Value>6.83</Value>
> >    </Rate>
> >    <Rate>
> >       <Currency>DEM</Currency>
> >       <Value>2.04</Value>
> >    </Rate>
> > </Exchange>
> > Listing 1: rates.min.
> >
> > In response to this situation, a group of developers have formed the
> > SML-DEV mailing list to develop the Simple Markup Language:
> >
> > http://www.egroups.com/group/sml-dev
> >
> > So far their efforts have produced two documents:
> >
> > Common XML - http://www.docuverse.com/smldev/commonxml.html
> > Minimal XML - http://www.docuverse.com/smldev/minxml.html
> >
> > Both are useful to Java developers, although for different reasons.
> >
> > Common XML defines a safe subset of XML. Indeed, even though XML is a
> > standard, there are small differences in how vendors implement it. To
> > avoid being burned by the differences, Common XML defines a subset that
> > works reliably with every parser.
> >
> > Minimal XML
> > The most interesting effort, however, is the Minimal XML (or MinXML)
> > language.
> > To understand MinXML, you must remember that XML has two main
> > applications: publishing and data exchange. Publishers use XML to manage
> > large Web sites or other documents. eCommerce vendors use XML for data
> > exchange, synchronization, or enterprise application integration (eAI).
> >
> > Some constructs in XML exist solely for publishing applications. Mixed
> > content is a prime example. Publishing applications often need to mix
> > text and tags, such as:
> >
> > <para>Visit <xlink:simple
> > href="http://www.gamelan.com">Gamelan</xlink:simple>.</para>
> >
> > However, mixed content is a distraction for data applications. Indeed,
> > XML parsers pass indenting and spaces to applications that ignore them.
> > In the following example, the XML parser reports three spaces before the
> > Currency and Value elements, but the application probably would ignore
> > the spaces:
> >
> > <Rate>
> >    <Currency>BEF</Currency>
> >    <Value>41.97</Value>
> > </Rate>
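
The whitespace point is easy to check with a standard parser. The sketch below is an illustration and not part of the article: it assumes the SAX2 parser bundled with a current JDK (javax.xml.parsers) rather than Min, and the class name WhitespaceDemo is hypothetical. It counts the whitespace characters that a regular parser hands to characters() for a Rate element indented with three spaces.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; uses the JDK's bundled SAX2 parser, not Min.
class WhitespaceDemo {
    /** Returns how many whitespace characters the parser reports via characters(). */
    static int countWhitespaceChars(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                for (int i = start; i < start + length; i++) {
                    if (Character.isWhitespace(ch[i])) {
                        count[0]++;
                    }
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Rate>\n"
                   + "   <Currency>BEF</Currency>\n"
                   + "   <Value>41.97</Value>\n"
                   + "</Rate>";
        System.out.println("whitespace characters reported: "
            + countWhitespaceChars(xml));
    }
}
```

Without a DTD the parser cannot tell that this whitespace is insignificant, so it reports it as ordinary character data and leaves the filtering to the application.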
> >
> > A Practical Subset
> >
> > With MinXML, SML-DEV proposes to further simplify XML for data
> > applications by removing the options that were introduced for publishing.
> > This subset is particularly interesting for Java developers for at least
> > two reasons:
> >
> > * Most applications that use Java and XML fall into the eCommerce or eAI
> > range, so they will benefit from the tight focus on a practical subset.
> >
> > * MinXML is easier to implement. As we will see, MinXML parsers are
> > dramatically smaller and sometimes faster than regular XML parsers.
> >
> > A word of warning: Before you decide that MinXML is the right solution
> > for you, you must realize that it does not cover as many applications as
> > plain XML. MinXML is heavily biased towards some data applications. If
> > you are in eCommerce, it might be right for you. If you are in
> > publishing, you need the full XML feature set. Also, remember that
> > MinXML is a subset of XML. Every MinXML document is also an XML
> > document, so you could use a regular XML parser but limit yourself to
> > the MinXML subset.
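
One way to stay inside the subset while still using a regular parser is to check documents for constructs that MinXML leaves out. The following is a rough sketch under stated assumptions: it uses the JDK's bundled SAX2 parser, it tests only for mixed content (the one removed construct the article names), and MixedContentCheck is a hypothetical name. It is not a MinXML conformance checker.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: flags elements that contain both text and child elements.
class MixedContentCheck {
    static boolean hasMixedContent(String xml) throws Exception {
        final boolean[] mixed = {false};
        DefaultHandler handler = new DefaultHandler() {
            // One frame per open element: [0] = saw non-whitespace text,
            // [1] = saw a child element.
            private final Deque<boolean[]> open = new ArrayDeque<>();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (!open.isEmpty()) {
                    boolean[] parent = open.peek();
                    parent[1] = true;
                    if (parent[0]) mixed[0] = true;
                }
                open.push(new boolean[2]);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (open.isEmpty()) return;
                for (int i = start; i < start + length; i++) {
                    if (!Character.isWhitespace(ch[i])) {
                        boolean[] current = open.peek();
                        current[0] = true;
                        if (current[1]) mixed[0] = true;
                        return;
                    }
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.pop();
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return mixed[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hasMixedContent(
            "<Rate><Currency>BEF</Currency><Value>41.97</Value></Rate>"));
        System.out.println(hasMixedContent(
            "<para>Visit <link>Gamelan</link>.</para>"));
    }
}
```

Whitespace-only text between elements is deliberately not counted as content, so indented data documents like Listing 1 pass the check.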
> >
> > MinXML Parsers
> >
> > XML parsers have put on weight recently. Although James Clark's XP is a
> > light 162 KB, Apache's Xerces now weighs more than 1 MB, and it keeps on
> > growing. For some applications, size matters. In this respect, MinXML
> > parsers shine, as they are typically less than 20 KB.
> >
> > A simpler markup language means that the parser has less work, so it is
> > faster, too. Early measurements show that MinXML parsers beat regular
> > XML parsers hands down.
> >
> > As of this writing, four MinXML parsers are available:
> >
> > Min is the closest thing to an official MinXML parser. It supports the
> > familiar SAX API and takes less than 20 KB.
> >
> > John Wilson's Minimal XML parser is a tiny parser originally designed
> > for embedded computing.
> >
> > Shawn Silverman's Minimal XML parser in Java fits in just one Java
> > class.
> >
> > JaSMin by Sjoerd Visscher is a 50-line parser written in JavaScript.
> >
> > Min
> >
> > Using Min, the tiny MinXML parser, is no different from using a regular
> > SAX parser, as Listing 2 demonstrates. Start by writing an event
> > handler: Exchange inherits from HandlerBase and overrides the
> > startElement(), endElement() and characters() methods. The parser
> > calls these methods when it encounters a start tag, end tag, or
> > character content, respectively.
> >
> > This event handler parses documents like Listing 1 and fills a
> > java.util.Dictionary with exchange rates. The major issue when writing a
> > SAX event handler is to track where the application is in the document.
> > There are no direct relationships among the events, and the parser
> > provides no structure. Fortunately, since an XML document is a tree, it
> > is easy to use a java.util.Stack to track the position in the document.
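
To make the stack idea concrete, here is a small sketch that is not from the article: it keeps the names of the currently open elements on a stack and uses them to label each piece of text with its position in the tree. It assumes the JDK's bundled SAX2 parser instead of Min and uses ArrayDeque in place of java.util.Stack; PathTracker and leafValues are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports each text value together with its element path.
class PathTracker {
    static List<String> leafValues(String xml) throws Exception {
        final List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // Names of the open elements, root first.
            private final Deque<String> path = new ArrayDeque<>();
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                path.addLast(qName);   // descend one level
                text.setLength(0);     // drop whitespace seen before this tag
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    out.add("/" + String.join("/", path) + "=" + value);
                }
                path.removeLast();     // climb back up one level
                text.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Exchange><Rate><Currency>BEF</Currency>"
                   + "<Value>41.97</Value></Rate></Exchange>";
        for (String line : leafValues(xml)) {
            System.out.println(line);
        }
    }
}
```

Every start tag pushes and every end tag pops, so the stack always mirrors the chain of ancestors of the current event, which is exactly the structure the parser itself does not provide.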
> >
> > The main() method creates the parser, registers the event handler, and
> > prints various exchange rates through the compute() method.
> >
> > import java.io.*;
> > import java.util.*;
> > import org.xml.sax.*;
> > import org.xml.sax.helpers.*;
> > import com.docuverse.min.util.*;
> >
> > public class Exchange
> >    extends HandlerBase
> > {
> >    // Holds the text buffer or parsed value of the elements being read
> >    protected Stack stack = new Stack();
> >    // Maps currency code (String) to exchange rate (Double)
> >    protected Dictionary rates = new Hashtable();
> >
> >    // Start tag: open a fresh text buffer for Currency and Value
> >    public void startElement(String name,
> >                             AttributeList attributes)
> >    {
> >       if(name.equals("Currency"))
> >          stack.push(new StringBuffer());
> >       else if(name.equals("Value"))
> >          stack.push(new StringBuffer());
> >    }
> >
> >    // End tag: replace the buffer with its parsed value, or store
> >    // the completed Currency/Value pair when a Rate closes
> >    public void endElement(String name)
> >    {
> >       if(name.equals("Currency"))
> >       {
> >          StringBuffer c = (StringBuffer)stack.pop();
> >          stack.push(c.toString());
> >       }
> >       else if(name.equals("Value"))
> >       {
> >          StringBuffer v = (StringBuffer)stack.pop();
> >          stack.push(Double.valueOf(v.toString()));
> >       }
> >       else if(name.equals("Rate"))
> >       {
> >          // Value was pushed last, so it comes off first
> >          Double v = (Double)stack.pop();
> >          String c = (String)stack.pop();
> >          rates.put(c,v);
> >       }
> >    }
> >
> >    // Character content: append to the open buffer, if there is one
> >    public void characters(char ch[], int start, int length)
> >    {
> >       // Text outside Currency/Value can arrive while the stack is
> >       // empty; peek() would then throw EmptyStackException
> >       if(stack.empty())
> >          return;
> >       Object o = stack.peek();
> >       if(o instanceof StringBuffer)
> >          ((StringBuffer)o).append(ch,start,length);
> >    }
> >
> >    // Writes the amount converted at every known rate
> >    public void compute(Writer writer,
> >                        double amount)
> >       throws IOException
> >    {
> >       Enumeration keys = rates.keys();
> >       while(keys.hasMoreElements())
> >       {
> >          String currency = (String)keys.nextElement();
> >          double rate =
> >             ((Double)rates.get(currency)).doubleValue();
> >          writer.write(currency);
> >          writer.write(": ");
> >          writer.write(Double.toString(amount * rate));
> >          writer.write('\n');
> >       }
> >       writer.flush();
> >    }
> >
> >    // args[0] is the MinXML file, args[1] the amount to convert
> >    static public void main(String[] args)
> >       throws IOException, SAXException,
> >              InstantiationException, ClassNotFoundException,
> >              IllegalAccessException
> >    {
> >       Parser parser =
> >          ParserFactory.makeParser("com.docuverse.min.Parser");
> >       Exchange handler = new Exchange();
> >       parser.setDocumentHandler(handler);
> >       parser.parse(args[0]);
> >       handler.compute(new OutputStreamWriter(System.out),
> >                       Double.valueOf(args[1]).doubleValue());
> >    }
> > }
> >
> > Listing 2: Exchange.java.
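
For anyone without the Min and SAX1 jars at hand, here is a sketch of the same idea against SAX2 and the parser bundled with a current JDK. This is an assumption-laden port, not the article's code: it runs on a full XML parser rather than Min, it replaces the stack with two plain fields (enough for this flat format), and the names ExchangeSax2 and parseRates are hypothetical.

```java
import java.io.File;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX2 port of the Exchange handler; uses the JDK parser, not Min.
class ExchangeSax2 extends DefaultHandler {
    private final Map<String, Double> rates = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    private String currency;   // last Currency seen in the current Rate
    private Double value;      // last Value seen in the current Rate

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        text.setLength(0);     // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("Currency")) {
            currency = text.toString().trim();
        } else if (qName.equals("Value")) {
            value = Double.valueOf(text.toString().trim());
        } else if (qName.equals("Rate")) {
            if (currency != null && value != null) {
                rates.put(currency, value);
            }
            currency = null;
            value = null;
        }
        text.setLength(0);
    }

    /** Parses a rates document and returns currency code to rate. */
    static Map<String, Double> parseRates(InputSource source) throws Exception {
        ExchangeSax2 handler = new ExchangeSax2();
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return handler.rates;
    }

    // Usage: java ExchangeSax2 <rates file> <amount>
    public static void main(String[] args) throws Exception {
        double amount = Double.parseDouble(args[1]);
        InputSource source =
            new InputSource(new File(args[0]).toURI().toString());
        for (Map.Entry<String, Double> e : parseRates(source).entrySet()) {
            System.out.println(e.getKey() + ": " + amount * e.getValue());
        }
    }
}
```

Because every MinXML document is also an XML document, the same input file should work with both versions; the trade-off is the parser size discussed above.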
> >
> > Build and Run the Project
> >
> > To run this project, you must download Min from Docuverse.com and the
> > SAX1 interface from Megginson.com. Make sure both are in your classpath
> > to compile and run the application.
> >
> > You must provide two parameters on the command line: the first is
> > the MinXML file with exchange rates, and the second is the amount you want
> > to use. Assuming you save Listing 1 as rates.min, the command might
> > look like:
> >
> > java Exchange rates.min 40
> >
> > Conclusion
> >
> > The jury is still out on MinXML. Some longtime XML experts think it is a
> > futile attempt, because it is too simple. However, if you keep in mind
> > the original goal (a small subset for specific applications), you could
> > benefit from a smaller, faster parser.
> >
> > About the Author
> >
> > Benoit Marchal is a software engineer and writer who has been working
> > extensively in Java and XML. He is the author of the book XML by
> > Example. A second XML book is due in September. His web page is at
> > http://www.pineapplesoft.com/
>
> ----
> Adam@4K-Associates.Com
>
> Bitmaps aren't scalable, right? You can't do anything with them.
> They're nonsensical. Bitmaps are not graphics; they're the display
> result of graphics. You can't express graphics in dots, and a bitmap
> does not have a metric. It has no meaning.
> -- Robert Cailliau
>



This archive was generated by hypermail 2b29 : Wed Aug 16 2000 - 08:08:33 PDT