[Earthweb] Simple XML = nothing but angle brackets, baby.

From: Adam Rifkin -4K (adam@XeNT.ics.uci.edu)
Date: Tue Aug 15 2000 - 23:18:27 PDT


Is it true that most developers really only want this?
> Yet, in practice, few developers use or need all the features of XML.
> Most developers are comfortable with a small subset and most documents
> look like Listing 1: a list of data enclosed in tags and angle brackets.

[Insert random nostalgia about YML discussions here.

   http://www.cs.caltech.edu/~adam/papers/xml/yml.html ]

Found the following article at:

   http://gamelan.earthweb.com/earthweb/cda/dlink.resource-jhtml.72.1082.|repository||softwaredev|content|article|2000|08|10|SDmarchalmini|SDmarchalmini~xml.41.jhtml?cda=true

> Minimal XML and Java
>
> If Java is the programming language of the Internet, XML is the data
> format. It is a simple markup language, but unlike HTML, it is
> extensible -- meaning you can use any data model. Here, we look at a
> handy, small-scale version of XML.
>
> Published August 10, 2000
> By Benoit Marchal
>
> With XML, the W3C's goals were to develop a simple markup language to
> fit a wide range of applications. In many respects, the W3C succeeded.
> XML is powerful, yet not so complicated that one cannot learn it in a
> few days.
>
> Reality Check
>
> Yet, in practice, few developers use or need all the features of XML.
> Most developers are comfortable with a small subset and most documents
> look like Listing 1: a list of data enclosed in tags and angle brackets.
>
> <Exchange>
>    <Rate>
>       <Currency>BEF</Currency>
>       <Value>41.97</Value>
>    </Rate>
>    <Rate>
>       <Currency>CAD</Currency>
>       <Value>1.46</Value>
>    </Rate>
>    <Rate>
>       <Currency>GBP</Currency>
>       <Value>0.66</Value>
>    </Rate>
>    <Rate>
>       <Currency>FRF</Currency>
>       <Value>6.83</Value>
>    </Rate>
>    <Rate>
>       <Currency>DEM</Currency>
>       <Value>2.04</Value>
>    </Rate>
> </Exchange>
> Listing 1: rates.min.
>
> In response to this situation, a group of developers has formed the
> SML-DEV mailing list to develop the Simple Markup Language:
>
> http://www.egroups.com/group/sml-dev
>
> So far their efforts have produced two documents:
>
> Common XML - http://www.docuverse.com/smldev/commonxml.html
> Minimal XML - http://www.docuverse.com/smldev/minxml.html
>
> Both are useful to Java developers, although for different reasons.
>
> Common XML defines a safe subset of XML. Indeed, even though XML is a
> standard, there are small differences in how vendors implement it. To
> avoid being burned by the differences, Common XML defines a subset that
> works reliably with every parser.
>
> Minimal XML
> The most interesting effort, however, is the Minimal XML (or MinXML)
> language.
> To understand MinXML, you must remember that XML has two main
> applications: publishing and data exchange. Publishers use XML to manage
> large Web sites or other documents. eCommerce vendors use XML for data
> exchange, synchronization, or enterprise application integration (eAI).
>
> Some constructs in XML exist solely for publishing applications. Mixed
> content is a prime example. Publishing applications often need to mix
> text and tags, such as:
>
> <para>Visit <xlink:simple
> href="http://www.gamelan.com">Gamelan</xlink:simple>.</para>
>
> However, mixed content is a distraction for data applications. Indeed,
> XML parsers pass indentation and spaces on to applications, which then
> have to ignore them.
> In the following example, the XML parser reports three spaces before the
> Currency and Value elements, but the application probably would ignore
> the spaces:
>
> <Rate>
>    <Currency>BEF</Currency>
>    <Value>41.97</Value>
> </Rate>
>
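A minimal sketch of the whitespace point above, assuming a JDK that bundles a JAXP SAX parser. The class name WhitespaceEvents and its count() helper are made up for illustration; they are not part of the article or of any MinXML parser. It counts the characters() callbacks that carry nothing but whitespace, which is the indentation a data application would normally throw away.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Counts the characters() callbacks that contain only whitespace. */
final class WhitespaceEvents extends DefaultHandler {
    private int whitespaceOnly = 0;

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may split text into several calls; each call is checked on its own.
        if (length > 0 && new String(ch, start, length).trim().isEmpty()) {
            whitespaceOnly++;
        }
    }

    /** Parses the XML text and returns the number of whitespace-only chunks reported. */
    static int count(String xml) {
        WhitespaceEvents handler = new WhitespaceEvents();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return handler.whitespaceOnly;
    }

    public static void main(String[] args) {
        String indented =
            "<Rate>\n   <Currency>BEF</Currency>\n   <Value>41.97</Value>\n</Rate>";
        String compact = "<Rate><Currency>BEF</Currency><Value>41.97</Value></Rate>";
        System.out.println("indented: " + count(indented));
        System.out.println("compact:  " + count(compact));
    }
}
```

How many chunks the indented document produces depends on how the parser splits the text, so only "some" versus "none" is meaningful here.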
> A Practical Subset
>
> With MinXML, SML-DEV proposes to further simplify XML for data
> applications by removing those options that were introduced for
> publishing.
> This subset is particularly interesting for Java developers for at least
> two reasons:
>
> * Most applications that use Java and XML fall into the eCommerce or eAI
> range, so they will benefit from the tight focus on a practical subset.
>
> * MinXML is easier to implement. As we will see, MinXML parsers are
> dramatically smaller and sometimes faster than regular XML parsers.
>
> A word of warning: Before you decide that MinXML is the right solution
> for you, you must realize that it does not cover as many applications as
> plain XML. MinXML is heavily biased towards some data applications. If
> you are in eCommerce, it might be right for you. If you are in
> publishing, you need the full XML feature set. Also, remember that
> MinXML is a subset of XML. Every MinXML document is also an XML
> document, so you could use a regular XML parser but limit yourself to
> the MinXML subset.
>
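One way to "limit yourself to the subset" with a regular parser is to check documents as they come in. The sketch below only looks for mixed content, the one excluded construct the article names; the full list of what MinXML leaves out is in the Minimal XML document linked above and is not reproduced here. MixedContentCheck and hasMixedContent() are hypothetical names, and the parser is whatever JAXP SAX parser the JDK provides.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Flags documents in which an element holds both text and child elements. */
final class MixedContentCheck extends DefaultHandler {
    /** What has been seen directly inside one open element so far. */
    private static final class Frame {
        boolean hasText;
        boolean hasChild;
    }

    private final Deque<Frame> open = new ArrayDeque<>();
    private boolean mixed = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Frame parent = open.peek();              // null while reading the root element
        if (parent != null) {
            parent.hasChild = true;
            if (parent.hasText) mixed = true;    // text came first, then a child
        }
        open.push(new Frame());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Whitespace-only chunks are indentation, not content.
        if (new String(ch, start, length).trim().isEmpty()) return;
        Frame current = open.peek();
        if (current == null) return;
        current.hasText = true;
        if (current.hasChild) mixed = true;      // a child came first, then text
    }

    /** Returns true if any element in the document has mixed content. */
    static boolean hasMixedContent(String xml) {
        MixedContentCheck handler = new MixedContentCheck();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return handler.mixed;
    }

    public static void main(String[] args) {
        // <a> stands in for the article's xlink:simple element.
        System.out.println(hasMixedContent("<para>Visit <a>Gamelan</a>.</para>"));
        System.out.println(hasMixedContent(
            "<Rate>\n   <Currency>BEF</Currency>\n   <Value>41.97</Value>\n</Rate>"));
    }
}
```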
> MinXML Parsers
>
> XML parsers have put on weight recently. Although James Clark's XP is a
> light 162 Kb, Apache's Xerces now weighs more than 1 Mb, and it keeps on
> growing. For some applications, size matters. In this respect, MinXML
> parsers shine, as they are typically less than 20 Kb.
>
> A simpler markup language means that the parser has less work, so it is
> faster, too. Early measurements show that MinXML parsers beat regular
> XML parsers hands down.
>
> As of this writing, four MinXML parsers are available:
>
> Min is the closest thing to an official MinXML parser. It supports the
> familiar SAX API and takes less than 20 Kb.
>
> John Wilson's Minimal XML parser is a tiny parser originally designed
> for embedded computing.
>
> Shawn Silverman's Minimal XML parser in Java fits in just one Java
> class.
>
> JaSMin by Sjoerd Visscher is a 50-line parser written in JavaScript.
>
> Min
>
> Using Min, the tiny MinXML parser, is no different from using a regular
> SAX parser, as Listing 2 demonstrates. Start by writing an event
> handler: Exchange inherits from HandlerBase and overrides the
> startElement(), endElement() and characters() methods. The parser
> calls them when it encounters a start tag, end tag, or character
> content, respectively.
>
> This event handler parses documents like Listing 1 and fills a
> java.util.Dictionary with exchange rates. The major issue when writing a
> SAX event handler is to track where the application is in the document.
> There are no direct relationships among the events, and the parser
> provides no structure. Fortunately, since an XML document is a tree, it
> is easy to use a java.util.Stack to track the position in the document.
>
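The stack idea on its own, separated from the exchange-rate logic, might look like the sketch below. It is written against the SAX2 DefaultHandler and the JDK's bundled parser rather than Min, and PathTracker and leafValues() are names invented here. The stack holds the names of the elements that are currently open, so at any event the handler can tell where in the tree it is.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Records "path = text" for every element that contains text. */
final class PathTracker extends DefaultHandler {
    private final Deque<String> path = new ArrayDeque<>();   // open elements, root first
    private final StringBuilder text = new StringBuilder();  // text since the last tag
    private final List<String> lines = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        path.addLast(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            // The stack content is the position of this text in the tree.
            lines.add(String.join("/", path) + " = " + value);
        }
        text.setLength(0);
        path.removeLast();
    }

    /** Parses the XML text and returns one line per text-bearing element. */
    static List<String> leafValues(String xml) {
        PathTracker handler = new PathTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        String xml = "<Exchange>\n   <Rate>\n      <Currency>BEF</Currency>\n"
                   + "      <Value>41.97</Value>\n   </Rate>\n</Exchange>";
        for (String line : leafValues(xml)) {
            System.out.println(line);
        }
    }
}
```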
> The main() method creates the parser, registers the event handler, and
> prints various exchange rates through the compute() method.
>
> import java.io.*;
> import java.util.*;
> import org.xml.sax.*;
> import org.xml.sax.helpers.*;
> import com.docuverse.min.util.*;
>
> public class Exchange
>    extends HandlerBase
> {
>    // Holds the text buffers and parsed values of the Rate being read.
>    protected Stack stack = new Stack();
>    // Maps a currency code (String) to its rate (Double).
>    protected Dictionary rates = new Hashtable();
>
>    public void startElement(String name,
>                             AttributeList attributes)
>    {
>       // Currency and Value each get a fresh buffer for their text.
>       if(name.equals("Currency"))
>          stack.push(new StringBuffer());
>       else if(name.equals("Value"))
>          stack.push(new StringBuffer());
>    }
>
>    public void endElement(String name)
>    {
>       if(name.equals("Currency"))
>       {
>          // Replace the buffer with the finished currency code.
>          StringBuffer c = (StringBuffer)stack.pop();
>          stack.push(c.toString());
>       }
>       else if(name.equals("Value"))
>       {
>          // Replace the buffer with the parsed rate.
>          StringBuffer v = (StringBuffer)stack.pop();
>          stack.push(Double.valueOf(v.toString()));
>       }
>       else if(name.equals("Rate"))
>       {
>          // Value was pushed last, so it comes off the stack first.
>          Double v = (Double)stack.pop();
>          String c = (String)stack.pop();
>          rates.put(c,v);
>       }
>    }
>
>    public void characters(char ch[], int start, int length)
>    {
>       // peek() throws EmptyStackException on an empty stack, which
>       // happens if the parser reports whitespace outside Currency and
>       // Value (for example between <Exchange> and <Rate>).
>       if(stack.empty())
>          return;
>       Object o = stack.peek();
>       if(o instanceof StringBuffer)
>          ((StringBuffer)o).append(ch,start,length);
>    }
>
>    // Writes the amount converted at every known rate.
>    public void compute(Writer writer,
>                        double amount)
>       throws IOException
>    {
>       Enumeration keys = rates.keys();
>       while(keys.hasMoreElements())
>       {
>          String currency = (String)keys.nextElement();
>          double rate =
>             ((Double)rates.get(currency)).doubleValue();
>          writer.write(currency);
>          writer.write(": ");
>          writer.write(Double.toString(amount * rate));
>          writer.write('\n');
>       }
>       writer.flush();
>    }
>
>    // args[0] is the MinXML file, args[1] the amount to convert.
>    static public void main(String[] args)
>       throws IOException, SAXException,
>              InstantiationException, ClassNotFoundException,
>              IllegalAccessException
>    {
>       Parser parser =
>          ParserFactory.makeParser("com.docuverse.min.Parser");
>       Exchange handler = new Exchange();
>       parser.setDocumentHandler(handler);
>       parser.parse(args[0]);
>       handler.compute(new OutputStreamWriter(System.out),
>                       Double.valueOf(args[1]).doubleValue());
>    }
> }
>
> Listing 2: Exchange.java.
>
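HandlerBase, Parser, AttributeList and ParserFactory are the SAX1 names, and SAX2 deprecates them. Since every MinXML document is also an XML document, the same handler can be tried without Min. The following is a rough SAX2 port, assuming the JAXP parser that ships with the JDK. ExchangeSax2 and parseRates() are hypothetical names, and this is an adaptation of Listing 2, not code from the article.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** SAX2 variant of the Exchange handler: fills a map of currency -> rate. */
final class ExchangeSax2 extends DefaultHandler {
    private final Deque<Object> stack = new ArrayDeque<>();
    private final Map<String, Double> rates = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("Currency".equals(qName) || "Value".equals(qName)) {
            stack.push(new StringBuilder());     // buffer for the element's text
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Currency".equals(qName)) {
            stack.push(stack.pop().toString().trim());
        } else if ("Value".equals(qName)) {
            stack.push(Double.valueOf(stack.pop().toString().trim()));
        } else if ("Rate".equals(qName)) {
            Double value = (Double) stack.pop(); // Value was pushed last
            String currency = (String) stack.pop();
            rates.put(currency, value);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // ArrayDeque.peek() returns null on an empty stack, so indentation
        // outside Currency and Value is skipped without an exception.
        Object top = stack.peek();
        if (top instanceof StringBuilder) {
            ((StringBuilder) top).append(ch, start, length);
        }
    }

    /** Parses a rates document and returns the rates in document order. */
    static Map<String, Double> parseRates(String xml) {
        ExchangeSax2 handler = new ExchangeSax2();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return handler.rates;
    }

    public static void main(String[] args) {
        String xml = "<Exchange>\n"
                   + "   <Rate><Currency>BEF</Currency><Value>41.97</Value></Rate>\n"
                   + "   <Rate><Currency>CAD</Currency><Value>1.46</Value></Rate>\n"
                   + "</Exchange>";
        double amount = 40;
        for (Map.Entry<String, Double> rate : parseRates(xml).entrySet()) {
            System.out.println(rate.getKey() + ": " + amount * rate.getValue());
        }
    }
}
```

This says nothing about the size or speed claims made for Min; it only shows that the handler logic carries over to a regular parser.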
> Build and Run the Project
>
> To run this project, you must download Min from Docuverse.com and the
> SAX1 interface from Megginson.com. Make sure both are in your classpath
> to compile and run the application.
>
> You must provide two parameters on the command line: the first is the
> MinXML file with exchange rates, and the second is the amount you want
> to convert. Assuming you save Listing 1 as exchange.min, the command
> might look like:
>
> java Exchange exchange.min 40
>
> Conclusion
>
> The jury is still out on MinXML. Some longtime XML experts think it is a
> futile attempt, because it is too simple. However, if you keep in mind
> the original goal (a small subset for specific applications), you could
> benefit from a smaller, faster parser.
>
> About the Author
>
> Benoit Marchal is a software engineer and writer who has been working
> extensively in Java and XML. He is the author of the book XML by
> Example. A second XML book is due in September. His web page is at
> http://www.pineapplesoft.com/

----
Adam@4K-Associates.Com

Bitmaps aren't scalable, right? You can't do anything with them. They're nonsensical. Bitmaps are not graphics; they're the display result of graphics. You can't express graphics in dots, and a bitmap does not have a metric. It has no meaning. -- Robert Cailliau

