XML Lite: Lightweight XML Parser for J2ME
What?   XML Lite is a quick and dirty non-validating XML parser targeting the J2ME platform. Accordingly, one of the project's primary goals is small size. Small as it is, it features both a pull parser and a model parser. The model parser is a separate class that depends on the pull parser, and it can be removed if not needed.

What's New?  
  • A very incomplete (j)unit test for XMLPullParser (XMLPullParserTest).
  • A couple of bug fixes to the comment parsing and PCDATA parsing found by the unit test ;).
  • An XML-RPC parser (that depends only on XMLPullParser). This parser is targeted at J2ME and does NOT parse doubles; it treats them as strings instead. It likewise treats date and base64 data as strings. It also cannot convert a Double object into the proper XML, as Double does not exist in the J2ME world.
What Doesn't It Do?  
  • It doesn't know about the DTD and ignores it.
  • It doesn't use any of the declarations (but can pass them on to the caller).
  • Because of that, it does not do any entity substitutions.
  • Nor does it validate attributes or elements.
The Goods   Documentation: Zipped archive, or browse online
Download: Zipped archive, or individually
License: MIT License (from opensource.org)

Facts   Platforms: Java, Java 2 MicroEdition
Language: Java
Tools: Java 2 SDK, Sun ONE Studio
Current Version: 2003/04/04
Status: active

Alternatives   Some alternatives for parsing XML on J2ME (the following are all model parsers): It may be wise to look at the alternatives; they've been around much longer, so they're probably more mature and stable. My test suite is not done yet.

Size Comparison   Since the primary hardware running these parsers will be mass-produced mobile phones from margin-conscious, memory-frugal manufacturers, size is a large concern. Users will appreciate a small parser too, since mobile bandwidth costs them both time and money.

The following is a table showing the total sizes of the parsers with model parsing support (uncompressed classes generated by javac 1.4.1).

XML Lite    NanoXML    Xparse-J    Al Sutton's
10.7 KB     18.8 KB    11.3 KB     11.6 KB

Some notes:
  • XML Lite can be used without the model parser, yielding a total size of 5.94 KB.
  • NanoXML's size does not include the two Writer classes it comes with.
  • Al's includes the additional classes needed to perform model parsing. Its size without the model parsing classes is 4.56 KB.
  • There seem to be some fluctuations in size depending on whether I build the classes for J2ME or not. Suffice it to say that NanoXML is larger, while the others are similarly sized.
Performance Comparison   Speed is always a good thing.

Benchmark description: The parsers were tested against four XML-formatted strings. The timings are in milliseconds and represent the average amount of time (over five separate tests) needed to parse each string 5000 times. The parsers were bootstrapped once, untimed, for the benefit of JIT optimizations. Garbage collection was "recommended" between each test of 5000 parses. Tests are now completely independent to prevent any sort of cross-contamination. The code for the benchmark is here.
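
For readers who want to run their own numbers, the procedure described above can be sketched roughly as follows. This is not the actual XMLBench code: the ParseTask interface and the XmlBenchSketch class are hypothetical names made up for illustration, and the sketch assumes that "bootstrapped once" means one full untimed pass of 5000 parses. A real parser would be plugged in by implementing ParseTask; the example only measures the Reader-creation baseline on J2SE.

```java
import java.io.StringReader;

public class XmlBenchSketch {

    /** Hypothetical stand-in for "parse this document once" with some parser. */
    interface ParseTask {
        void parseOnce(String xml) throws Exception;
    }

    static final int ITERATIONS = 5000; // parses per timed test
    static final int RUNS = 5;          // timed tests that get averaged

    /** Times one test of ITERATIONS parses, in milliseconds. */
    static long timeRun(ParseTask task, String xml) throws Exception {
        long start = System.currentTimeMillis();
        for (int i = 0; i < ITERATIONS; i++) {
            task.parseOnce(xml);
        }
        return System.currentTimeMillis() - start;
    }

    /** Arithmetic mean of the collected timings. */
    static double average(long[] values) {
        long sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        return (double) sum / values.length;
    }

    /** One untimed warm-up pass, then RUNS timed passes with a GC hint before each. */
    static double benchmark(ParseTask task, String xml) throws Exception {
        timeRun(task, xml); // untimed warm-up (assumption: one full pass)
        long[] timings = new long[RUNS];
        for (int r = 0; r < RUNS; r++) {
            System.gc(); // only a recommendation; the VM may ignore it
            timings[r] = timeRun(task, xml);
        }
        return average(timings);
    }

    public static void main(String[] args) throws Exception {
        // Baseline only: create a Reader over the string, parse nothing.
        ParseTask readerSetup = new ParseTask() {
            public void parseOnce(String xml) {
                new StringReader(xml);
            }
        };
        double ms = benchmark(readerSetup, "<b>Hello World!</b>");
        System.out.println("Reader setup (small): " + ms + " ms");
    }
}
```

Timings from a sketch like this depend heavily on the machine and VM, so they are only meaningful relative to each other within one run.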

Benchmark environment: Pentium 2 - 333MHz with 256 MB of memory, running Windows 2000 and Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode).

Description of test strings (the actual string can be found in the XMLBench code):
  • small - A very simple XML string ("<b>Hello World!</b>"). 18 characters long.
  • SOAP - Example 7 from the official SOAP specifications. 522 characters long.
  • deep - A simple XML string with tags that nest 10 deep. 592 characters long.
  • RSS - An example RSS file borrowed from the java.sun.com article, "Parsing XML in J2ME". 2694 characters long.
Name                  small (ms)   SOAP (ms)   deep (ms)   RSS (ms)
Reader setup             1,866.8     2,259.4     2,337.2    4,224.0
XML Lite (20030304)      3,316.8     8,838.8    10,517.2   30,562.0
NanoXML                  5,377.4    11,550.4    18,058.0   31,034.4
Xparse-J                 1,458.2     8,065.2    11,406.4   30,125.2
Al Sutton's              3,613.2     9,610.0     9,014.8   32,554.6

Some notes:
  • This iteration of the benchmark moves XML Lite to a solid second place. The results still cannot be compared with the original results: I don't know why, but even the baseline numbers (Reader setup) are much higher. I've only been able to reproduce the original set of numbers at my girlfriend's home (again, I don't know why). It would be best to do your own benchmarking, I guess. Take mine with a grain of salt.
  • Reader setup does not parse the XML string. It simply creates a Reader object given a string. All parsers with the exception of Xparse-J used Readers as inputs. The results for Reader setup represent the overhead necessary to create the Readers given the string inputs.
  • NanoXML takes alternative types of input as well as Readers.
  • Xparse-J only accepts Strings, and thus does not have the overhead of constructing a Reader given a string. However if the input is not available as a string (say from an HttpConnection InputStream) additional overhead may be incurred to construct a string.
  • The results of the parsing are not checked; they are assumed to be correct.
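
To make the two kinds of input overhead in the notes above concrete, here is a minimal sketch. It assumes a J2SE environment where java.io.StringReader is available (as in the benchmark environment); the class and method names (InputSetupSketch, readerFor, readFully) are made up for illustration and are not taken from the benchmark code. readerFor shows what the Reader setup row amounts to, and readFully shows the extra copying a String-only parser needs when its input arrives as an InputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;

public class InputSetupSketch {

    /** Reader-based parsers: wrap the in-memory string in a Reader. */
    static Reader readerFor(String xml) {
        return new StringReader(xml);
    }

    /** String-only parsers: drain the stream into a String before parsing. */
    static String readFully(InputStream in, String charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        String xml = "<b>Hello World!</b>";

        // Input already available as a String: only the Reader wrapper is needed.
        Reader reader = readerFor(xml);
        System.out.println((char) reader.read()); // first character of the document

        // Input available only as a stream (a ByteArrayInputStream stands in
        // for a network stream here): it must be copied into a String first.
        InputStream stream = new ByteArrayInputStream(xml.getBytes("UTF-8"));
        String copy = readFully(stream, "UTF-8");
        System.out.println(copy.equals(xml));
    }
}
```

Which overhead matters in practice depends on where the XML comes from: data already held as a String favors a String-based parser, while network input favors a Reader-based one.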
Memory Use Comparison   Benchmark description: A separate MIDlet was created for each parser and test (20 total). Each was executed in the emulator, and the byte count reported on the emulator's "dynamic objects allocated" line was recorded. The actual tests consisted of parsing the same strings as above five times (because I was impatient). The test code can be found here.

Name                  small (bytes)   SOAP (bytes)   deep (bytes)   RSS (bytes)
Reader setup                 24,696         30,232         30,984        54,120
XML Lite (20030304)          45,248        179,940        219,396       781,528
NanoXML                     217,816        292,788        341,940       605,552
Xparse-J                     53,712        243,100        352,256       942,908
Al Sutton's                  48,028        191,368        175,760       790,304