Apr 17, 2008

XML - Application Programming Interfaces (APIs)

Overview of Java XML APIs

The most important decision you'll make at the start of an XML project is the application programming interface (API) you'll use. Many APIs are implemented by multiple vendors, so if the specific parser gives you trouble you can swap in an alternative, often without even recompiling your code. However, if you choose the wrong API, changing to a different one may well involve redesigning and rebuilding the entire application from scratch. Of course, as Fred Brooks taught us, “In most projects, the first system built is barely usable. It may be too slow, too big, awkward to use, or all three. There is no alternative but to start again, smarting but smarter, and build a redesigned version in which these problems are solved.… Hence plan to throw one away; you will, anyhow.” [1] Still, it is much easier to change parsers than APIs.

There are two major standard APIs for processing XML documents with Java, the Simple API for XML (SAX) and the Document Object Model (DOM), each of which comes in several versions. In addition, there are a host of other, somewhat idiosyncratic APIs including JDOM, dom4j, ElectricXML, and XMLPULL. Finally, each specific parser generally has a native API that it exposes below the level of the standard APIs. For instance, the Xerces parser has the Xerces Native Interface (XNI). However, picking such an API limits your choice of parser, and indeed may even tie you to one particular version of the parser, since parser vendors tend not to worry a great deal about maintaining native compatibility between releases. Each of these APIs has its own strengths and weaknesses.

SAX
SAX, the Simple API for XML, is the gold standard of XML APIs. It is the most complete and correct by far. Given a fully validating parser that supports all its optional features, there is very little you can’t do with it. It has one or two holes, but they're really off in the weeds of the XML specifications, and you have to look pretty hard to find them. SAX is an event-driven API. The SAX classes and interfaces model the parser, the stream from which the document is read, and the client application receiving data from the parser. However, no class models the XML document itself. Instead, the parser feeds content to the client application through a callback interface, much like the ones used in Swing and the AWT.
This makes SAX very fast and very memory efficient (since it doesn’t have to store the entire document in memory). However, SAX programs can be harder to design and code because you normally need to develop your own data structures to hold the content from the document. SAX works best when your processing is fairly local; that is, when all the information you need to use is close together in the document. For example, you might process one element at a time. Applications that require access to the entire document at once in order to take useful action would be better served by one of the tree-based APIs like DOM or JDOM. Finally, because SAX is so efficient, it’s the only real choice for truly huge XML documents. Of course, “truly huge” has to be defined relative to available memory. However, if the documents you're processing are in the gigabyte range, you really have no choice but to use SAX.
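To make the callback model concrete, here is a minimal sketch of a SAX client. The class name SaxTitleCollector, the helper titlesOf, and the sample <title> vocabulary are all hypothetical, chosen only for illustration. The parser is obtained through the JAXP factory and is not namespace-aware by default, so the handler matches on the qualified name. Note that the handler has to build its own data structure (a list of strings) because SAX keeps no model of the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects the text of every <title> element.
class SaxTitleCollector extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a <title>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver text in several chunks, so append rather than assign.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(current.toString());
            current = null;
        }
    }

    // Parses the given XML text and returns the titles in document order.
    static List<String> titlesOf(String xml) {
        try {
            SaxTitleCollector handler = new SaxTitleCollector();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>Dune</title></book>"
                   + "<book><title>Emma</title></book></books>";
        System.out.println(titlesOf(xml)); // prints [Dune, Emma]
    }
}
```

Only the text of the element currently being read is held in memory at any moment, which is why this style scales to very large inputs.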

DOM
DOM, the Document Object Model, is a fairly complex API that models an XML document as a tree. Unlike SAX, DOM is a read-write API. It can both parse existing XML documents and create new ones. Each XML document is represented as a Document object. Documents are searched, queried, and updated by invoking methods on this Document object and the objects it contains. This makes DOM much more convenient when random access to widely separated parts of the original document is required. However, it is quite memory intensive compared to SAX, and not nearly as well suited to streaming applications.
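As a rough illustration of the read-write, tree-based style, the following sketch parses a document into a Document object, adds a new element to it, and then queries the updated tree. The class DomBookEditor, its methods, and the <book>/<title> vocabulary are hypothetical. It also assumes a DOM Level 3 implementation (Java 5 or later) for getTextContent and setTextContent.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class: reads, updates, and queries a DOM tree.
class DomBookEditor {

    // Reads the whole document into memory as a tree.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // Appends <book><title>...</title></book> to the root element
    // and returns the number of <book> elements now in the tree.
    static int addBook(Document doc, String title) {
        Element book = doc.createElement("book");
        Element titleElement = doc.createElement("title");
        titleElement.setTextContent(title);
        book.appendChild(titleElement);
        doc.getDocumentElement().appendChild(book);
        return doc.getElementsByTagName("book").getLength();
    }

    // Random access: jump straight to the last <title> anywhere in the tree.
    // Assumes the document contains at least one <title>.
    static String lastTitle(Document doc) {
        NodeList titles = doc.getElementsByTagName("title");
        return titles.item(titles.getLength() - 1).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse("<books><book><title>Dune</title></book></books>");
        System.out.println(addBook(doc, "Emma")); // prints 2
        System.out.println(lastTitle(doc));       // prints Emma
    }
}
```

Everything here goes through interfaces and factory methods on the Document object, which is the style JDOM later set out to simplify.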

JAXP
JAXP, the Java API for XML Processing, bundles SAX and DOM together along with some factory classes and the TrAX XSLT API. (TrAX is not a general purpose XML API like SAX and DOM. I'll get to it in Chapter 17.) It is a standard part of Java 1.4 and later. However, it is not really a different API. When starting a new program, you ask yourself whether you should choose SAX or DOM. You don’t ask yourself whether you should use SAX or JAXP, or DOM or JAXP. SAX and DOM are part of JAXP.
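A small sketch may help show what JAXP actually contributes: the factory classes that locate a parser at run time, plus TrAX. The class JaxpRoundTrip and its roundTrip method are hypothetical names. The code obtains a DOM parser through DocumentBuilderFactory and then uses a TrAX identity transformer (a Transformer created with no stylesheet) to write the tree back out as text. The exact serialized form can vary slightly between implementations, so no output is claimed here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class: DOM parsing and TrAX serialization via JAXP factories.
class JaxpRoundTrip {

    // Parses XML text into a DOM tree, then serializes the tree back to text.
    static String roundTrip(String xml) {
        try {
            // The factory picks whichever DOM parser is configured for this JVM.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // newTransformer() with no arguments yields an identity transformation.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // The implementation class name depends on the parser installed.
        System.out.println(DocumentBuilderFactory.newInstance().getClass().getName());
        System.out.println(roundTrip("<a><b>hi</b></a>"));
    }
}
```

Because the code names only the javax.xml factories and never a vendor class, a different parser can be swapped in through configuration, for example the javax.xml.parsers.DocumentBuilderFactory system property, without recompiling.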

JDOM
JDOM is a Java-native tree-based API that attempts to remove a lot of DOM’s ugliness. The JDOM mission statement is, “There is no compelling reason for a Java API to manipulate XML to be complex, tricky, unintuitive, or a pain in the neck,” and for the most part JDOM delivers. Like DOM, JDOM reads the entire document into memory before it begins to work on it, and the broad outline of JDOM programs tends to be the same as for DOM programs. However, the low-level code is a lot less tricky and ugly than the DOM equivalent. JDOM uses concrete classes and constructors rather than interfaces and factory methods. It uses standard Java coding conventions, methods, and classes throughout. JDOM programs often flow a lot more naturally than the equivalent DOM programs.
I think JDOM often does make the easy problems easier, but in my experience JDOM also makes the hard problems harder. Its design shows a very solid understanding of Java, but the XML side of the equation feels much rougher. It’s missing some crucial pieces like a common node interface or superclass for navigation. JDOM works well (and much better than DOM) on fairly simple documents with no recursion, limited mixed content, and a well-known vocabulary. It begins to show some weakness when asked to process arbitrary XML. When I need to write programs that operate on any XML document, I tend to find DOM simpler despite its ugliness.


Windows Vista for Starters: The Missing Manual

Table of Contents

Chapter 1: Welcome Center, Desktop, and the Start Menu
Chapter 2: Explorer, Windows, and the Taskbar
Chapter 3: Searching and Organizing Your Files
Chapter 4: Interior Decorating Vista
Chapter 5: Getting Help
Chapter 6: Programs, Documents, and Gadgets
Chapter 7: The Freebie Software
Chapter 8: The Control Panel
Chapter 9: Hooking Up to the Internet
Chapter 10: Internet Security
Chapter 11: Internet Explorer 7
Chapter 12: Windows Mail
Chapter 13: Windows Photo Gallery
Chapter 14: Windows Media Player
Chapter 15: Movie Maker and DVD Maker
Chapter 16: Media Center
Chapter 17: Fax, Print, and Scan
Chapter 18: Hardware
Chapter 19: Laptops, Tablets, and Palmtops
Chapter 20: Maintenance and Speed Tweaks
Chapter 21: The Disk Chapter
Chapter 22: Backups and Troubleshooting
Chapter 23: Accounts (and Logging On)
Chapter 24: Setting Up a Workgroup Network
Chapter 25: Network Domains
Chapter 26: Network Sharing and Collaboration
Chapter 27: Vista by Remote Control


Fast-paced and easy to use, this concise book teaches you the basics of Windows Vista so you can start using this operating system right away. Written by "New York Times" columnist, bestselling author, Emmy-winning CBS News correspondent and Missing Manuals creator David Pogue, the book will help you:
  • Navigate the desktop, including the fast, powerful and fully integrated desktop search function
  • Use the Media Center to record TV and radio, present photos, play music, and record all of these to a DVD
  • Breeze across the Web with the vastly improved Internet Explorer 7 tabbed browser
  • Become familiar with Vista's beefed-up security, and much more

Windows Vista is a vast improvement over its predecessors, with an appealing, glass-like visual overhaul, superior searching and organization tools, a multimedia and collaboration suite, and a massive, top-to-bottom security-shield reconstruction. Every corner of the traditional Windows operating system has been tweaked, overhauled, or replaced entirely.

Aimed at new and experienced computer users alike, Windows Vista for Starters: The Missing Manual is right there when you need it. This jargon-free book explains Vista's features quickly and clearly -- revealing which work well and which don't.
