Java and XML: Solutions to Real-World Problems

Paperback(Third Edition)

$49.99 

Overview

Java and XML, 3rd Edition, shows you how to cut through all the hype about XML and put it to work. It teaches you how to use the APIs, tools, and tricks of XML to build real-world applications. The result is a new approach to managing information that touches everything from configuration files to web sites.

After two chapters on XML basics, including XPath, XSL, DTDs, and XML Schema, the rest of the book focuses on using XML from your Java applications. This third edition of Java and XML covers all major Java XML processing libraries, including full coverage of the SAX, DOM, StAX, JDOM, and dom4j APIs as well as the latest version of the Java API for XML Processing (JAXP) and Java Architecture for XML Binding (JAXB). The chapters on web technology have been entirely rewritten to focus on today's most relevant topics: syndicating content with RSS and creating Web 2.0 applications. You'll learn how to create, read, and modify RSS feeds for syndicated content and use XML to power the next generation of websites with Ajax and Adobe Flash.

Topics include:

  • The basics of XML, including DTDs, namespaces, XML Schema, XPath, and Transformations
  • The SAX API, including all handlers, filters, and writers
  • The DOM API, including DOM Level 2, Level 3, and the DOM HTML module
  • The JDOM API, including the core and a look at XPath support
  • The StAX API, including StAX factories, producing documents and XMLPull
  • Data Binding with JAXB, using the new JAXB 2.0 annotations
  • Web syndication and podcasting with RSS
  • XML on the Presentation Layer, paying attention to Ajax and Flash applications

If you are developing with Java and need to use XML, or think that you will be in the future; if you're involved in the new peer-to-peer movement, messaging, or web services; or if you're developing software for electronic commerce, Java and XML will be an indispensable companion.


Product Details

ISBN-13: 9780596101497
Publisher: O'Reilly Media, Incorporated
Publication date: 12/28/2006
Edition description: Third Edition
Pages: 479
Product dimensions: 7.00(w) x 9.19(h) x 0.85(d)

About the Author

Brett McLaughlin is a bestselling and award-winning non-fiction author. His books on computer programming, home theater, and analysis and design have sold in excess of 100,000 copies. He has been writing, editing, and producing technical books for nearly a decade, and is as comfortable in front of a word processor as he is behind a guitar, chasing his two sons and his daughter around the house, or laughing at reruns of Arrested Development with his wife.

Brett spends most of his time these days on cognitive theory, codifying and expanding on the learning principles that shaped the Head First series into a bestselling phenomenon. He's curious about how humans best learn and why Star Wars was so formulaic and still so successful, and he is adamant that a good video game is the most effective learning paradigm we have.

Justin Edelson is the Vice President of Platform Engineering for MTV Networks. He was the co-author (with Brett McLaughlin) of Java & XML, 3rd Edition, published in December 2006.

Read an Excerpt

Chapter 12: SOAP

In this chapter:
Starting Out
Setting Up
Getting Dirty
Going Further
What's Next?

SOAP is the Simple Object Access Protocol. If you haven't heard of it by now, you've probably been living under a rock somewhere. It's become the newest craze in web programming, and is integral to the web services fanaticism that has taken hold of the latest generation of web development. If you've heard of .NET from Microsoft or the peer-to-peer "revolution," then you've heard about technologies that rely on SOAP (even if you don't know it). There's not one but two SOAP implementations going on over at Apache, and Microsoft has hundreds of pages on their MSDN web site devoted to it (http://msdn.microsoft.com).

In this chapter, I explain what SOAP is, and why it is such an important part of where the web development paradigm is moving. That will help you get the fundamentals down, and prepare you for actually working with a SOAP toolkit. From there, I briefly run over the SOAP projects currently available, and then delve into the Apache implementation. This chapter is not meant to be the complete picture on SOAP; the next chapter fills in lots of gaps. Take this as the first part of a miniseries; many of your questions at the end of this chapter will be answered in the next.

Starting Out

The first thing to do is get an understanding of what SOAP is. You can read through the complete W3C note submission, which is fairly lengthy, at http://www.w3.org/TR/SOAP. When you take away all of the hype, SOAP is just a protocol. It's a simple protocol (to use, not necessarily to write), based on the idea that at some point in a distributed architecture, you'll need to exchange information. Additionally, in a system that is probably overtaxed and process-heavy, this protocol is lightweight, requiring a minimal amount of overhead. Finally, it allows all this to occur over HTTP, which allows you to get around tricky issues like firewalls and keep away from having all sorts of sockets listening on oddly numbered ports. Once you get that down, everything else is just details.

Of course, I'm sure you're here for the details, so I won't leave them out. There are three basic components to the SOAP specification: the SOAP envelope, a set of encoding rules, and a means of interaction between request and response. Begin to think about a SOAP message as an actual letter; you know, those antiquated things in envelopes with postage and an address scrawled across the front? That analogy helps SOAP concepts like "envelope" make a lot more sense. Figure 12-1 seeks to illustrate the SOAP process in terms of this analogy.

With this picture in your head, let's look at the three components of the SOAP specification. I cover each briefly and provide examples that illustrate these concepts more completely. Additionally, it's these three key components that make SOAP so important and valuable. Error handling, support for a variety of encodings, serialization of custom parameters, and the fact that SOAP runs over HTTP make it more attractive in many cases than the other choices for a distributed protocol. Additionally, SOAP provides a high degree of interoperability with other applications, which I delve into more completely in the next chapter. For now, I want to focus on the basic pieces of SOAP.

The Envelope

The SOAP envelope is analogous to the envelope of an actual letter. It supplies information about the message that is being encoded in a SOAP payload, including data relating to the recipient and sender, as well as details about the message itself. For example, the header of the SOAP envelope can specify exactly how a message must be processed. Before an application goes forward with processing a message, the application can determine information about a message, including whether it will even be able to process the message. Unlike standard XML-RPC calls (remember those? XML-RPC messages, encoding, and the rest are all wrapped into a single XML fragment), a SOAP message is actually interpreted in order to determine something about it before processing begins. A typical SOAP message can also include the encoding style, which assists the recipient in interpreting the message. Example 12-1 shows the SOAP envelope, complete with the specified encoding....

...You can see that an encoding is specified within the envelope, allowing an application to determine (using the value of the encodingStyle attribute) whether it can read the incoming message situated within the Body element. Be sure to get the SOAP envelope namespace correct, or SOAP servers that receive your message will trigger version mismatch errors, and you won't be able to interoperate with them.
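To make the namespace and encodingStyle check concrete, here is a minimal sketch using only the JAXP DOM classes that ship with the JDK. It is not Example 12-1 and not part of any SOAP toolkit: the class name `EnvelopeCheck`, the `getPrice` method, and the `urn:example-tickets` namespace are made up for illustration. The two namespace URIs are the ones defined for SOAP 1.1.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class EnvelopeCheck {
    // Namespace URIs defined for SOAP 1.1.
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/";

    // Hypothetical sample message; getPrice and urn:example-tickets are invented.
    static final String SAMPLE =
        "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\""
      + " SOAP-ENV:encodingStyle=\"" + SOAP_ENC + "\">"
      + "<SOAP-ENV:Body>"
      + "<t:getPrice xmlns:t=\"urn:example-tickets\"><seat>A12</seat></t:getPrice>"
      + "</SOAP-ENV:Body>"
      + "</SOAP-ENV:Envelope>";

    /**
     * Returns the encodingStyle declared on the envelope (empty string if absent).
     * Throws IllegalArgumentException if the root element is not a SOAP 1.1 Envelope,
     * which is roughly the situation a server reports as a version mismatch.
     */
    static String encodingStyle(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-based lookups
            DocumentBuilder builder = factory.newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse message", e);
        }
        Element root = doc.getDocumentElement();
        if (!SOAP_ENV.equals(root.getNamespaceURI())
                || !"Envelope".equals(root.getLocalName())) {
            throw new IllegalArgumentException(
                "Not a SOAP 1.1 envelope: " + root.getNodeName());
        }
        return root.getAttributeNS(SOAP_ENV, "encodingStyle");
    }

    public static void main(String[] args) {
        System.out.println(encodingStyle(SAMPLE));
    }
}
```

A receiver could compare the returned value against the encodings it understands before touching the Body element at all.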

Encoding

The second major element that SOAP brings to the table is a simple means of encoding user-defined datatypes. In RPC (and XML-RPC), encoding can only occur for a predefined set of datatypes: those that are supported by whatever XML-RPC toolkit you download. Encoding other types requires modifying the actual RPC server and clients themselves. With SOAP, however, XML schemas can be used to easily specify new datatypes (using the complexType structure discussed way back in Chapter 2), and those new types can be easily represented in XML as part of a SOAP payload. Because of this integration with XML Schema, you can encode any datatype in a SOAP message that you can logically describe in an XML schema.
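As a rough illustration of describing a new datatype with a complexType, the sketch below defines a hypothetical `Ticket` type and checks an XML instance against it. It uses the `javax.xml.validation` API from later JDKs rather than anything in a SOAP toolkit, so treat it as a picture of the idea, not of how a SOAP implementation serializes types. The type name, element names, and class name are all assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class TicketTypeCheck {
    // Hypothetical user-defined type: a Ticket with a seat and a price.
    static final String SCHEMA =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:complexType name=\"Ticket\">"
      + "<xs:sequence>"
      + "<xs:element name=\"seat\" type=\"xs:string\"/>"
      + "<xs:element name=\"price\" type=\"xs:decimal\"/>"
      + "</xs:sequence>"
      + "</xs:complexType>"
      + "<xs:element name=\"ticket\" type=\"Ticket\"/>"
      + "</xs:schema>";

    /** Compiles the schema above; a failure here means the schema itself is broken. */
    static Schema compile() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema did not compile", e);
        }
    }

    /** Returns true if the XML text is a valid instance of the ticket element. */
    static boolean isValid(String xml) {
        try {
            compile().newValidator()
                     .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed, or does not match the Ticket type
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            isValid("<ticket><seat>A12</seat><price>42.50</price></ticket>"));
    }
}
```

Anything that can be described this way can, in principle, travel in a SOAP payload; the toolkit still needs to be told how to map the type to and from a Java class.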

Invocation

The best way to understand how SOAP invocation works is to compare it with something you already know, such as XML-RPC. If you recall, an XML-RPC call would look something like the code fragment shown in Example 12-2....

...I've coded up a simple ticket counter-style application. Now, look at Example 12-3, which shows the same call in SOAP....

...As you can see, the actual invocation itself, represented by the Call object, is resident in memory. It allows you to set the target of the call, the method to invoke, the encoding style, the parameters, and more not shown here. It is more flexible than the XML-RPC methodology, allowing you to explicitly set the various parameters that are determined implicitly in XML-RPC. You'll see quite a bit more about this invocation process in the rest of the chapter, including how SOAP provides fault responses, an error hierarchy, and of course the returned results from the call.
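To show what "explicitly set" means in practice, here is a small stand-in for an in-memory call object, written against the JDK's DOM API only. It is not the Apache SOAP `Call` class and not Example 12-3: `SketchCall`, its method names, and the ticket-counter values are hypothetical. It simply gathers the target, method, encoding style, and parameters, then builds the corresponding SOAP 1.1 envelope.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SketchCall {
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/";

    private String targetObjectUri;
    private String methodName;
    private String encodingStyleUri = SOAP_ENC; // explicit, but with a default
    private final Map<String, String> params = new LinkedHashMap<>();

    SketchCall target(String uri) { this.targetObjectUri = uri; return this; }
    SketchCall method(String name) { this.methodName = name; return this; }
    SketchCall encodingStyle(String uri) { this.encodingStyleUri = uri; return this; }
    SketchCall param(String name, String value) { params.put(name, value); return this; }

    /** Builds the envelope this call would send; nothing goes over the network. */
    Document toEnvelope() {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element envelope = doc.createElementNS(SOAP_ENV, "SOAP-ENV:Envelope");
        envelope.setAttributeNS(SOAP_ENV, "SOAP-ENV:encodingStyle", encodingStyleUri);
        doc.appendChild(envelope);

        Element body = doc.createElementNS(SOAP_ENV, "SOAP-ENV:Body");
        envelope.appendChild(body);

        // The method element lives in the namespace of the target object.
        Element call = doc.createElementNS(targetObjectUri, "ns1:" + methodName);
        body.appendChild(call);

        // Parameters are sent as child elements, in the order they were added.
        for (Map.Entry<String, String> p : params.entrySet()) {
            Element param = doc.createElement(p.getKey());
            param.setTextContent(p.getValue());
            call.appendChild(param);
        }
        return doc;
    }
}
```

A caller would write something like `new SketchCall().target("urn:example-tickets").method("getPrice").param("seat", "A12").toEnvelope()`; a real toolkit adds the transport, type mapping, and fault handling on top of this.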

With that brief introduction, you probably know enough to want to get on with the fun stuff. Let me show you the SOAP implementation I'm going to use, explain why I made that choice, and get to some code.

Setting Up

Now that you have some basic concepts down, it's time to get going on the fun part, the code. You need a project or product for use, which turns out to be simpler to find than you might think. If you want a Java-based project providing SOAP capability, you don't have to look that far. There are two groups of products out there: commercial and free. As in most of the rest of the book, I'm steering away from covering commercial products. This isn't because they are bad (on the contrary, some are wonderful); it's because I want every reader of this book to be able to use every example. That calls for accessibility, something commercial products don't provide; you have to pay to use them, or download them and at some point the trial period runs out.

That brings us to open source projects. In that realm, I see only one available: Apache SOAP. Located online at http://xml.apache.org/soap, this project seeks to provide a SOAP toolkit in Java. The project is currently in a Version 2.2 release, which you can download from the Apache web site. That's the version and project I use for the examples throughout this chapter.

Other Options

Before moving on to the installation and setup of Apache SOAP, I will answer a few questions that might be rattling around in your head. It's probably clear why I'm not using a commercial product. However, you may be thinking of a couple of other open source or related options that you might want to use, and wondering why I am not covering those.

What about IBM SOAP4J?

First on the list of options is IBM's SOAP implementation, IBM SOAP4J. IBM's work is actually the basis of the current Apache SOAP project, much as IBM XML4J fed into what is now the Apache Xerces XML parser project. Expect the IBM implementation to resurface, wrapping the Apache SOAP project's implementation. This is similar to what is happening with IBM's XML4J; it currently just provides IBM packaging over Xerces. This makes some additional levels of vendor-backing available to the open source version, although the two (Apache and IBM) projects are using the same codebase.

Isn't Microsoft a player?

Yes. Without a doubt, Microsoft and its SOAP implementation, as well as the whole .NET initiative (covered more in the next chapter), are very important. In fact, I wanted to spend some time covering Microsoft's SOAP implementation in detail, but it only supports COM objects and the like, without Java support. For this reason, coverage of it doesn't belong in a book on Java and XML. However, Microsoft (despite the connotations we developers tend to have about the company) is doing important work in web services, and you'd be making a mistake in writing it off, at least in this particular regard. If you need to communicate with COM or Visual Basic components, I highly recommend checking out the Microsoft SOAP toolkit, found online at http://msdn.microsoft.com/library/default.asp?url=/nhp/Default.asp?contentid=28000523 along with a lot of other SOAP resources.

What's Axis?

Those of you who monitor activity in Apache may have heard of Apache Axis. Axis is the next-generation SOAP toolkit, also being developed under the Apache XML umbrella. With SOAP (the specification, not a specific implementation) undergoing fairly fast and radical change these days, tracking it is difficult. Trying to build a version of SOAP that meets current requirements and moves with new development is also awfully tough. As a result, the current Apache SOAP offering is somewhat limited in its construction. Rather than try to rearchitect an existing toolkit, the Apache folks started fresh with a new codebase and project; thus, Axis was born. Additionally, the naming of SOAP was apparently going to change, from SOAP to XP and then to XMLP. As a result, the name of this new SOAP project was uncoupled from the specification name; thus, you have "Axis." Of course, now it looks like the W3C is going back to calling the specification SOAP (Version 1.2, or Version 2.0), so things are even more confusing!

Think of IBM SOAP4J as architecture 1 of the SOAP toolkit. Following that is Apache SOAP (covered in this chapter), which is architecture 2. Finally, Axis provides a next-generation architecture, architecture 3. This project is driven by SAX, while Apache SOAP is based upon DOM. Additionally, Axis provides a more user-friendly approach in header interaction, something missing in Apache SOAP. With all of these improvements, you're probably wondering why I'm not covering Axis. It's simply too early. Axis is presently trying to get together a 0.51 release. It's not a beta, or even an alpha, really; it's very early on. While I'd love to cover all the new Axis features, there's no way your boss is going to let you put in a pre-alpha release of open source software in your mission-critical systems, now is there? As a result, I've chosen to focus on something you can use, today: Apache SOAP. I'm sure when Axis does finalize, I'll update this chapter in a subsequent revision of the book. Until then, let's focus on a solution you can use.

Installation

There are two forms of installation with regard to SOAP. The first is running a SOAP client, using the SOAP API to communicate with a server that can receive SOAP messages. The second is running a SOAP server, which can receive messages from a SOAP client. I cover installation of both cases in this section.

The client

To use SOAP on a client, you first need to download Apache SOAP, available online at http://xml.apache.org/dist/soap. I've downloaded Version 2.2, in the binary format (in the version-2.2 subdirectory). You should then extract the contents of the archive into a directory on your machine; my installation is in the javaxml2 directory (c:\javaxml2 on my Windows machine, /javaxml2 on my Mac OS X machine). The result is /javaxml2/soap-2_2. You'll also need to download the JavaMail package, available from Sun at http://java.sun.com/products/javamail/. This is for the SMTP transfer protocol support included in Apache SOAP. Then, download the JavaBeans Activation Framework (JAF), also from Sun, available online at http://java.sun.com/products/beans/glasgow/jaf.html. I'm assuming that you still have Xerces or another XML parser available for use....

Table of Contents

  • Preface
  • Chapter 1: Introduction
  • Chapter 2: Constraints
  • Chapter 3: SAX
  • Chapter 4: Advanced SAX
  • Chapter 5: DOM
  • Chapter 6: DOM Modules
  • Chapter 7: JAXP
  • Chapter 8: Pull Parsing With StAX
  • Chapter 9: JDOM
  • Chapter 10: dom4j
  • Chapter 11: Data Binding with JAXB
  • Chapter 12: Content Syndication with RSS
  • Chapter 13: XML As Presentation
  • Chapter 14: Looking Forward
  • Appendix 1: SAX Features and Properties

Preface

XML, XML, XML, XML. You can see it on hats and t-shirts, read about it on the cover of every technical magazine on the planet, and hear it on the radio or the occasional Gregorian chant album .... well, maybe it hasn't gone quite that far yet, but don't be surprised if it does. XML, the Extensible Markup Language, has seemed to take over every aspect of technical life, particularly in the Java community. An application is no longer considered an enterprise-level product if XML isn't being used somewhere. Legacy systems are being accessed at a rate never before seen, and companies are saving millions and even billions of dollars on system integration, all because of three little letters. Java developers wake up with fever sweats wondering how they are going to absorb yet another technology, and the task seems even more daunting when embarked upon; the road to XML mastery is lined with acronyms: XML, XSL, XPath, RDF, XML Schema, DTD, PI, XSLT, XSP, JAXP, SAX, DOM, and more. And there isn't a development manager in the world that doesn't want their team learning about XML today!

When XML became a formal specification at the World Wide Web Consortium in early 1998, relatively few were running in the streets claiming that the biggest thing since Java itself (arguably bigger!) had just made its way onto the technology stage. Barely two years later, XML and a barrage of related technologies for manipulating and constraining XML have become the mainstay of data representation for Java systems. XML promises to bring to a data format what Java brought to a programming language: complete portability. In fact, it is only with XML that the promise of Java is realized; Java's portability has been seriously compromised as proprietary data formats have been used for years, enabling a system to run on multiple platforms, but not across businesses in a standardized way. XML promises to fill this gap in complete interoperability for Java programs by removing these proprietary data formats and allowing systems to communicate using a standard means of data representation.

This book is a book about XML, but is geared specifically at Java developers. While both XML and Java are powerful tools in their own right, it is their marriage that this book is concerned with, and where the true power of XML lies. We will cover the various XML vocabularies, look at creating, constraining, and transforming XML, and examine all of the APIs for handling XML from Java code. Additionally, we cover the hot topics that have made XML such a popular solution for dynamic content, messaging, e-business, and data stores. Through it all, we take a very narrow view: that of the developer who has to put these tools to work. A candid look at the tools XML provides is taken, and if something is not useful (even if it is popular!) we will address it and move on. If a particular facet of XML is a hidden gem, we will extract the value of the item and put it to use. This book is meant to serve as a handbook to help you, and is neither a reference nor a book geared towards marketing XML.

Finally, the back half of this book is filled with working, practical code. Although available for download, the purpose of this code is to walk you through creating several XML applications, and you are encouraged to follow along with the examples rather than skimming the code. We introduce a new API for manipulating XML from Java as well, and complete coverage and examples are included. This book is for you, the Java developer, and is about the real world, not a theoretical or fanciful flight through what is "cool" in the industry; we abandon buzzwords when possible, and define them clearly when not. All of the code and concepts within this book have been entered by hand into an editor, prodded and tested, and are intended to aid you on your road to mastering Java and XML.

Organization

This book is structured in a very particular way: the first half of the book (Chapters 1 through 7) focuses on getting you grounded in XML and the core Java APIs for handling XML. Although these chapters are not glamorous, they should be read through in order, and at least skimmed even if you are familiar with XML. We cover the basics, from creating XML to transforming it. Chapter 8 serves as a halfway point in the book, covering an exciting new API for handling XML within Java, JDOM. This chapter is a must-read, as the API is being publicly released as this book goes to production, and this is the reference for JDOM 1.0 (as I wrote the API with Jason Hunter specifically for solving problems in using Java and XML!). The remainder of the book, Chapters 9 through 14, focuses on specific XML topics that are continually brought up at conferences and tutorials I am involved with, and seeks to get you neck-deep in using XML in your applications, now! Finally, there are two appendices to wrap up the book. The summary of this content is as follows:

Chapter 1, Introduction

We will look at what all the hype is about, examine the XML alphabet soup, and spend time discussing why XML is so important to the present and future of enterprise development.

Chapter 2, Creating XML

We start looking at XML by building an XML document from the ground up. Examination of the major XML constructs, such as elements, attributes, entities, and processing instructions is included.

Chapter 3, Parsing XML

The Simple API for XML (SAX), our first Java API for handling XML, is introduced and covered in this chapter. The parsing lifecycle is detailed, and the events that can be caught by SAX and used by developers are demonstrated.

Chapter 4, Constraining XML

In this chapter we look at the two ways to impose constraints on XML documents, Document Type Definitions and XML Schema. We will dissect the differences and analyze when one should be used over the other.

Chapter 5, Validating XML

Complementing Chapter 4, this looks at how to use the SAX skills previously learned to enforce validation constraints, as well as how to react when constraints are not met by XML documents.

Chapter 6, Transforming XML

In this chapter the Extensible Stylesheet Language and the other critical components for transforming XML from one format into another are introduced. We cover the various methods available for converting XML into other formats, and look at using formatting objects to convert XML into binary formats.

Chapter 7, Traversing XML

Continuing to look at transforming XML documents, we discuss XSL transformation processors and how they can be used to convert XML into other formats. We also examine the Document Object Model (DOM) and how it can be used for handling XML data.

Chapter 8, JDOM

We begin by looking at the Java API for XML Parsing (JAXP), and discuss the importance of vendor-independence when using XML. I then introduce the JDOM API, discuss the motivation behind its development, and detail its use, comparing it to SAX and DOM.

Chapter 9, Web-Publishing Frameworks

This chapter looks at what a web-publishing framework is, why it matters to you, and how to choose a good one. We then cover the Apache Cocoon framework, taking an in-depth look at its feature set and how it can be used to serve highly dynamic content over the Web.

Chapter 10, XML-RPC

In this chapter we cover Remote Procedure Calls (RPC), its relevance in distributed computing as compared to RMI, and how XML makes RPC a viable solution for some problems. We then look at using XML-RPC Java libraries and building XML-RPC clients and servers.

Chapter 11, XML for Configurations

In this chapter we look at using configuration data in an XML format, and why that format is so important to cross-platform applications, particularly as it relates to distributed systems.

Chapter 12, Creating XML with Java

Although covered in part in other chapters, here we look at the process of generating and mutating XML from Java, how to perform these modifications from server-side components such as Java servlets, and outline concerns when mutating XML.

Chapter 13, Business-to-Business

This chapter details a "case-study" of creating inter- and intra-business communication channels using XML as a portable data format. Using multiple languages, we build several application components for different companies that all interact with each other using XML.

Chapter 14, XML Schema

We revisit XML Schema here, looking at why the XML Schema specification has garnered so much attention, how reality measures up to the promise of the XML Schema concept, and examining why Java and XML Schema are such complementary technologies.

Appendix A, API Reference

This details all the classes, interfaces, and methods available for use in the SAX, DOM, JAXP, and JDOM APIs.

Appendix B, SAX 2.0 Features and Properties

This details the features and properties available to SAX 2.0 parser implementations.
