Codenotes for Xml

by Gregory Brill

eBook

$9.99 

Overview

CodeNotes provides the most succinct, accurate, and speedy way for a developer to ramp up on a new technology or language. Unlike other programming books, CodeNotes drills down to the core aspects of a technology, focusing on the key elements needed in order to understand it quickly and implement it immediately. It is a unique resource for developers, filling the gap between comprehensive manuals and pocket references.

CodeNotes for XML is a practical handbook for Java and Visual Basic developers interested in working with XML. You will learn how to leverage both CSS and XSLT to produce rich, compelling output, as well as manipulate XML using the DOM and SAX APIs. The new XML Schema specification is also covered in-depth. Companion articles on www.codenotes.com cover XML development with Perl, integrating XML with databases, important grammars such as XHTML and SOAP, and much more. CodeNotes for XML is your guide to these powerful technologies, presented within the context of the distributed application, database, or web-based world you already know.

This edition of CodeNotes includes:

-A global overview of a technology and explanation of what problems it can be used to solve
-Real-world examples
-"How and Why" and "Bugs and Caveats" sections that provide hints, tricks, workarounds, and tips on what should be taken advantage of or avoided
-Instructions and classroom-style tutorials throughout from expert trainers and software developers.

Visit www.codenotes.com for updates, source code templates, access to message boards, and discussion of specific problems with CodeNotes authors and other developers.

Every CodeNotes title is written and reviewed by a team of commercial software developers and technology experts. See "About the Authors" for more information.

Product Details

ISBN-13: 9780679647287
Publisher: Random House Publishing Group
Publication date: 01/23/2002
Series: CodeNotes
Sold by: Random House
Format: eBook
Pages: 256
File size: 872 KB

About the Author

Gregory Brill is the series editor of CodeNotes and the founder and president of Infusion Development Corporation, a technology training and consulting firm that specializes in architecting securities trading and analytic systems for several of the world’s largest investment banks. He has written for C++ Users Journal, and is the author of Applying COM+. He lives in New York.

Read an Excerpt

Chapter 1: Introduction
Orientation

What Is XML?

Extensible Markup Language (XML) is a globally accepted, vendor-independent standard for representing structured, text-based data. An XML document is a perfect medium in which to encapsulate any kind of information that can be arranged or structured in some way. For example, an XML document can contain a list of personal or business contacts, books in a library’s card catalogue, or products in a warehouse.

If we looked at any one of these examples—say, the library card catalogue—in the more traditional “table-oriented” view with which most developers would be familiar, we would see something like the following:

book_isbn   book_genre       Firstname  middlename  Lastname  title
0812589041  Science Fiction  Orson      Scott       Card      Ender’s Game
0883853280  biography        William                Dunham    Euler: The Master of Us All


An XML document, on the other hand, would present this information hierarchically, where the column names would become tags or possibly “attributes.” For example:

<?xml version="1.0"?>
<books>
  <book>
    <isbn>0812589041</isbn>
    <genre>science fiction</genre>
    <author>
      <firstname>Orson</firstname>
      <middlename>Scott</middlename>
      <lastname>Card</lastname>
    </author>
    <title>Enders Game</title>
    <year>1985</year>
  </book>
  <book>
    <isbn>0883853280</isbn>
    <genre>biography</genre>
    <author>
      <firstname>William</firstname>
      <middlename/>
      <lastname>Dunham</lastname>
    </author>
    <title>Euler The Master of Us All</title>
    <year>1999</year>
  </book>
</books>

Listing 1.1

Listing 1.1 is included to give you a first look at XML, which can be overwhelming compared to a familiar table structure. However, as you become more familiar with XML, you will see that this structure has many important advantages over a traditional table.

XML and HTML

It can help to think of XML at its most basic level as being very similar to a HyperText Markup Language (HTML) web page. However, the tags in an XML document do not have a fixed meaning the way they do in HTML (e.g., <b>, <body>, etc.). When a developer writes an XML document, he or she decides on the names of the elements (e.g., book, title, year, and author) and the data the elements will contain (e.g., the <year> tags contain the year the book was published). The developer chooses these elements with the expectation that some client application exists that will read the XML file and be written to process those particular elements in some way. Referring back to Listing 1.1, one can imagine that there is a book-search software application running on a computer in the library that reads XML files with this structure (perhaps receiving them via the Web from some central server), allowing library patrons to search for the books they wish to check out.

What Is XML Used For?

One misconception regarding XML is that it is simply an alternate way of transporting and storing data. However, that is only one small facet of how XML is used today. To give only a few examples, XML can be used to:

-invoke methods on a remote server through a firewall (this protocol is called SOAP)

-represent relational database data such that it can be easily translated into HTML, viewable by any browser without programming

-store configuration and deployment data for applications, providing operating-system-independent formats for initialization/configuration files

-create template documents describing the various fields and attributes of a business form

XML Tools and Technologies

In spite of its power and wide range of uses, XML itself is very straightforward. The more subtle aspects of XML do not have to do with XML itself, but rather with various third-party applications and technologies such as XML editors and authoring tools, and XML-related APIs.

XML Authoring Tools

XML files can become very large and may have many layers of nested elements. While the basic grammar of XML is relatively simple, finding a deeply buried element in a large document, or resolving a missed “/” or mismatched tag, will very quickly try the patience of most software developers. Therefore, many tools have been developed to address this need. You can, of course, work with any simple text editor, but a listing of some popular XML authoring tools appears in Chapter 3.
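
To make the "missed slash or mismatched tag" problem concrete, here is a minimal sketch in Python (a language chosen for this illustration, not one the book uses) that asks a parser where a document stops being well formed. The helper name check_well_formed is hypothetical; the parser is the ElementTree module from Python's standard library.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Report whether xml_text parses, and where it fails if it does not."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError carries the (line, column) at which parsing stopped.
        line, column = err.position
        return f"not well-formed at line {line}, column {column}: {err}"
    return "well-formed"


print(check_well_formed("<book><title>Enders Game</title></book>"))
# The closing tag below does not match the open <title> element.
print(check_well_formed("<book><title>Enders Game</book>"))
```

An authoring tool performs essentially this check continuously while you type, which is why it finds such mistakes faster than a plain text editor.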

Translation and Styling

In addition to applications that make writing XML documents easier, there are a number of technologies that can actually extend XML’s capabilities. Most web browsers are capable of displaying XML files. Internet Explorer, for example, will show an XML file as a dynamic collapsible tree much like a simple Windows File Explorer; you can click on nodes to open them and reveal their child elements, or close them to get to a top-level view of the XML document. Suppose, however, you would like your XML document to display just like a web page with proper formatting and, perhaps, a colored background? There are two ways to do this:

1. Cascading Style Sheets (CSS): CSS files are text files containing format information. CSS is an older technology developed for HTML and uses its own specialized, non-XML syntax.

2. Extensible Stylesheet Language Transformations (XSLT): Where CSS files are written in a specialized syntax of their own, XSLT documents are actually written in XML. An XSLT file maps XML elements to HTML tags and, in so doing, translates an XML file into an HTML file.

The separation of data and presentation using either technology allows for a much cleaner and more efficient application design.
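
The idea of mapping XML elements onto HTML tags can be sketched without XSLT. The following is not XSLT and not the book's method; it is a hand-written Python illustration, using only the standard library, of the kind of element-to-tag mapping an XSLT file would declare. The function name books_to_html and the choice of an HTML table are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# A trimmed version of Listing 1.1 (only the title and year elements).
BOOKS_XML = """<?xml version="1.0"?>
<books>
  <book><title>Enders Game</title><year>1985</year></book>
  <book><title>Euler The Master of Us All</title><year>1999</year></book>
</books>"""


def books_to_html(xml_text):
    """Map each <book> to a table row and each chosen field to a cell."""
    root = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    table = ET.SubElement(body, "table")
    for book in root.findall("book"):
        row = ET.SubElement(table, "tr")
        for field in ("title", "year"):
            cell = ET.SubElement(row, "td")
            cell.text = book.findtext(field)
    return ET.tostring(html, encoding="unicode")


print(books_to_html(BOOKS_XML))
```

The data (the books) never mentions rows or cells; all presentation decisions live in the mapping, which is the separation the paragraph above describes.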

Querying

While the XML specification defines a structure for encapsulating data, it does not have any prescribed method for querying the data in an XML document. A technology known as XPath, however, does provide a mechanism for querying an XML file. If you are familiar with relational databases, you can think of XPath as XML’s much less sophisticated brand of SQL (Structured Query Language). By way of example, the XPath expression /stocks/stock[1.0 > @price] will return all penny stocks from an appropriately structured XML file, that is, one in which each stock element carries its price as an attribute.
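
As a rough illustration of the penny-stock query, the sketch below uses Python's standard ElementTree module. ElementTree supports only a subset of XPath and cannot evaluate numeric comparisons, so the path selects every stock element that has a price attribute and the "less than 1.0" test is applied in Python; a full XPath engine would evaluate the whole predicate itself. The sample document, the symbol attribute, and the function name penny_stocks are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: element and attribute names are chosen to match
# the XPath expression in the text; the symbol attribute is invented.
STOCKS_XML = """<?xml version="1.0"?>
<stocks>
  <stock symbol="AAA" price="0.45"/>
  <stock symbol="BBB" price="12.30"/>
  <stock symbol="CCC" price="0.99"/>
</stocks>"""


def penny_stocks(xml_text):
    """Return the symbols of all stocks priced below 1.0."""
    root = ET.fromstring(xml_text)
    # Simple path with an attribute-existence predicate (supported subset).
    candidates = root.findall("./stock[@price]")
    # The numeric comparison is done here, outside the path expression.
    return [s.get("symbol") for s in candidates if float(s.get("price")) < 1.0]


print(penny_stocks(STOCKS_XML))
```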

Programming with XML

In order to read an XML document, an application must parse it. The process of parsing is complex; a parser must take a text document, cut it up into meaningful segments (while making certain that these segments are correctly formed and that they conform to the rules of the language), and store the data and elements in memory. Writing a proper parser for any language or grammar is no easy task, and XML would not have caught on as a standard unless freely available parsers with friendly programming interfaces also existed to relieve the developer of this task. Fortunately, there are two standard interfaces: the Simple API for XML (SAX) and the Document Object Model (DOM). Ultimately, a developer can use either API to read an XML file and extract data from it. The approach taken by each of these two APIs, however, is different, as follows:

-DOM is a passive API. DOM reads an entire XML document, creates a tree structure in memory, and gives the developer read and write access to this tree. DOM must process the entire file and bring it into memory before a developer may access it.

-SAX is an active API. SAX will actually call methods on your application (or fire events) as it moves through the XML document. You can think of SAX as adhering to an event-driven model, triggering events in your application whenever it encounters anything important your application needs to know about, such as an element, text data, etc. Note that SAX does not allow modification of the XML document.

DOM and SAX are two different APIs useful for parsing, reading, and (to a small extent in the case of DOM) manipulating XML. Each API has strengths and weaknesses that will be explained in Chapters 6 and 7.
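
The contrast between the two APIs can be sketched with the DOM and SAX implementations that ship in Python's standard library (xml.dom.minidom and xml.sax). This is an illustration under that assumption, not the Java or Visual Basic code the book itself uses; the function and class names are hypothetical. Both functions extract the book titles from a trimmed version of Listing 1.1.

```python
import xml.sax
from xml.dom import minidom

BOOKS_XML = """<?xml version="1.0"?>
<books>
  <book><title>Enders Game</title><year>1985</year></book>
  <book><title>Euler The Master of Us All</title><year>1999</year></book>
</books>"""


def titles_with_dom(xml_text):
    """DOM: load the whole document into a tree, then walk the tree."""
    doc = minidom.parseString(xml_text)
    return [node.firstChild.data for node in doc.getElementsByTagName("title")]


class TitleHandler(xml.sax.ContentHandler):
    """SAX: the parser calls these methods as it moves through the document."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so collect it until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_with_sax(xml_text):
    handler = TitleHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


print(titles_with_dom(BOOKS_XML))
print(titles_with_sax(BOOKS_XML))
```

The DOM version is shorter because the tree does the bookkeeping, at the cost of holding the whole document in memory. The SAX version keeps only the state it needs (here, a flag and a text buffer), which is why it suits very large documents.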

Integrating XML with Your World

XML’s popularity is due to the sheer usefulness of the technology. Because it defines a standard, vendor-independent format that can represent any kind of data, XML has nearly boundless uses. Database vendors have taken notice of XML and are making their systems XML-friendly.

You may recall from the earlier section in this chapter, “Translation and Styling,” that an XML file can easily be translated to HTML via XSLT. The simplicity of this translation technology makes XML an ideal way to return data from a database since it can so easily be translated into a web-based report. CSS or additional XSLT can then be used to further enhance the appearance of the HTML page. As we will see, database vendors are quick to take advantage of these capabilities.

Upcoming XML Technologies

The XML family of technologies represents a tremendously fast-moving field where products, capabilities, and interoperability change daily. New standards are on the way for areas such as vector graphics (SVG), distributed computing (SOAP), and changes to HTML (XHTML). In addition, many industries have embraced XML as a standard for communicating specific types of information. For example, the financial industry is slowly accepting FIXML as a standard for transmitting financial information between institutions. In the next few years, you should expect to see many more standards that are based directly on XML.

Table of Contents

Using CodeNotes
About the Authors
Acknowledgments
Chapter 1: Introduction
    Orientation
    Road Map
    Additional Material
    About the Vendor
    Summary
Chapter 2: Installation
    Hardware
    The Plan
    Installation Procedures
Chapter 3: XML Essentials
    Simple Application
    Core Concepts
    Topic: Basic Syntax
    Topic: Well-Formed XML
    Topic: Other XML Syntax
    Topic: Namespaces
    Topic: DTDs
    Chapter Summary
Chapter 4: Styling with CSS
    Simple Application
    Core Concepts
    Topic: Selecting Nodes
    Topic: CSS Properties
    Chapter Summary
Chapter 5: XSLT and XPath
    Simple Application
    Core Concepts
    Topic: Basic XSLT
    Topic: Formatting with XPath
    Topic: Controlling Output
    Topic: Sorting and Filtering
    Topic: Working with Templates
    Chapter Summary
Chapter 6: Programming with DOM
    Simple Application
    Core Concepts
    Topic: Document Navigation
    Topic: Document Manipulation
    Chapter Summary
Chapter 7: Programming with SAX
    Simple Application
    Core Concepts
    Topic: Introduction to SAX
    Topic: Stack-Based Processing
    Topic: Features
    Topic: Error Handling
    Topic: Namespaces
    Chapter Summary
Chapter 8: XML Schemas
    Simple Application
    Core Concepts
    Topic: Basic Schema Design
    Topic: Elements and Attributes
    Topic: Data Types
    Topic: Practical XML Schema Features
    Chapter Summary
Index