O'Reilly XForms Essentials by Micah Dubinko.

Chapter 3. XPath in XForms

"Nobody trips over mountains. It is the small pebble that causes you to stumble. Pass all the pebbles in your path and you will find you have crossed the mountain."

—Traditional proverb

The most obvious difference between XForms and earlier technologies is the representation of form data as XML instead of flat name/value pairs. While a richer data representation was a welcome change, it also called for a more sophisticated language to reference structured data. The W3C had already defined just such a language, called XPath (http://www.w3.org/TR/xpath), a component of XSLT (http://www.w3.org/TR/xslt), an XML vocabulary used for transforming one flavor of XML into another. The XPath specification was built with the intention that later specifications could use it as a foundation, which is exactly what XForms does. This chapter first lays out the foundation of XPath, and then shows how XForms builds on that foundation.

What exactly is XPath? The "path" portion of the name comes from the similar appearance of many XPath expressions to directory paths in a filesystem, as shown in Example 3.1, “Some XPath expressions”. XPath also includes some lightweight calculation functionality, such as basic mathematics, rounding, and string manipulation, which the calculation engine in XForms takes advantage of instead of defining a new (and incompatible) language.

Each of these examples demonstrates a particular aspect of how XPath is used for addressing parts of an XML document. But must the XML always exist as a distinct document? No. The data structure addressed by XPath is carefully defined by the XPath Data Model, so any structure that fits that model can be addressed. Detailed knowledge of the data model isn't required to start using XPath, though. A few basic concepts are all that is needed to begin.

The rest of this chapter, after this section, serves as a detailed XPath reference. In many cases, however, only a basic level of XPath is needed in XForms. (Chapter 10, Form Accessibility, Design, and Troubleshooting shows one common design pattern for forms that requires virtually no special XPath knowledge.) If you are new to XPath, this section provides the background you need to read and write simple XPath expressions with confidence.

Simple XPath expressions resemble file system paths, except that instead of navigating across directories and files, XPath expressions navigate across XML nodes—the XPath term for any individual piece of XML such as an element, attribute, or piece of text. For example, the expression:

/html/head/title

represents an absolute path through XML, starting at a special root node, then progressing through child elements html, head, and title. The XML referenced by this path might look something like this:

<html>
  <head>
    <title>Push Button Paradise</title>
...
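
To see the path resolve against real data, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree, which implements only a limited subset of XPath. The closing tags and the body element are assumptions added to make the fragment well-formed. ElementTree has no object for the XPath root node, so the absolute path /html/head/title is walked as head/title relative to the html element.

```python
import xml.etree.ElementTree as ET

# Sample document modeled on the fragment above. The <body/> element and
# the closing tags are assumptions, added only for well-formedness.
DOC = """<html>
  <head>
    <title>Push Button Paradise</title>
  </head>
  <body/>
</html>"""

html = ET.fromstring(DOC)  # the <html> document element

# Equivalent of /html/head/title, written relative to <html> because
# ElementTree does not expose the XPath root node.
title = html.find("head/title")
title_text = title.text
print(title_text)
```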

Since XML names can be qualified with a namespace, it's also possible to use prefixed names (a namespace prefix and a local name separated by a colon) at any step. Relative paths are also possible, in which case it's important to know what the context node (similar in concept to the current directory) is. Additionally, attributes can be addressed with a leading @ character, leading to XPath expressions like this:

html:head/xforms:model/@id

Note that when the leading slash is omitted, the path expression is relative.
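
The following Python sketch, again using the limited XPath subset in xml.etree.ElementTree, evaluates that relative path with the html element as the context node. The id value order_model is hypothetical. Note that the document deliberately uses different prefixes than the expression does: names match by namespace URI, not by prefix. ElementTree cannot return attribute nodes, so the final @id step is read with get() instead.

```python
import xml.etree.ElementTree as ET

# Prefix bindings for the expression. These play the role of the
# namespace declarations that are in scope where the XPath is written.
NS = {
    "html": "http://www.w3.org/1999/xhtml",
    "xforms": "http://www.w3.org/2002/xforms",
}

# The document uses a default namespace and the prefix "xf"; the id
# value is a hypothetical example.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xf="http://www.w3.org/2002/xforms">
  <head>
    <xf:model id="order_model"/>
  </head>
</html>"""

context = ET.fromstring(DOC)  # context node: the html element

# html:head/xforms:model selects the model element ...
model = context.find("html:head/xforms:model", NS)
# ... and the trailing /@id step becomes an attribute lookup.
model_id = model.get("id")
print(model_id)
```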

Path expressions can be said to return a node-set. Both of the above examples conveniently returned a node-set consisting of a single node, but in the general case, node-sets can have zero, one, or a multitude of nodes. XForms includes a first node rule that, in certain circumstances, reduces a larger node-set to a single node, namely the first one according to the order the elements appear in the document. Also, node-sets can be filtered manually using a predicate. Predicates are identified using square brackets as follows:

purchaseOrder/items/item[3]

This expression is processed by first selecting all item nodes that are children of an items node (which, in turn, must be a child of a purchaseOrder node, which, in turn, must be a child of the context node...whew!). The resulting node-set is then filtered to include only the third node, in the order that the elements appear in the document. If there is no third item node, then the result will be an empty node-set, not any kind of an error condition.
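
The filtering behavior, including the empty result, can be sketched in Python with xml.etree.ElementTree, which supports positional predicates. The order wrapper element, the item contents, and the first_node helper are all hypothetical; the helper only imitates the idea behind the XForms first node rule and is not how an XForms processor is implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: <order> serves as the context node.
DOC = """<order>
  <purchaseOrder>
    <items>
      <item>pen</item>
      <item>ink</item>
      <item>paper</item>
    </items>
  </purchaseOrder>
</order>"""

context = ET.fromstring(DOC)

# Unfiltered path: a node-set with three nodes, in document order.
all_items = context.findall("purchaseOrder/items/item")

# Predicate [3] keeps only the third item (XPath positions start at 1).
third = context.findall("purchaseOrder/items/item[3]")

# No ninth item exists: the result is an empty list, not an error.
ninth = context.findall("purchaseOrder/items/item[9]")


def first_node(node_set):
    """Reduce a node-set to its first node, or None if it is empty."""
    return node_set[0] if node_set else None


print(len(all_items), [n.text for n in third], ninth)
```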

XPath expressions can also be more than just paths, and can be thought of as a kind of lightweight scripting language. Besides node-sets, an expression can evaluate to a Boolean value, a string, or a number. For example, the expression:

string-length('hello world')

would always return 11 as a number, and the expression:

purchaseOrder/subtotal * instance('taxtable')/tax

represents a full-blown calculation that might appear in a real-world form. On the right-hand side of the multiplication symbol, note that the path expression begins with a function call that can return a node-set from another location (a different XForms instance, in this case).
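
ElementTree does not implement XPath functions or arithmetic, so the sketch below only imitates these two expressions in plain Python. The INSTANCES table, the instance() helper, the element layout, and the numeric values are all assumptions for illustration. The helper returns the root element of the named instance data, and the text content of each selected node is converted to a floating-point number before multiplying, in the same spirit as XPath's double-precision numbers.

```python
import xml.etree.ElementTree as ET

# string-length('hello world'): for this ASCII string, Python's len()
# gives the same count; XPath would report it as a number.
length = float(len("hello world"))

# Hypothetical stand-in for the instances of a form, keyed by id.
INSTANCES = {
    "order": ET.fromstring(
        "<data><purchaseOrder><subtotal>80</subtotal></purchaseOrder></data>"
    ),
    "taxtable": ET.fromstring("<rates><tax>0.25</tax></rates>"),
}


def instance(instance_id):
    """Imitate the XForms instance() function: return the root element
    of the instance data with the given id."""
    return INSTANCES[instance_id]


context = INSTANCES["order"]  # context node for the relative path

# purchaseOrder/subtotal * instance('taxtable')/tax
subtotal = float(context.find("purchaseOrder/subtotal").text)
tax = float(instance("taxtable").find("tax").text)
result = subtotal * tax
print(length, result)
```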