A key requirement for dissecting nearly any XPath expression is an understanding of Location Paths, which select one or more nodes based on their location or other properties. A Location Path consists of a number of individual Location Steps, each separated by a slash (/). Each individual step builds upon the previous steps to traverse the document, and can be a test against the name of a node, or one of the following special tests:
node(): matches any node at all
text(): matches any text node
comment(): matches any comment node
processing-instruction(): matches any processing instruction node
Another special test is *, which matches any element node (or any attribute node on the attribute axis, or any namespace node on the namespace axis). Similarly, the special test prefix:* matches any such node whose name is in the namespace mapped to prefix.
Figure 3.2, “Location paths and steps” illustrates how a path is traversed in steps, from left to right.
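The left-to-right traversal can be sketched in a few lines of Python. This is only an illustration under stated assumptions: walk is a hypothetical helper (not part of XPath or XForms), it handles nothing beyond child-axis name tests and *, and the standard xml.etree.ElementTree module is used purely as a parser.

```python
import xml.etree.ElementTree as ET

def walk(context, path):
    """Evaluate a simplified path such as 'items/item' or 'items/*'.

    Hypothetical helper: supports only name tests and '*' on the child axis.
    """
    nodes = [context]
    for step in path.split("/"):
        # Each step starts from whatever the previous steps selected.
        nodes = [child
                 for node in nodes
                 for child in node            # iterating an Element yields its children
                 if step == "*" or child.tag == step]
    return nodes

doc = ET.fromstring(
    "<purchaseOrder><items><item/><item/><note/></items></purchaseOrder>"
)

print(len(walk(doc, "items/item")))  # 2
print(len(walk(doc, "items/*")))     # 3
print(len(walk(doc, "item")))        # 0, since item is not a direct child of purchaseOrder
```

Note how the last path selects nothing: each step only looks at the children of the nodes selected so far.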
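Extra care is needed when traversing a document that contains XML namespaces, especially defaulted namespaces. Any namespace prefix in scope can be used in Location Steps; however, default namespaces in scope do not apply to the XPath expression, so an unprefixed name in a Location Step always refers to a name in no namespace. Consider the following document, which declares a default namespace: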
<purchaseOrder xmlns="http://po.example.org">
  <items>
    <item/>
    <item/>
    <item/>
  </items>
</purchaseOrder>
The prefixes used in the XPath expression are not required to match the prefixes used in the XML being addressed, since only the combination of local name plus namespace URI matters. Because of the way XPath treats (or doesn't treat, actually) default namespaces, the namespace http://po.example.org needs to be mapped to a specific prefix, regardless of any default namespace in scope at the point of the XPath expression. XForms markup to accomplish this might look like this:
<xforms:bind nodeset="po:purchaseOrder/po:items/po:item" xmlns:po="http://po.example.org"/>
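The same effect can be observed outside XForms. The following is a minimal Python sketch using the standard xml.etree.ElementTree module, which implements only a subset of XPath; the prefix order is an arbitrary choice for this example and deliberately differs from anything used in the document.

```python
import xml.etree.ElementTree as ET

PO_XML = """<purchaseOrder xmlns="http://po.example.org">
  <items>
    <item/>
    <item/>
    <item/>
  </items>
</purchaseOrder>"""

root = ET.fromstring(PO_XML)  # root is the purchaseOrder element

# Unprefixed names do not pick up the document's default namespace.
unprefixed = root.findall("items/item")

# Any prefix works, as long as it is mapped to the right namespace URI.
ns = {"order": "http://po.example.org"}
prefixed = root.findall("order:items/order:item", ns)

print(len(unprefixed))  # 0
print(len(prefixed))    # 3
```

Only the namespace URI has to agree with the document; the prefix itself is local to the expression.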
XPath expressions are always evaluated with respect to a context, which consists of the following:
A context node
A pair of non-zero positive integers (the context position and the context size)
Variable bindings (not used in XForms)
A function library
The set of namespace declarations in scope for the expression
Remember, one thing in XPath that is not part of the context is a default namespace. The developing XPath 2.0, however, promises to change this.
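One way to picture the context is as a small record. The sketch below is purely illustrative: EvaluationContext and its field names are hypothetical, not an API defined by XPath or XForms. It also encodes the point about default namespaces by rejecting an empty prefix.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

@dataclass
class EvaluationContext:
    """Hypothetical model of the XPath 1.0 evaluation context."""
    context_node: Any
    context_position: int                      # 1-based
    context_size: int
    variables: Dict[str, Any] = field(default_factory=dict)   # not used in XForms
    functions: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    namespaces: Dict[str, str] = field(default_factory=dict)  # prefix -> namespace URI

    def __post_init__(self):
        # The position is at least 1 and never exceeds the size.
        if not 1 <= self.context_position <= self.context_size:
            raise ValueError("context position must be between 1 and context size")
        # There is no slot for a default namespace in XPath 1.0.
        if "" in self.namespaces:
            raise ValueError("a default (unprefixed) namespace is not part of the context")

ctx = EvaluationContext(
    context_node="item",           # stand-in for a real node
    context_position=2,
    context_size=3,
    namespaces={"po": "http://po.example.org"},
)
print(ctx.context_position, ctx.context_size)  # 2 3
```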
An XPath expression of . selects the context node, and .. selects the parent node of the context node. Any expression that begins with / is an absolute path and independent from the context node. Other, more complicated, paths through the Data Model are possible, since there are many possible ways or axes in which to navigate through XML.
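A short Python sketch shows how much the context node matters for relative paths. It relies on the limited XPath subset in the standard xml.etree.ElementTree module, where the element a query is called on plays the role of the context node; the sample document is made up for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<purchaseOrder><items><item/><item/><item/></items></purchaseOrder>"
)
items = root.find("items")

# "." selects the context node itself.
same_node = items.find(".")

# The same relative path gives different results from different context nodes.
from_items = items.findall("item")
from_root = root.findall("item")

# ".." steps back up to the parent; all three item elements share one parent.
parents = root.findall("items/item/..")

print(same_node is items)   # True
print(len(from_items))      # 3
print(len(from_root))       # 0
print(parents == [items])   # True
```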
Each Location Step includes an axis, which is an instruction on how to navigate the tree structure. Since XPath is a general-purpose language, there are many axes that don't make much sense for XForms, but Table 3.1, “XPath axes” summarizes all of them for completeness.
Table 3.1. XPath axes
child: Contains the children of the context node. (As a consequence, this axis never contains attribute or namespace nodes.)
attribute: Contains the attributes of the context node. (As a consequence, this axis is always empty, unless the context node is an element.)
parent: Contains the parent of the context node. (As a consequence, this axis is always empty when the context node is the root node.)
descendant: Contains the descendants of the context node; a descendant is a child or a child of a child and so on. (As a consequence, this axis never contains attribute or namespace nodes.)
descendant-or-self: Contains the context node and the descendants of the context node. (As a consequence, this axis contains an attribute or namespace node only when the context node itself is one.)
ancestor: Contains the ancestors of the context node; the ancestors of the context node consist of the parent of the context node and the parent's parent and so on. (As a consequence, this axis always includes the root node, unless the context node itself is the root node.)
ancestor-or-self: Contains the context node and the ancestors of the context node. (As a consequence, this axis always includes the root node.)
following: Contains all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes.
following-sibling: Contains all the following siblings of the context node. (As a consequence, this axis is always empty when the context node is an attribute node or a namespace node.)
preceding: Contains all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes.
preceding-sibling: Contains all the preceding siblings of the context node. (As a consequence, this axis is always empty when the context node is an attribute node or a namespace node.)
namespace: Contains the namespace nodes of the context node. (As a consequence, this axis is always empty unless the context node is an element.)
self: Contains just the context node itself.
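To make several of these axes concrete, here is a rough Python sketch that computes them by hand over a tree parsed with the standard xml.etree.ElementTree module. The helpers parent_map, axis, and tags are hypothetical names for this example. The sketch models element nodes only, so the attribute, namespace, following, and preceding axes are left out, and the parsed top-level element has no parent here (in XPath proper, its parent would be the root node).

```python
import xml.etree.ElementTree as ET

def parent_map(root):
    """Map each element to its parent (ElementTree keeps no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}

def axis(name, node, parents):
    """Return the elements on the named axis; reverse axes are nearest-first."""
    if name == "self":
        return [node]
    if name == "child":
        return list(node)
    if name == "parent":
        return [parents[node]] if node in parents else []
    if name == "descendant-or-self":
        return list(node.iter())          # iter() yields the node itself first
    if name == "descendant":
        return list(node.iter())[1:]
    if name == "ancestor":
        found = []
        while node in parents:
            node = parents[node]
            found.append(node)
        return found
    if name == "ancestor-or-self":
        return [node] + axis("ancestor", node, parents)
    if name in ("following-sibling", "preceding-sibling"):
        if node not in parents:
            return []
        siblings = list(parents[node])
        index = siblings.index(node)
        if name == "following-sibling":
            return siblings[index + 1:]
        return siblings[:index][::-1]     # closest preceding sibling first
    raise ValueError(f"axis not modeled in this sketch: {name}")

def tags(nodes):
    return [n.tag for n in nodes]

doc = ET.fromstring("<a><b><c/><d/></b><e/><f/></a>")
parents = parent_map(doc)
b = doc.find("b")
c = b.find("c")
f = doc.find("f")

print(tags(axis("child", doc, parents)))            # ['b', 'e', 'f']
print(tags(axis("descendant", doc, parents)))       # ['b', 'c', 'd', 'e', 'f']
print(tags(axis("ancestor", c, parents)))           # ['b', 'a']
print(tags(axis("following-sibling", b, parents)))  # ['e', 'f']
print(tags(axis("preceding-sibling", f, parents)))  # ['e', 'b']
print(tags(axis("parent", doc, parents)))           # []
```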
A few more useful abbreviations are . for self::node( ); .. for parent::node( ); and // for /descendant-or-self::node( )/. This last abbreviation is useful when you want to select every element by a certain name, regardless of where it appears in the tree. For example, //p selects every p element no matter where it occurs.
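The effect of // can be tried out with Python's standard xml.etree.ElementTree module, with one caveat: that module implements only a subset of XPath and, when called on an element, expects the relative form .//p rather than //p. The sample document is invented for this example.

```python
import xml.etree.ElementTree as ET

html = ET.fromstring(
    "<html><body>"
    "<p>one</p>"
    "<div><p>two</p><blockquote><p>three</p></blockquote></div>"
    "</body></html>"
)

# Every p element below the context node, at any depth, in document order.
all_p = html.findall(".//p")

# Only the p elements that are direct children of body.
direct_p = html.findall("body/p")

print([p.text for p in all_p])     # ['one', 'two', 'three']
print([p.text for p in direct_p])  # ['one']
```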
A bare Location Path expression selects every node that matches the path it specifies. For example, the expression html/body/p selects every p element that's a direct child of body. Often, it is desirable to filter down the selection even more. In XPath, this is done through a predicate, which appears in square brackets and can apply to any Location Step in a Location Path. To select only the first p from the earlier example, the expression would be /html/body/p[position( )=1], or, even shorter, /html/body/p[1].
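Positional predicates can be tried with Python's standard xml.etree.ElementTree module. Treat this as a sketch: the module supports the numeric shorthand and last(), but not the longer position( )=1 spelling, and the sample document is made up for the example.

```python
import xml.etree.ElementTree as ET

html = ET.fromstring(
    "<html><body><p>first</p><p>second</p><p>third</p></body></html>"
)

every_p = html.findall("body/p")           # no predicate: all three
first_p = html.findall("body/p[1]")        # numeric predicate: position 1 only
last_p = html.findall("body/p[last()]")    # last() is the final position

print([p.text for p in every_p])  # ['first', 'second', 'third']
print([p.text for p in first_p])  # ['first']
print([p.text for p in last_p])   # ['third']
```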
The predicate expression evaluates to a true or false result (or a number that is special-cased as a comparison against position( )). The way this works is:
The Location Step is first evaluated without the predicate, producing a node-set. The number of nodes in that node-set is the context size.
For each node in that node-set, the predicate is evaluated in a new context, with that node as the context node, the number of nodes in the node-set as the context size, and the position of the node within the node-set as the context position. For axes that go "backwards," namely ancestor, ancestor-or-self, preceding, and preceding-sibling, positions are counted in reverse document order so that, for instance, ancestor::element[2] would select the second-closest ancestor named element, as expected.
If the expression evaluates to true, the node is included in the filtered node-set.
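The procedure above can be sketched in plain Python. Everything here is an assumption made for illustration: apply_predicate is a hypothetical helper, nodes are represented by simple strings, and a predicate is modeled as a function of (node, position, size). For a reverse axis, the caller is expected to pass the nodes nearest-first.

```python
def apply_predicate(node_set, predicate):
    """Filter node_set by a predicate called as predicate(node, position, size)."""
    size = len(node_set)                       # context size
    kept = []
    for position, node in enumerate(node_set, start=1):   # context position is 1-based
        result = predicate(node, position, size)
        # A numeric result is special-cased as a comparison against the position.
        if isinstance(result, (int, float)) and not isinstance(result, bool):
            result = (position == result)
        if result:
            kept.append(node)
    return kept

paragraphs = ["p-one", "p-two", "p-three"]

# Like p[1]: the number is compared against the context position.
first = apply_predicate(paragraphs, lambda node, position, size: 1)

# Like p[position()=last()]: position equals the context size.
last = apply_predicate(paragraphs, lambda node, position, size: position == size)

# Like ancestor::*[2]: ancestors listed nearest-first, so position 2 is the
# second-closest ancestor.
ancestors_nearest_first = ["section", "body", "html"]
second_ancestor = apply_predicate(ancestors_nearest_first, lambda n, p, s: 2)

print(first)            # ['p-one']
print(last)             # ['p-three']
print(second_ancestor)  # ['body']
```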
Multiple predicates can be specified on any given step, which will result in multiple layers of filtering on the node-set. If a node-set gets filtered down to nothing, no error results—the expression simply returns an empty node-set. Figure 3.3, “Node-set filtering with predicates” illustrates how filtering works through predicates.
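Layered filtering can also be sketched in plain Python. The helper filter_all, the dictionaries standing in for nodes, and the price field are all hypothetical; the point is only that each predicate sees the node-set left by the previous one, with positions renumbered, and that an empty result is not an error.

```python
def filter_all(node_set, predicates):
    """Apply each predicate in turn; each is called as predicate(node, position, size)."""
    for predicate in predicates:
        size = len(node_set)
        # Positions restart at 1 for every layer of filtering.
        node_set = [node
                    for position, node in enumerate(node_set, start=1)
                    if predicate(node, position, size)]
    return node_set

items = [
    {"name": "a", "price": 5},
    {"name": "b", "price": 20},
    {"name": "c", "price": 30},
]

def expensive(node, position, size):
    return node["price"] > 10

def is_first(node, position, size):
    return position == 1

# Roughly item[price > 10][1]: the first of the expensive items.
first_expensive = filter_all(items, [expensive, is_first])

# Roughly item[1][price > 10]: the first item, kept only if it is expensive.
expensive_first = filter_all(items, [is_first, expensive])

print([n["name"] for n in first_expensive])  # ['b']
print(expensive_first)                       # []
```

The order of the predicates matters, since the second layer only ever sees what survived the first.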