Wednesday, July 7, 2010

XPath By Attribute

In Python, the ElementTree module is quite handy for parsing XML documents and strings. The module also has limited support for XPath query strings. This is also handy because it allows us to retrieve elements from the parsed XML tree without having to traverse it by hand.
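As a quick illustration of retrieving elements by path instead of walking the tree, here is a minimal sketch. The `DOC` string and its element names are made up for this example only.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = "<library><shelf><book title='A'/><book title='B'/></shelf></library>"

root = ET.fromstring(DOC)

# './/book' matches every <book> descendant, however deeply it is nested,
# so no explicit loop over <shelf> elements is needed.
titles = [book.get('title') for book in root.findall('.//book')]
print(titles)
```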

However, querying by attribute doesn't work as expected, at least with the ElementTree 1.2 that ships with Python 2.6 and earlier. That's too bad, because it would be really handy. The following sample uses the expected XPath syntax, but raises a SyntaxError exception.
from xml.etree import ElementTree as ET

XML = """
<xml>
<person>
<name first="FirstName1" last="LastName1"/>
</person>
<person>
<name first="FirstName2" last="LastName2"/>
</person>
</xml>
"""

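if __name__ == '__main__':
    tree = ET.fromstring(XML)
    # The [@first="..."] predicate is what triggers the SyntaxError
    # on ElementTree 1.2 (Python 2.6 and earlier).
    print(tree.find('.//person/name[@first="FirstName2"]'))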
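
One possible workaround, sketched here rather than taken from the ElementTree documentation, is to query by path only and then filter on the attribute in plain Python. The helper name `find_by_attribute` is hypothetical and introduced just for this example.

```python
from xml.etree import ElementTree as ET

XML = """
<xml>
  <person>
    <name first="FirstName1" last="LastName1"/>
  </person>
  <person>
    <name first="FirstName2" last="LastName2"/>
  </person>
</xml>
"""


def find_by_attribute(root, path, attr, value):
    """Return the first element matching `path` whose `attr` equals `value`.

    Hypothetical helper: the path itself contains no attribute predicate,
    so the attribute comparison is done in Python instead.
    """
    for element in root.findall(path):
        if element.get(attr) == value:
            return element
    return None


if __name__ == '__main__':
    tree = ET.fromstring(XML)
    match = find_by_attribute(tree, './/person/name', 'first', 'FirstName2')
    print(match.get('last'))
```

This gives up the compactness of a single query string, but it relies only on `findall()` with a plain path and `Element.get()`, so it should not depend on predicate support in the path parser.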