Geoff Ruddock

Python XML w/ ElementTree

Nov 16, 2021
import requests
from xml.etree import ElementTree as et
xml_file_contents = requests.get('https://www.w3schools.com/xml/cd_catalog.xml').content

with open('cd_catalog.xml', 'wb') as f:
    f.write(xml_file_contents)

Reading

Create an ElementTree object from an XML file

tree = et.ElementTree(file='cd_catalog.xml')
tree
<xml.etree.ElementTree.ElementTree at 0x7fad503d3100>
root = tree.getroot()
root
<Element 'CATALOG' at 0x7fad503db3b0>
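
As a minimal sketch of two other ways to load XML: et.parse() accepts a filename or file object and returns an ElementTree, while et.fromstring() parses a string and returns the root Element directly. The inline sample_xml below is made-up data standing in for cd_catalog.xml, and the names sample_xml, sample_tree and sample_root are illustrative only.

```python
import io
from xml.etree import ElementTree as et

# Hypothetical sample data with the same shape as cd_catalog.xml.
sample_xml = """<CATALOG>
  <CD><TITLE>Album One</TITLE><ARTIST>Artist A</ARTIST><YEAR>1985</YEAR></CD>
  <CD><TITLE>Album Two</TITLE><ARTIST>Artist B</ARTIST><YEAR>1988</YEAR></CD>
</CATALOG>"""

# parse() takes a filename or file-like object and returns an ElementTree.
sample_tree = et.parse(io.StringIO(sample_xml))
print(sample_tree.getroot().tag)  # CATALOG

# fromstring() returns the root Element itself, not an ElementTree.
sample_root = et.fromstring(sample_xml)
print(len(sample_root))  # 2 (number of direct children)
```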

List (only) child elements

list(root)[0:5]
[<Element 'CD' at 0x7fad503dbd60>,
 <Element 'CD' at 0x7fad503f1130>,
 <Element 'CD' at 0x7fad503f1360>,
 <Element 'CD' at 0x7fad503f1590>,
 <Element 'CD' at 0x7fad503f1810>]
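
Iterating over an element yields its direct children, so one way to read the values inside each CD is to loop over the root and call findtext() on each child. The sketch below uses a small made-up catalog rather than the downloaded file, and the rows list is just an illustrative container.

```python
from xml.etree import ElementTree as et

# Hypothetical sample data with the same shape as cd_catalog.xml.
sample_root = et.fromstring(
    "<CATALOG>"
    "<CD><TITLE>Album One</TITLE><ARTIST>Artist A</ARTIST><PRICE>10.90</PRICE></CD>"
    "<CD><TITLE>Album Two</TITLE><ARTIST>Artist B</ARTIST><PRICE>9.90</PRICE></CD>"
    "</CATALOG>"
)

rows = []
for cd in sample_root:  # iterates over direct children only
    rows.append({
        # findtext() returns the text of the first matching child element
        'title': cd.findtext('TITLE'),
        'artist': cd.findtext('ARTIST'),
        'price': float(cd.findtext('PRICE')),
    })

print(rows)
```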

List all elements (of any depth)

# The enumerate() wrapper and if statement are not required; they only limit the printed output to the first 20 elements.

for i, x in enumerate(tree.iter()):
    if i < 20:
        print(x)
<Element 'CATALOG' at 0x7fad503db3b0>
<Element 'CD' at 0x7fad503dbd60>
<Element 'TITLE' at 0x7fad503dbdb0>
<Element 'ARTIST' at 0x7fad503dbe00>
<Element 'COUNTRY' at 0x7fad503dbe50>
<Element 'COMPANY' at 0x7fad503f1040>
<Element 'PRICE' at 0x7fad503f1090>
<Element 'YEAR' at 0x7fad503f10e0>
<Element 'CD' at 0x7fad503f1130>
<Element 'TITLE' at 0x7fad503f1180>
<Element 'ARTIST' at 0x7fad503f11d0>
<Element 'COUNTRY' at 0x7fad503f1220>
<Element 'COMPANY' at 0x7fad503f1270>
<Element 'PRICE' at 0x7fad503f12c0>
<Element 'YEAR' at 0x7fad503f1310>
<Element 'CD' at 0x7fad503f1360>
<Element 'TITLE' at 0x7fad503f13b0>
<Element 'ARTIST' at 0x7fad503f1400>
<Element 'COUNTRY' at 0x7fad503f1450>
<Element 'COMPANY' at 0x7fad503f14a0>
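
Two variations on the loop above, sketched on made-up data: itertools.islice() stops iterating after the first few elements instead of filtering by index, and iter() accepts a tag name to yield only matching elements at any depth. The names sample_tree, first_three and titles are illustrative.

```python
from itertools import islice
from xml.etree import ElementTree as et

# Hypothetical sample data with the same shape as cd_catalog.xml.
sample_tree = et.ElementTree(et.fromstring(
    "<CATALOG>"
    "<CD><TITLE>Album One</TITLE><ARTIST>Artist A</ARTIST></CD>"
    "<CD><TITLE>Album Two</TITLE><ARTIST>Artist B</ARTIST></CD>"
    "</CATALOG>"
))

# islice() stops after three elements rather than walking the whole tree.
first_three = [el.tag for el in islice(sample_tree.iter(), 3)]
print(first_three)  # ['CATALOG', 'CD', 'TITLE']

# Passing a tag to iter() yields only elements with that tag, at any depth.
titles = [el.text for el in sample_tree.iter('TITLE')]
print(titles)  # ['Album One', 'Album Two']
```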

Pretty print
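
One possible approach, assuming Python 3.9 or later: et.indent() adds whitespace to an element in place, and et.tostring() with encoding="unicode" then returns the indented XML as a str. The sample data and the name pretty are made up for illustration.

```python
from xml.etree import ElementTree as et

# Hypothetical sample data; any Element or ElementTree works the same way.
sample_root = et.fromstring(
    "<CATALOG><CD><TITLE>Album One</TITLE><YEAR>1985</YEAR></CD></CATALOG>"
)

# indent() modifies the tree in place (available from Python 3.9).
et.indent(sample_root, space="  ")

# encoding="unicode" makes tostring() return str instead of bytes.
pretty = et.tostring(sample_root, encoding="unicode")
print(pretty)
```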

Writing

from xml.etree.ElementTree import Element, SubElement

Create an element

doc = Element('doc')
doc
<Element 'doc' at 0x7fad18333720>
list(doc)
[]

Attach a sub-element

# Tag names must be valid XML names (no spaces); ElementTree does not check this.
SubElement(doc, 'section_A')
list(doc)
[<Element 'section_A' at 0x7fad18333c20>]

Append a sub-element

sub_B = Element('sub_B')
doc.append(sub_B)
list(doc)
[<Element 'section_A' at 0x7fad18333c20>, <Element 'sub_B' at 0x7fad503d20e0>]
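
To check what a document built this way looks like as XML, it can be serialized with tostring(). This is a minimal, self-contained sketch: the 'section' tag, its name attribute and the text are illustrative choices, not part of the example above.

```python
from xml.etree.ElementTree import Element, SubElement, tostring

doc = Element('doc')

# SubElement() creates the child and attaches it to the parent in one step.
section = SubElement(doc, 'section', attrib={'name': 'A'})
section.text = 'Hello'

# append() attaches an element that was created separately.
sub_b = Element('sub_B')
doc.append(sub_b)

# encoding='unicode' returns a str rather than bytes.
xml_text = tostring(doc, encoding='unicode')
print(xml_text)  # <doc><section name="A">Hello</section><sub_B /></doc>
```

To write to disk instead, the root can be wrapped in an ElementTree and saved with its write() method.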
