XML with Python

Posted by Kyle R. Conway - January 31, 2019

General Notes

XML

  • Shares structured data.
  • Start Tag & End Tag
    • <contact>
    • </contact>
  • Text Content
  • Attributes (e.g. <email hide="yes" />
    • Key="string"
  • Can use self-closing Tags
  • Whitespace unimportant
  • Indents unimportant
  • Tree-like structure
    • Text is child of node.
      • <parent>child</parent>
    • Attribute is also child of node.
      • <parent type="son">child</parent>
  • XML Paths:
    • grandparent/parent/child
    • <grandparent>
      • <parent>
        • child
      • </parent>
    • </grandparent>

XML Schema

Likely *.XSD format via W3C standard

  • Contract is in XML format
  • Describes:
    • tags
    • format
    • structure
    • datatypes (e.g. integer)
  • Used to validate XML / transfer between machines

Format

  • xs:element
    • xs:complexType
      • xs:sequence
        • <xs:element name = "weeks" type="xs:interger"/>

Time

YYYY-MM-DDTHH:MM:SSZ

Parse XML in Python

    import xml.etree.ElementTree as ET

data = file.xml

tree = ET.fromstring(data)
print('Name:',tree.find('name').text)
print('Attr:',tree.find('email').get('hide'))

# Find Multiple
lst = tree.findall('users/user')
print('User count:', len(lst))
for item in lst:
print('Name', item.find('name').text)
print('Id', item.find('id').text)
print('Attribute', item.get("x"))