Online documentation - Websydian v6.5

The structure of XML documents

Introduction
XML as a tree-structure
Example
Using this information
Next document
More information

Introduction

This is the second document in the introduction to TransacXML.

The previous document in the introduction to TransacXML described the basic XML terms.

This document describes how an XML document is structured. This is information you need to know when you want to model, read, or create XML documents.

XML as a tree-structure

Every XML document can be described as a tree-structure with exactly one top element. The top element can contain any number of attributes and simple elements, and any number of complex elements, which in turn can contain their own sub-structures.

When you process an existing XML document, you will often do this by traversing the entire tree-structure, extracting information from the elements and attributes that contain data as you go.

When you create an XML document, you will create the different elements and attributes that make up the final document and combine them into a tree-structure.

To be able to do either of these things effortlessly, you need to understand how an XML document maps to a tree-structure.

Example

This first illustration shows how you can convert an example document to a tree-structure.

The document has the following content:

A short description of the example document

The document contains data that describes a horse race. It contains some basic information about the race, where it is held, and a list of the horses that will compete in the race.

The structure of the document is as follows:

The "Race" element is the top element of the document. It contains two attributes, "Date" and "Name", and two complex child elements, "Course" and "Horses".

The complex "Course" element contains two simple elements "CourseName" and "Address".

The complex element "Horses" contains a number of complex "Horse" elements.

Each complex "Horse" element contains an attribute "Name" and three simple elements "Value", "DateOfBirth", and "Gender".

The tree-structure of the document is determined by the complex elements - and how they are attached to each other.
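The structure described above can be sketched in a few lines of Python, using the standard library module xml.etree.ElementTree rather than TransacXML itself. The XML text is a reconstruction from the description, not the original example file. The second Horse and all of its values are hypothetical placeholders, and the helper names is_complex and complex_tree are invented for this illustration. The sketch assumes that a complex element is one that has child elements or attributes.

```python
import xml.etree.ElementTree as ET

# Reconstruction of the example document from the description above.
# The second Horse ("Example Horse") and its values are hypothetical
# placeholders, added only so that Horses has more than one child.
RACE_XML = """\
<Race Date="2010-12-31" Name="New Years Meet">
  <Course>
    <CourseName>The new track</CourseName>
    <Address>Track Road 123</Address>
  </Course>
  <Horses>
    <Horse Name="Bonfire">
      <Value>5000</Value>
      <DateOfBirth>1988-01-02</DateOfBirth>
      <Gender>M</Gender>
    </Horse>
    <Horse Name="Example Horse">
      <Value>1000</Value>
      <DateOfBirth>1990-01-01</DateOfBirth>
      <Gender>F</Gender>
    </Horse>
  </Horses>
</Race>
"""


def is_complex(element):
    """Assumption: an element is complex if it has child elements or attributes."""
    return len(element) > 0 or bool(element.attrib)


def complex_tree(element, depth=0):
    """Return the names of the complex elements, indented by tree depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        if is_complex(child):
            lines.extend(complex_tree(child, depth + 1))
    return lines


root = ET.fromstring(RACE_XML)
print("\n".join(complex_tree(root)))
```

The output lists Race first, Course and Horses indented one level below it, and the two Horse elements indented below Horses. The simple elements and the attributes do not appear, because only the complex elements determine the shape of the tree.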

The following figure shows a representation of the document as a tree-structure:

When you model the XML document as a tree-structure, you can see that there is one top element (Race), which has two child elements (Course, Horses). Horses has a number of child elements (all of the type Horse).

The simple elements and the attributes can be perceived as the data content of the complex elements.

An attribute belongs to the complex element whose start tag it is placed in.
A simple element belongs to the complex element that directly encloses it.

The following figure shows the data content of each of the complex elements and where in the tree-structure it will be available:

The Race element has a Name (New Years Meet) and a Date (2010-12-31).

The Course element has a CourseName (The new track) and an Address (Track Road 123).

The Horses element contains no data - it only contains the list of Horse elements.

Each Horse element has a Name, a Value, a DateOfBirth, and a Gender.

For the first Horse element, the values are: Bonfire, 5000, 1988-01-02, M.
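The data content listed above can be collected with a small helper. This is a minimal sketch using Python's standard xml.etree.ElementTree module, not the TransacXML functions. The function name data_content is hypothetical, the XML text is reconstructed from the description and contains only the first Horse, and a simple element is assumed to be one without child elements and without attributes.

```python
import xml.etree.ElementTree as ET

# Reconstructed from the description; only the first Horse is included.
RACE_XML = """\
<Race Date="2010-12-31" Name="New Years Meet">
  <Course>
    <CourseName>The new track</CourseName>
    <Address>Track Road 123</Address>
  </Course>
  <Horses>
    <Horse Name="Bonfire">
      <Value>5000</Value>
      <DateOfBirth>1988-01-02</DateOfBirth>
      <Gender>M</Gender>
    </Horse>
  </Horses>
</Race>
"""


def data_content(element):
    """Collect the attributes and simple child elements of one complex element."""
    # Attributes belong to the element whose start tag they are placed in.
    data = dict(element.attrib)
    for child in element:
        # Assumption: a simple element has no child elements and no attributes.
        if len(child) == 0 and not child.attrib:
            data[child.tag] = (child.text or "").strip()
    return data


root = ET.fromstring(RACE_XML)
for path in (".", "Course", "Horses", "Horses/Horse"):
    element = root.find(path)
    print(element.tag, data_content(element))
```

For Horses the helper returns an empty dictionary, which matches the statement that this element holds no data of its own and only groups the Horse elements.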

Using this information

The tree-structure of the XML document matters for three tasks:

1. When you model an XML document, you actually model the tree-structure.

2. When you create an XML document, you create each complex element and add it to its parent complex element. In this way you build the document as a tree-structure.

3. When you read an XML document, you traverse the complex elements in the tree-structure from the top, and read the data for each complex element.
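Points 2 and 3 can be sketched as follows. The example uses Python's standard xml.etree.ElementTree module to stand in for the general technique. It does not show the TransacXML functions, and the names add_simple, build_race, and read_race are hypothetical. The values are the ones given for the example document, with only the first Horse included.

```python
import xml.etree.ElementTree as ET


def add_simple(parent, name, text):
    """Add a simple element (text only) to a complex parent element."""
    child = ET.SubElement(parent, name)
    child.text = text
    return child


def build_race():
    """Create each complex element and attach it to its parent."""
    race = ET.Element("Race", {"Date": "2010-12-31", "Name": "New Years Meet"})

    course = ET.SubElement(race, "Course")
    add_simple(course, "CourseName", "The new track")
    add_simple(course, "Address", "Track Road 123")

    horses = ET.SubElement(race, "Horses")
    horse = ET.SubElement(horses, "Horse", {"Name": "Bonfire"})
    add_simple(horse, "Value", "5000")
    add_simple(horse, "DateOfBirth", "1988-01-02")
    add_simple(horse, "Gender", "M")
    return race


def read_race(race):
    """Traverse the complex elements from the top and read their data."""
    course = race.find("Course")
    return {
        "Name": race.get("Name"),
        "Date": race.get("Date"),
        "CourseName": course.findtext("CourseName"),
        "Address": course.findtext("Address"),
        "Horses": [
            {
                "Name": horse.get("Name"),
                "Value": horse.findtext("Value"),
                "DateOfBirth": horse.findtext("DateOfBirth"),
                "Gender": horse.findtext("Gender"),
            }
            for horse in race.find("Horses").findall("Horse")
        ],
    }


race = build_race()
print(ET.tostring(race, encoding="unicode"))
print(read_race(race))
```

Both functions follow the same tree: build_race attaches children to parents from the top down, and read_race visits the same complex elements in the same order to pick up their attributes and simple elements.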