Writing XML

This chapter describes how user-defined structure objects can be written to a data channel in XML format.

General Principles

The xmlwrite command is used to write xml data to a channel. There is no command to write the output directly to a file, so if we want to do this we must open a channel to the file first.

open mychannel,-of=myfile.xml
xmlwrite mychannel,myobject
mychannel.close

This will write the contents of the user-structure object myobject.

When Fire is running as a back-end service (aka FireRender), a common requirement is to return xml to the client who made the request to the service. This can be done by writing the xml output to the already-open return channel firerender_channel. The client is typically a browser, so before sending the xml we would invariably set the content-type of the return data, informing the browser that the data is xml:

write firerender_channel,'Content-Type: text/xml'
xmlwrite firerender_channel,myobject

By default all xml output is prefaced with the standard xml header:

<?xml version="1.0" ?>

If you wish to suppress output of this header, perhaps because you wish to control the preamble to the data by adding some leading comments into the output, this can be done by adding a -no_h switch to the xmlwrite command, but then you must manually write the header, e.g.

write mychannel,'<?xml version="1.0" ?>'
write mychannel,"<!-- This is a comment -->"
xmlwrite mychannel,myobject,-no_h

The xmlwrite command does not add an encoding attribute to the header, but you can add one yourself by using the -enc switch, e.g.

xmlwrite mychannel,myobject,-enc="iso-8859-1"

which would output the following header:

<?xml version="1.0" encoding="iso-8859-1" ?>

Before we discuss how object values are written to xml, mention should be made of the root tag. The root tag is the outermost data tag in the xml output and is controlled slightly differently from its sub-elements.

Consider the following xml output:

<?xml version="1.0" ?>
<rss>
   <title>RSS Output</title>
</rss>

In this example, the root tag name is rss.

The xmlwrite command uses the class name of the object being written as the tag name, so if we were writing an object (e.g. myobject) of class ~myatable.rss, the above output would be generated. Any atable prefix, in this case ~myatable, is dropped when deriving the tag name.
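The naming rule can be pictured outside Fire with a short Python sketch. The helper name root_tag_for is hypothetical, and treating everything up to the last '.' as the atable prefix is an assumption made for illustration, not a description of how xmlwrite is implemented.

```python
def root_tag_for(class_name: str) -> str:
    """Hypothetical helper: drop any atable prefix (assumed to end at the last '.')."""
    return class_name.rsplit(".", 1)[-1]


print(root_tag_for("~myatable.rss"))
print(root_tag_for("~myatable.rss_t"))
```

The second call shows why a class named with the conventional "_t" suffix needs its root tag set explicitly.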

However, our class name might not be the required output root tag name, e.g. our class name might be ~myatable.rss_t, which is typical since conventionally we suffix structure classes with "_t" for readability. In this case we can define a different root tag for output purposes in one of 2 ways:

Define our required root tag by a switch on the xmlwrite command, e.g.
xmlwrite mychannel,myobject,-tag="rss"
This will specify a change of tag for the output of this object only.

or, we can attach a different root tag to the class during its definition, e.g.
class ~myatable.rss_t,-tag="rss" {
   string title
}
This will ensure that the change of tag is effective for all xmlwrite commands involving objects of class ~myatable.rss_t.

We will see later how we can add xml namespace, schema and other attributes to the root tag if required, but we will first discuss data values and how they are written.

Element Data Output

Data output is pretty much what one would expect. If a Fire structure member is defined as an xml attribute then its tag and value are attached to the output of its parent. If a structure member is not defined as an xml attribute, then it is considered to be an xml element and its value is output within its own opening and closing tags.

Consider a simple Fire structure:

class ~myatable.rss_t,-tag="rss" {
   string title
   string category
   numeric death_toll[]
   xml attribute {
      string country
      numeric version[]
   }
}

This class has 5 members: 3 of which (title, category, death_toll) are treated as xml elements, and 2 of which (country, version) are treated as xml attributes when processed by the xmlwrite command. There are 2 ways of specifying that members are xml attributes and we could have written the definition like this:

class ~myatable.rss_t,-tag="rss" {
   string title
   string category
   numeric death_toll[]
   string country,-att
   numeric version[],-att
}

If we now create and populate an object of this class:

~myatable.rss_t myrss
myrss.title = 'Earthquake Report'
myrss.category = 'Disasters'
myrss.country = 'Indonesia'

and output it using the xmlwrite command:

xmlwrite mychannel,myrss

the following xml would be produced:

<?xml version="1.0" ?>
<rss country="Indonesia">
   <title>Earthquake Report</title>
   <category>Disasters</category>
</rss>

You will see that the structure members death_toll (an xml element) and version (an xml attribute) are missing from the output. They are optional xml values so have been declared in the structure as variable length arrays. Consequently, because they have not been assigned values their output is skipped.
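For readers more familiar with general-purpose XML libraries, the skipping rule can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration of the behavior described above, not Fire code: the dictionaries stand in for the myrss object, and empty lists model variable-length arrays that have not been assigned values.

```python
import xml.etree.ElementTree as ET

# Plain Python stand-ins for the myrss object (assumed layout, for illustration).
elements = {"title": ["Earthquake Report"], "category": ["Disasters"], "death_toll": []}
attributes = {"country": ["Indonesia"], "version": []}

root = ET.Element("rss")
for name, values in attributes.items():
    if values:  # an unset attribute array is skipped
        root.set(name, str(values[0]))
for name, values in elements.items():
    for value in values:  # a zero-length array produces no element at all
        ET.SubElement(root, name).text = str(value)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Neither death_toll nor version appears in the result, because the loops have nothing to emit for an empty list.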

Another way of excluding a value from xml output, but permanently, is to declare the Fire structure member as "to be ignored". This is useful if you want to have object members for Fire purposes only. In the last example we could have declared an additional member:

numeric count,-ign

This value would never be written to xml because of the -ign switch. Note: members declared as static are by default assumed to be excluded from xml output, members not declared as static are by default assumed to be included.

If a value is obligatory but has a null value, then it is considered "nil" in xml jargon and will be output as a value-less tag if marked nillable in the Fire structure. For example, in our class definition, had we defined the category member like this:

string category,-nil

and had we not populated its value as 'Disasters' in the myrss object but left it as an empty string, then its xml output would have been this:

<category/>

Only members of data type string, time and blob can have this behavior.

By default, structure members use their member name as the xml tag name, but this can be redefined within the class definition via a -tag switch on relevant members. Remember that all identifier names are considered caseless, so if you want upper case characters in the xml tag name, the -tag switch must be used. The same applies if the xml tag name contains characters that are prohibited in Fire identifier names, such as '-' or spaces, for example:

numeric death_toll[],-tag='Death Toll'

This will produce the following xml output:

<Death Toll>2307</Death Toll>

Array Values

We have just seen how variable-length array structure members can be used to represent optional xml elements. Where an array has no values set, i.e. has an array length of 0, no xml values are output.

So what happens when a structure member array has multiple values?
This depends on how the member is defined. There are 3 cases:

1. Default member definition: multiple values are output in multiple tags.

2. Member is defined as an xml attribute (-att or xml attribute { ... }): only the first value is output as a tag attribute.

3. Member is defined as an xml list (-lis): multiple values, for both xml elements and attributes, are output as a space-separated list of values.

To demonstrate these cases, consider the following structure and populated object:

class ~myatable.people_t,-tag="people" {
# XML attributes ...
   xml attribute {
      string title[]
      string gender_text[],-lis
      string marital_text[],-lis
   }
# XML elements ...
   string name[]
   numeric ages[],-lis
   numeric genders[],-lis
   numeric maritals[],-lis
}

~myatable.people_t people
people.title = 'Staff'
people.gender_text = <'male', 'female'>
people.marital_text = <'single', 'married'>
people.name = <'Andrea Calderwood', 'Vladimir Borodin', 'Dawili Gonga'>
people.ages = <22,35,31>
people.genders = <2,1,1>
people.maritals = <1,2,1>

The xml output of this object would be something like the following:

<?xml version="1.0" ?>
<people title="Staff" gender_text="male female" marital_text="single married">
   <name>Andrea Calderwood</name>
   <name>Vladimir Borodin</name>
   <name>Dawili Gonga</name>
   <ages>22 35 31</ages>
   <genders>2 1 1</genders>
   <maritals>1 2 1</maritals>
</people>

In this example we have not defined the staff names as list output (-lis) because of embedded whitespace in the values. This could confuse an xml reader which would tokenize the result, thereby mixing up the sequence of forenames and surnames.
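The difference between the default and list forms, and the whitespace problem just described, can be sketched in a few lines of Python. This is an illustration only, using plain string handling rather than anything Fire-specific.

```python
names = ["Andrea Calderwood", "Vladimir Borodin", "Dawili Gonga"]
ages = [22, 35, 31]

# Default definition: one element per array value.
name_xml = "".join(f"<name>{n}</name>" for n in names)

# List definition: a single element holding space-separated values.
ages_xml = "<ages>" + " ".join(str(a) for a in ages) + "</ages>"

# Why names are not written as a list: a reader that splits the list
# on whitespace no longer recovers the original three values.
tokens = " ".join(names).split()

print(name_xml)
print(ages_xml)
print(len(names), len(tokens))
```

Three names go in, but six tokens come out, which is the mix-up of forenames and surnames mentioned above.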

Data Coercion

When converting from Fire internal values to xml text output, most data types behave as one would expect, but there are various options, defined by more element switches, for finer tuning. These are particularly useful when dealing with numeric and blob data.

String Data

Fire string values when defined as attributes are output "as is" within double quotes, with double quote characters themselves escaped ("\"") where necessary.

However, when defined as elements (the default definition for a structure member) more characters have to be escaped to avoid clashing with xml language characters such as & < and >. If we consider a string value "<hello>" then this would be output something like this:

<whatever>&lt;hello&gt;</whatever>

You do not need to escape characters manually; the encoding is done for you automatically.
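To see what this kind of escaping looks like in isolation, here is a small Python sketch using the standard xml.sax.saxutils module. It illustrates standard xml element escaping in general and is not a description of Fire's internal implementation.

```python
from xml.sax.saxutils import escape, unescape

value = "<hello> & goodbye"
escaped = escape(value)  # replaces &, < and > with entity references
element = f"<whatever>{escaped}</whatever>"

print(element)
print(unescape(escaped) == value)
```

An xml reader reverses the substitution when it loads the element, so the original text is recovered unchanged.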

Sometimes though, for readability, you don't want to see all these escape characters in the xml, but you want to see the actual text. This can be done by specifying that a string member be output with unescaped (raw) text. A -cd switch must be used for this purpose and any member with this behavior defined has its values output with no character-escaping, although the xml output gets marked appropriately for subsequent reading by xml readers.

This CDATA form of output is particularly common for blocks of text. Consider a string structure member which contains HTML output, with lots of < and > characters. It would be defined like this:

class ~myatable.html_t,-tag="html" {
   string code,-cd
}

and perhaps populated like this:

~myatable.html_t myhtml
myhtml.code = "<html><body>\n<p>This site is under construction</p>\n</body></html>"

When this gets written to xml, the string value of myhtml.code would produce the following output:

<html>
   <code><![CDATA[<html><body>
<p>This site is under construction</p>
</body></html>]]></code>
</html>

with the sequences <![CDATA[ and ]]> enclosing the raw text. To reiterate, these sequences are added automatically for you so you do not have to worry about coding them manually.
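As a rough illustration of what CDATA wrapping involves, the Python sketch below wraps a string and reads it back with the standard ElementTree parser. The helper name cdata is hypothetical. The handling of an embedded "]]>" terminator is the usual general-purpose technique; the source does not say how Fire treats that case.

```python
import xml.etree.ElementTree as ET


def cdata(text: str) -> str:
    """Hypothetical helper: wrap text in CDATA, splitting any embedded ']]>'."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


html = "<html><body>\n<p>This site is under construction</p>\n</body></html>"
wrapped = "<code>" + cdata(html) + "</code>"

# A standard parser returns the raw text with no entity decoding needed.
recovered = ET.fromstring(wrapped).text
print(recovered == html)
```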

Blob Data

Blob data falls into one of 2 categories: text and binary.

Text blobs are treated the same way as strings defined with the -cd switch to make the output more readable.

In contrast, binary blobs must be encoded into printable characters prior to output, because xml can contain only printable characters. You have the option of encoding binary blob data as hexadecimal characters or as base-64. Base-64 encoding is more efficient as it produces fewer characters. Each output format is selected by a member definition switch (-hex or -b64).

As an example, we will take a structure containing 3 blobs, all with the same data. This data will then be output using the 3 definable output formats: raw CDATA text, hexadecimal and base64-encoded. Typically you would not use either of the last 2 formats for text blobs.

class ~myatable.blobs_t,-tag="mydata" {
   blob text_data
   blob hex_data,-hex
   blob b64_data,-b64
}

Create a blob and a structure object, populating its 3 structure members with the same data:

blob b = textblob("<html><body>\n<p>This site is under construction</p>\n</body></html>")
~myatable.blobs_t mydata
mydata.text_data = b
mydata.hex_data = b
mydata.b64_data = b

Writing this to xml, we get this:

<mydata>
   <text_data><![CDATA[<html><body>
<p>This site is under construction</p>
</body></html>]]></text_data>
   <hex_data>3C68746D6C3E3C626F64793E0A3C703E54686973207369746520697320756E64657220636F6E737472756374696F6E3C2F703E0A3C2F626F64793E3C2F68746D6C3E</hex_data>
   <b64_data>PGh0bWw+PGJvZHk+CjxwPlRoaXMgc2l0ZSBpcyB1bmRlciBjb25zdHJ1Y3Rpb248L3A+CjwvYm9keT48L2h0bWw+</b64_data>
</mydata>
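The two binary encodings can be reproduced outside Fire with the Python standard library, which is a convenient way to check output or to decode it in another application. This sketch assumes the blob holds the same 66 bytes as the example above.

```python
import base64
import binascii

data = b"<html><body>\n<p>This site is under construction</p>\n</body></html>"

hex_text = binascii.hexlify(data).decode("ascii").upper()
b64_text = base64.b64encode(data).decode("ascii")

# Size comparison: hexadecimal doubles the byte count,
# base-64 grows it by roughly a third.
print(len(data), len(hex_text), len(b64_text))

# Both encodings decode back to the original bytes.
print(binascii.unhexlify(hex_text) == data, base64.b64decode(b64_text) == data)
```

The lengths printed make the efficiency claim concrete: the base-64 form is two thirds the size of the hexadecimal form.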

Numeric Data

Within Fire, no distinction is made between different forms of numeric data (e.g. decimal, integer, boolean); they are all held internally simply as numeric.

When it comes to xml output, numerics can be treated as decimal (floating point), integer, boolean or custom-formatted. The default output is decimal, using up to 8 decimal places where necessary to represent the value.

The most common variations are integer and boolean, where the values get rounded before output. These output forms are specified by -int or -boo at member definition time.
For more complex output you can define your own format by supplying a C-printf type format string.

Consider a structure with 4 different numeric members:

class ~myatable.numbers_t,-tag="numbers" {
   numeric d
   numeric i,-int
   numeric b,-boo,-att; # An XML attribute
   numeric f,-fmt='%012.6f'
}

and a populated object of this class:

~myatable.numbers_t nums
nums.d = 579.3332
nums.i = 472.3
nums.b = true
nums.f = 35.719

When written to xml, the result would be as follows:

<numbers b="1">
   <d>579.3332</d>
   <i>472</i>
   <f>00035.719000</f>
</numbers>
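Since the custom format is a C-printf style string, its effect can be previewed in any language with printf-style formatting. The Python sketch below reproduces the four values above. The default_decimal helper is hypothetical and encodes an assumed reading of the default rule (up to 8 decimal places, trailing zeros trimmed). Python's own rounding is used for the integer case, which may differ from Fire's for values ending in exactly .5.

```python
def default_decimal(value: float) -> str:
    """Hypothetical helper: up to 8 decimal places, trailing zeros trimmed."""
    return f"{value:.8f}".rstrip("0").rstrip(".")


flag = True

d_text = default_decimal(579.3332)  # default decimal output
i_text = str(round(472.3))          # integer output, rounded
b_text = "1" if flag else "0"       # boolean output ("0" for false is an assumption)
f_text = "%012.6f" % 35.719         # custom printf-style format

print(d_text, i_text, b_text, f_text)
```

The format %012.6f asks for 6 decimal places in a field 12 characters wide, padded on the left with zeros.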

Time Data

By default, Fire stores time/date values with both a time and a date component, but XML has the facility of storing date-only or time-only values. The 3 xml datatypes supported are date, time and dateTime. The switches -tdo and -tto are therefore available to ensure that the correct granularity for such data is maintained during input and output.

Other xml time and date data types are not currently supported so must be represented as strings.

The text form of times and dates accepted by xml parsers is not especially readable, e.g. 2002-10-10T12:00:00-05:00 represents noon on 10 October 2002, Eastern Standard Time in the U.S.
The full specification can be found in the W3C XML Schema datatypes recommendation.

To demonstrate, let us create a structure with 3 different time members:

class ~myatable.times_t,-tag="times" {
   time tfull;      # XML datatype: dateTime
   time tdate,-tdo; # XML datatype: date
   time ttime,-tto; # XML datatype: time
}

and a populated object of this class, with the same date/time value for all class members:

~myatable.times_t times
times.tfull = '17-Mar-2007, 11:00'
times.tdate = '17-Mar-2007, 11:00'
times.ttime = '17-Mar-2007, 11:00'

When written to xml, the result would be as follows:

<times>
   <tfull>2007-03-17T11:00:00Z</tfull>
   <tdate>2007-03-17Z</tdate>
   <ttime>11:00:00Z</ttime>
</times>
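The three text forms can be reproduced with Python's standard datetime module, which may help when checking Fire output from another application. The sketch assumes the value is in UTC, as the trailing Z in the output above suggests.

```python
from datetime import datetime, timezone

stamp = datetime(2007, 3, 17, 11, 0, 0, tzinfo=timezone.utc)

date_time_text = stamp.strftime("%Y-%m-%dT%H:%M:%SZ")  # xml datatype: dateTime
date_text = stamp.strftime("%Y-%m-%dZ")                # xml datatype: date
time_text = stamp.strftime("%H:%M:%SZ")                # xml datatype: time

print(date_time_text, date_text, time_text)
```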

Note that when reading time values from xml into Fire, a variety of text formats are acceptable, but because xml files typically are read by a variety of 3rd party applications Fire only writes time values to xml in the official time/date format as above.

Point Data

Fire stores point values internally in 3-D (x, y and z). Although xml schemas do not have an inbuilt point data type, points are often available as decimal lists, with most xml applications supplying and accepting x,y ordinates only.

To accommodate 2-D points, a point member of a structure can be marked at definition time as 2-D only by using a -p2d switch.
With this switch, 2 ordinates are output space-separated, e.g. "10 20".
Without this switch, all 3 ordinates are output to xml within parentheses, e.g. "(10,20,30)".

Consider this:

class ~myatable.points_t,-tag="points" {
   point p3
   point p2,-p2d
   point p2array[],-p2d,-lis
}

and a populated object of the class:

~myatable.points_t mypts
mypts.p3 = (10,20,30)
mypts.p2 = (10,20)
mypts.p2array = <(10,20), (12,20), (15,30), (0,30)>

When written to xml, the result would be as follows:

<points>
   <p3>(10,20,30)</p3>
   <p2>10 20</p2>
   <p2array>10 20 12 20 15 30 0 30</p2array>
</points>
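An application reading the p2array value has to pair the ordinates up again. The Python sketch below shows the flattening and the reverse step; it is an illustration of the space-separated layout only and uses no Fire-specific code.

```python
p2array = [(10, 20), (12, 20), (15, 30), (0, 30)]

# Flatten: x and y ordinates of each point, all space-separated.
flat = " ".join(f"{x:g} {y:g}" for x, y in p2array)

# Read back: take the tokens two at a time to rebuild the points.
tokens = [float(t) for t in flat.split()]
points = list(zip(tokens[0::2], tokens[1::2]))

print(flat)
print(points)
```

Because every 2-D point contributes exactly two tokens, the list form is unambiguous here, unlike the names example in the Array Values section.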

Structure Hierarchies

When Fire structures contain other Fire classes, rather than simple data types, the output of such classes to xml behaves as one would expect, recursing down the hierarchy.

Consider a structure describing a web server and executable services within it:

class ~myatable.service_t {
   string scheme
   string context
   string password[]
}

class ~myatable.webserver {
   xml attribute {
      string host
      numeric port
   }
   ~myatable.service_t service[]
}

We could populate it like this:

~myatable.webserver wsvr
wsvr.host = 'apollo'
wsvr.port = 80
wsvr.service[1:2].scheme = <'http', 'https'>
wsvr.service[1:2].context = <'images.php', 'documents.php'>
wsvr.service[2].password = 'secret'

Then output it to xml like this:

open mychannel,-of=wsvr.xml
xmlwrite mychannel,wsvr
mychannel.close

The resulting xml would look like this:

<?xml version="1.0" ?>
<webserver host="apollo" port="80">
   <service>
      <scheme>http</scheme>
      <context>images.php</context>
   </service>
   <service>
      <scheme>https</scheme>
      <context>documents.php</context>
      <password>secret</password>
   </service>
</webserver>

Complex elements in xml map to Fire structures; service in the above xml output is an example. Such elements can themselves have a text value, as well as having attributes and sub-elements with values.

Fire does not have this concept, but an additional member can be added to a Fire structure to do the same job. This additional member is known as a "parent text value" and is indicated by a -ptv switch. Consider adding such a member to the above definition of service_t.

class ~myatable.service_t {
   string tag_value,-ptv
   string scheme
   string context
   string password[]
}

We can then give this member a value in both the sub-structures of our object:

wsvr.service[1:2].tag_value = <'Image Service','Document Service'>

The xml output would now look like this:

<?xml version="1.0" ?>
<webserver host="apollo" port="80">
   <service>Image Service
      <scheme>http</scheme>
      <context>images.php</context>
   </service>
   <service>Document Service
      <scheme>https</scheme>
      <context>documents.php</context>
      <password>secret</password>
   </service>
</webserver>
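To show how a consuming application might see this document, here is a Python sketch that parses the output above with the standard ElementTree module. It is an illustration of the resulting xml structure, not part of Fire. The parent text value arrives as the text of each service element, followed by whitespace that needs trimming.

```python
import xml.etree.ElementTree as ET

document = """<webserver host="apollo" port="80">
   <service>Image Service
      <scheme>http</scheme>
      <context>images.php</context>
   </service>
   <service>Document Service
      <scheme>https</scheme>
      <context>documents.php</context>
      <password>secret</password>
   </service>
</webserver>"""

root = ET.fromstring(document)
services = [
    {
        "value": svc.text.strip(),             # the parent text value
        "scheme": svc.findtext("scheme"),
        "password": svc.findtext("password"),  # None when the optional element is absent
    }
    for svc in root.findall("service")
]

print(root.get("host"), root.get("port"))
print(services)
```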

An alternative to the "parent text value" scenario is to specify that our service class inherits a string value, e.g.

class ~myatable.service_t,string {
   string scheme
   string context
   string password[]
}

and we would set the values slightly differently:

wsvr.service[1] = 'Image Service'
wsvr.service[2] = 'Document Service'

The xml output would be the same, whether or not we use the "parent text value" method.

Adding Schema Information

In the examples so far, none of the objects were associated with an xml schema, but it is a common requirement for data in an xml document to conform to a data schema, which defines rules for element order and value constraints. Conforming to a schema enables an xml parser to check for data validity.

Publishers of xml schemas usually supply both a readable text document (a namespace) and the schema definition itself (an xml file with the extension .xsd). XML documents referring to such schemas have 2 attributes attached to their root tag whose values are web addresses (aka urls) pointing to the namespace document and the .xsd schema definition.

A Fire structure whose output will conform to a schema can have both the associated urls (namespace and xsd), and a namespace prefix for use within the output xml data, defined by means of switches, e.g.

class servicestype_t,-tag='services',-ns='http://www.xmarc.net/',\
     -xsd='http://www.xmarc.net/services.xsd',-pre='xmarc'

The appropriate attributes are then added to the root tag by the xmlwrite command whenever an xml output document is produced for an object of class servicestype_t. An example of such a root tag is:

<xmarc:services xmlns:xmarc="http://www.xmarc.net/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://www.xmarc.net/ http://www.xmarc.net/services.xsd">
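The Python sketch below shows how a standard parser interprets such a root tag: the prefix is resolved to the namespace url, and the schemaLocation attribute holds the namespace and xsd urls as a space-separated pair. The root element is written here as an empty element purely to keep the example short.

```python
import xml.etree.ElementTree as ET

document = (
    '<xmarc:services xmlns:xmarc="http://www.xmarc.net/" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://www.xmarc.net/ http://www.xmarc.net/services.xsd"/>'
)

XSI = "http://www.w3.org/2001/XMLSchema-instance"

root = ET.fromstring(document)
namespace, xsd_url = root.get(f"{{{XSI}}}schemaLocation").split()

print(root.tag)  # the prefix is replaced by the namespace url in braces
print(namespace, xsd_url)
```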

If preferred, the namespace, prefix and schema can be specified (or overridden) with the xmlwrite command as an alternative to defining them for the class, e.g.

xmlwrite mychannel,myobj,-ns='http://www.xmarc.net/',\
     -xsd='http://www.xmarc.net/services.xsd',-pre='xmarc'

Typically you will not have to specify these attributes unless you are designing and writing your own xml schemas. The xsdread command is available to create Fire structures automatically from xml schemas, and it takes care of attaching the correct namespace and xsd information to the Fire structure class definitions.

Schema Creation

If you want to create your own xml schemas, there is a utility command xsdwrite to help you do this. This command is not intended for production purposes but gets you started with schemas if you are unfamiliar with them (or even if you are not).

Consider the Fire structures we defined in an earlier example:

class ~myatable.service_t,string {
   string scheme
   string context
   string password[]
}

class ~myatable.webserver {
   xml attribute {
      string host
      numeric port
   }
   ~myatable.service_t service[]
}

If we run the xsdwrite command on the outer class,

open mychannel,-of=webserver.xsd
xsdwrite mychannel,~myatable.webserver,-tag="webserver"
mychannel.close

we get the following output:

<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <xsd:element name="webserver" type="webserver" />
   <xsd:complexType name="service_t" mixed="true">
      <xsd:sequence>
         <xsd:element name="scheme" type="xsd:string" />
         <xsd:element name="context" type="xsd:string" />
         <xsd:element name="password" type="xsd:string" minOccurs="0" maxOccurs="unbounded" />
      </xsd:sequence>
   </xsd:complexType>
   <xsd:complexType name="webserver">
      <xsd:sequence>
         <xsd:element name="service" type="service_t" minOccurs="0" maxOccurs="unbounded" />
      </xsd:sequence>
      <xsd:attribute name="host" type="xsd:string" use="required" />
      <xsd:attribute name="port" type="xsd:decimal" use="required" />
   </xsd:complexType>
</xsd:schema>

You can specify the namespace and schema urls if you wish, and give a namespace prefix as well, all with various switches. Consult the xsdwrite command reference for more information.
