GDL XML Extension


This extension allows reading, writing and editing XML files.
Opening a file from the loaded library for reading is possible from all scripts, but writing is only enabled in the parameter, user interface and property scripts.
Available options for files outside of the loaded libraries:

  • Writing from 2D and 3D scripts is enabled
  • Reading data from external files is regarded as a non-deterministic action, see Background Conversion.

The extension implements a subset of the Document Object Model (DOM) interface.
An XML file is a text file that uses tags to structure data into a hierarchical system, similar to HTML.
An XML document can be modeled by a hierarchical tree structure whose nodes contain the data of the document.
The following node types are known by the extension:

  • Element: the part of the document between a start-tag and an end-tag,
    or a single empty-element tag. Elements have a name, may have attributes, and usually (but not necessarily) have content,
    which means that element type nodes can have child nodes. Attributes are held in an attribute list where each attribute has a unique name and a text value.
  • Text: a character sequence. It cannot have child nodes.
  • Comment: text between the comment delimiters: <!-- the comment itself --> .
    In the text of the comment each ‘-’ character must be followed by a character different from ‘-’.
    It also means that the following is illegal: <!-- comment ---> . Comment type nodes cannot have child nodes.
  • CDATASection: text between the CDATA section delimiters: <![CDATA[ the text itself ]]> .
    In a CDATA section characters that have special meaning in an XML document need not (and must not) be escaped.
    The only markup recognized is the closing “]]>”. CDATA section nodes cannot have child nodes.
  • Entity-reference: reference to a predefined entity.
    Such a node can have a read-only subtree and this subtree gives the value of the referenced entity.
    When the document is parsed, entity references can optionally be translated into text nodes (see the ‘e’ open-mode flag).

The top level must contain exactly one element type node (the root), and may also contain several comment type nodes.
The document type node of the DOM interface is not available through the extension’s interface.

For each node in the tree there is a name and a value string associated whose meanings depend on the type of the node:

Node type          Name                             Value
Element            name of the tag                  “” (empty string)
Text               “#text”                          the text content of the node
Comment            “#comment”                       the text content of the node
CDATASection       “#cdata-section”                 the text content of the node
Entity-reference   name of the referenced entity    “” (empty string)

Each node type also has a type keyword, which is used in instruction parameters and return values:

Element: ELEM
Text: TXT
Comment: CMT
CDATA section: CDATA
Entity reference: EREF

The success or error code of an OPEN, INPUT or OUTPUT command can be retrieved
by the GetLastError instruction of the INPUT command.

Opening an XML Document

channel = OPEN (filter, filename, parameter_string)

filter: file extension. This should be ‘XML’.

filename: name and path of the file to open (or create), or an identifier name if the file is opened through a dialog box and the
file’s location is given by the user.

parameter_string: a sequence of character flags that determine the open-mode:
'r': open in read-only mode. In general only the INPUT command can be used.
'e': entity references are not translated into text nodes in the tree.
Without this flag there are no entity-references in the document structure.
'v': validity check is performed during reading in and writing out.
If a DTD exists in the document, the document’s structure must agree with it.
Without this flag a well-structured but invalid document can be read in and written out without error message.
'n': create a new file. If the file exists, the open will fail.
(After the OPEN the CreateDocument instruction must be the first to execute.)
'w': overwrite file with empty document if it exists. If it doesn’t exist, a new file will be created.
(After the OPEN the CreateDocument instruction must be the first to execute.)
'd': the file is obtained from the user in a dialog box.
In later runs it will be associated with the identifier given in the filename parameter of the OPEN command.
(If the identifier is already associated to a file, the dialog box will not be opened to the user.)
'f': the filename parameter contains a full path.
'l': the file is in the loaded library parts. Opening a data file from the loaded library for reading is possible from all scripts, but writing is only enabled in the parameter, user interface and property scripts.
channel: used to identify the connection in subsequent I/O commands.

If you want to open an existing XML file for modification,
then none of the ‘r’, ‘n’ and ‘w’ flags may be set in the parameter string.
At most one of the ‘d’, ‘f’ and ‘l’ flags should be set.
If none of these flags is set then filename is considered to be a path relative to the user’s documents folder.
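
The open-mode flags above can be illustrated with a minimal sketch. The file name catalog.xml and the variable names are hypothetical, and the sketch assumes that the error code returned by GetLastError is numeric, with 0 meaning success as listed in the error code table.

```gdl
! Read-only open; no 'd', 'f' or 'l' flag, so the name is taken
! relative to the user's documents folder (hypothetical file name)
ch = OPEN ("XML", "catalog.xml", "r")

! Ask the extension whether the OPEN succeeded
errCode = 0
errText = ""
n = INPUT (ch, "GetLastError", "", errCode, errText)

IF errCode # 0 THEN
    PRINT "Cannot open catalog.xml: ", errText
ELSE
    ! ... read the document here ...
    CLOSE ch
ENDIF
```

Replacing "r" with "rl" would look for the same file in the loaded libraries instead.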

Reading an XML Document


DOM is an object-oriented model that cannot be adapted to a BASIC-like language like GDL directly.
To represent the nodes in the hierarchy tree we define position descriptors.
When we want to walk through the nodes of the tree, first we have to request a new position descriptor from the extension.
Initially a new descriptor points to the root element.
The descriptor is in fact a 32-bit identification number whose value is of no interest to the GDL script.
The position it refers to can be changed as we move from one node in the tree to another.

r = INPUT (ch, recordID, fieldID, var1, var2, ...)

ch: channel returned by the OPEN command.
recordID: instruction name plus parameters.
fieldID: usually a position descriptor.
var1, var2, ...: optional list of variables receiving returned data.
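
As a rough sketch of the descriptor life cycle, the following requests a position descriptor, reads the data of the node it refers to, and hands the descriptor back. It uses the NewPositionDesc, GetNodeData and ReturnPositionDesc instructions described below; the file name and variable names are hypothetical, and an empty string is passed wherever fieldID is ignored.

```gdl
ch = OPEN ("XML", "catalog.xml", "r")    ! hypothetical file

! Request a descriptor; it initially refers to the root element
posDesc = 0
n = INPUT (ch, "NewPositionDesc", "", posDesc)

! Read name, value and type keyword of the node it refers to
nodeName = ""
nodeVal = ""
nodeType = ""
n = INPUT (ch, "GetNodeData", posDesc, nodeName, nodeVal, nodeType)
PRINT "Root element: ", nodeName, " (type ", nodeType, ")"

! Give the descriptor back when it is no longer used (var1 is ignored)
dummy = 0
n = INPUT (ch, "ReturnPositionDesc", posDesc, dummy)

CLOSE ch
```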

INPUT instructions:

  • GetLastError: retrieve the result of the last operation

    recordID: “GetLastError”

    fieldID: ignored

    return values:

    var1: error code / ok

    var2: the explanation text of error / ok

  • NewPositionDesc: request for a new position descriptor

    recordID: “NewPositionDesc”

    fieldID: ignored

    return value: var1: the new position descriptor (initially refers to the root)

  • CopyPositionDesc: request for a new position descriptor whose starting node is taken from another descriptor.

    recordID: “CopyPositionDesc”

    fieldID: an existing position descriptor

    return value: var1: the new position descriptor (initially refers to where the descriptor given in fieldID refers to)

  • ReturnPositionDesc: release a position descriptor that is no longer needed.

    recordID: “ReturnPositionDesc”

    fieldID: the position descriptor

    var1: ignored

    Call this instruction when a position descriptor received from the NewPositionDesc or CopyPositionDesc instructions is no longer used.

  • MoveToNode: change the position of a descriptor (and retrieve the data of the new node).

    This instruction can be used for navigating in the tree hierarchy.

    recordID: “MoveToNode searchmode nodename nodetype nodenumber”

    fieldID: position descriptor

    searchmode (or movemode): Path, or one of the move-modes or search-modes listed below.

    Path: the nodename parameter must contain a path that determines an element or entity reference node in the XML document. Use this movemode to specify an exact path; after the Path keyword only the required path should be present.

    The path is relative to the node given in fieldID. The delimiter is the ‘:’ character (which is otherwise an accepted character in an element’s name, so paths cannot address elements whose names contain it). The ‘..’ string in the path means a step to the parent node. The starting node can be something other than an element or entity reference node, in which case the path must begin with ‘..’ to step back. If there are several element nodes on the same level with the same name, the first one is chosen.

    Move-modes:

    • ToParent: moves to the parent of the node given in fieldID.
    • ToNextSibling: moves to the next node on the same level.
    • ToPrevSibling: moves to the previous node on the same level.
    • ToFirstChild: moves to the first child of the fieldID node.
    • ToLastChild: moves to the last child of the fieldID node.

    Search-modes:

    • FromNextSibling: searching starts from the next node on the same level and moves forward.
    • FromPrevSibling: searching starts from the previous node on the same level and moves backward.
    • FromFirstChild: searching starts from the first child of the fieldID node and moves forward.
    • FromLastChild: searching starts from the last child of the fieldID node and moves backward.

    nodename: the searching considers those nodes only whose name or value matches nodename. The * and ? characters in nodename are considered as wildcard characters. For element and entity reference type nodes the name is compared, while for text, comment and CDATA section nodes the value is compared. Default value: *

    nodetype: the searching considers those nodes only whose type is allowed by nodetype. The * means all types are allowed. Otherwise the type keywords can be combined with the + character to form the nodetype (it must be one word without spaces, like TXT+CDATA.) The default value is *

    nodenumber: if there are several matching nodes, this gives the number of the searched node in the sequence of matching nodes. (Starts from 1) Default value: 1

    return values:

    var1: name of the node

    var2: value of the node

    var3: type keyword of the node

    Example:

    We want to move backwards on the same level to the 2nd node that is an element or an entity reference and whose name starts with K:

    r = INPUT (ch, "MoveToNode FromPrevSibling K* ELEM+EREF 2", posDesc, name, val, type)

     

  • GetNodeData: retrieve the data of a given node.

    recordID: “GetNodeData”

    fieldID: the position descriptor

    return values:

    var1: name of the node

    var2: value of the node

    var3: type keyword of the node

  • NumberofChildNodes: gives the number of child nodes of a given node

    recordID: “NumberofChildNodes nodetype nodename”

    The following optional parameters can narrow the set of child nodes considered:

    nodetype: allowed node types as defined in the MoveToNode instruction

    nodename: allowed node names or values as defined in the MoveToNode instruction

    fieldID: position descriptor

    return values:

    var1: number of child nodes

  • NumberofAttributes: returns the number of attributes of an element node.

    recordID: “NumberofAttributes attrname”

    attrname: if present, only those attributes are counted whose names (not values)
    match attrname. In attrname the * and ? characters are considered wildcard characters.

    fieldID: position descriptor (must refer to an element node)

    return values:

    var1: number of attributes

  • GetAttribute: return the data of an attribute of an element node

    recordID: “GetAttribute attrname attrnumber”

    fieldID: position descriptor (must refer to an element node)

    optional parameters:

    attrname: give the name of the attribute. The * and ? are considered wildcard characters. Default value: *

    attrnumber: if several attributes match attrname, attrnumber chooses the attribute in the sequence of matching attributes.
    (Counting starts from 1.) Default value: 1

    return values:

    var1: value of the attribute

    var2: name of the attribute

  • Validate: check the validity of the document.

    The validity is not checked during a document modification instruction.
    It is checked during writing back the file to disk if the ‘v’ flag was set in the open-mode string.
    A validity check can be forced at any time with the Validate instruction,
    but it can consume a considerable amount of time and memory, so it is not advisable to do so after every modification.

    recordID: “Validate”

    fieldID: ignored

    var1: ignored
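
The INPUT instructions above can be combined to walk over the child elements of the root. The following is a minimal sketch under stated assumptions: the file name, the element name Item and the attribute name id are hypothetical, and the sample content is shown in the comments.

```gdl
! Assumed content of catalog.xml (hypothetical):
!   <Catalog>
!     <Item id="A1">Oak door</Item>
!     <Item id="B2">Pine door</Item>
!   </Catalog>

ch = OPEN ("XML", "catalog.xml", "r")

posDesc = 0
n = INPUT (ch, "NewPositionDesc", "", posDesc)    ! refers to <Catalog>

! Count only the element type children named Item
itemCount = 0
n = INPUT (ch, "NumberofChildNodes ELEM Item", posDesc, itemCount)

nodeName = ""
nodeVal = ""
nodeType = ""
attrVal = ""
attrName = ""

IF itemCount > 0 THEN
    FOR i = 1 TO itemCount
        IF i = 1 THEN
            ! First match among the children of the root
            n = INPUT (ch, "MoveToNode FromFirstChild Item ELEM 1", posDesc, nodeName, nodeVal, nodeType)
        ELSE
            ! Next match after the current node, on the same level
            n = INPUT (ch, "MoveToNode FromNextSibling Item ELEM 1", posDesc, nodeName, nodeVal, nodeType)
        ENDIF

        ! GetAttribute returns the value first, then the name
        n = INPUT (ch, "GetAttribute id", posDesc, attrVal, attrName)
        PRINT nodeName, ": ", attrName, " = ", attrVal
    NEXT i
ENDIF

dummy = 0
n = INPUT (ch, "ReturnPositionDesc", posDesc, dummy)
CLOSE ch
```

The search-modes are used instead of ToFirstChild and ToNextSibling so that text nodes between the elements (for example line breaks) are skipped.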

Modifying an XML Document

OUTPUT ch, recordID, fieldID, var1, var2, ...

ch: channel returned by the OPEN command.
recordID: instruction name plus parameters.
fieldID: usually a position descriptor.
var1, var2, ...: additional input data.
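
A minimal sketch of the OUTPUT syntax, changing one attribute in an existing file. The file name, element name and attribute are hypothetical, and the sketch assumes that an empty parameter string is accepted by OPEN when none of the ‘r’, ‘n’ and ‘w’ flags is wanted.

```gdl
! Assumed content of settings.xml (hypothetical):
!   <Settings><Wall height="2.50"/></Settings>

! No 'r', 'n' or 'w' flag: open the existing file for modification
ch = OPEN ("XML", "settings.xml", "")

posDesc = 0
n = INPUT (ch, "NewPositionDesc", "", posDesc)    ! refers to <Settings>

! Move the descriptor to the first <Wall> child element
nodeName = ""
nodeVal = ""
nodeType = ""
n = INPUT (ch, "MoveToNode FromFirstChild Wall ELEM 1", posDesc, nodeName, nodeVal, nodeType)

! Change the attribute (it is created if it does not exist yet)
OUTPUT ch, "SetAttribute", posDesc, "height", "2.70"

! Write the modified document back to the file
OUTPUT ch, "Flush", "", ""

dummy = 0
n = INPUT (ch, "ReturnPositionDesc", posDesc, dummy)
CLOSE ch
```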

OUTPUT instructions:

Most of the OUTPUT instructions are invalid for files opened in read-only mode.

The ChangeFileName instruction is an exception: it can be called even if the file was opened in read-only mode.
In this case, after the execution the document loses the read-only attribute, so it can be modified and saved to the new file location.

  • CreateDocument:

    recordID: “CreateDocument”

    fieldID: ignored

    var1: name of the document. This will be the tagname of the root element, as well.

    CreateDocument is allowed only if the file was opened in new-file or overwrite mode.
    In these modes this instruction must be the first to be executed in order to create the XML document.

  • NewElement: insert a new element type node in the document

    recordID: “NewElement insertpos”

    fieldID: a position descriptor relative to which the new node is inserted

    var1: name of the new element (element tag-name)

    insertpos can be:

    AsNextSibling: new element is inserted after the position given in fieldID

    AsPrevSibling: new element is inserted before the position given in fieldID

    AsFirstChild: new element is inserted as the first child of the node given in fieldID (which must be an element node)

    AsLastChild: new element is inserted as the last child of the node given in fieldID (which must be an element node)

  • NewText: insert a new text node in the document

    recordID: “NewText insertpos”

    fieldID: position descriptor

    var1: text to be inserted

    See NewElement for the possible insertpos values.

  • NewComment: insert a new comment node in the document

    recordID: “NewComment insertpos”

    fieldID: position descriptor

    var1: text of the comment to be inserted

    See NewElement for the possible insertpos values.

  • NewCDATASection: insert a new CDATA section node in the document

    recordID: “NewCDATASection insertpos”

    fieldID: position descriptor

    var1: text of the CDATA section to be inserted

    See NewElement for the possible insertpos values.

  • Copy: make a copy of a subtree of the document under some node

    recordID: “Copy insertpos”

    fieldID: position descriptor relative to which the subtree is inserted

    var1: position descriptor giving the node of the subtree to be copied

    insertpos: same as in the NewElement

    The copied subtree remains unchanged. Position descriptors pointing to a certain node in the copied subtree will point to the same node after the copy.

  • Move: move a subtree of the document to another location

    recordID: “Move insertpos”

    fieldID: position descriptor relative to which the subtree is inserted

    var1: position descriptor giving the node of the subtree to be moved

    insertpos: same as in the NewElement

    The original subtree is deleted. Position descriptors pointing to some node in the moved subtree will point to the same node
    in the new position of the subtree.

  • Delete: delete a node and its subtree from the document

    recordID: “Delete”

    fieldID: position descriptor giving the node to delete

    var1: ignored

    All position descriptors pointing to some node in the deleted subtree become invalid.

  • SetNodeValue: change the value of a node

    recordID: “SetNodeValue”

    fieldID: position descriptor, it must refer to either a text, a comment or a CDATA section type node

    var1: new text value of the node

  • SetAttribute: change an attribute of an element node or create a new one

    recordID: “SetAttribute”

    fieldID: position descriptor, it must refer to an element type node

    var1: name of the attribute

    var2: text value of the attribute

    If the element already has an attribute with this name then its value is changed, otherwise a new attribute is added to the element’s list of attributes.

  • RemoveAttribute: removes an attribute of an element node

    recordID: “RemoveAttribute”

    fieldID: position descriptor, it must refer to an element type node

    var1: name of the attribute to remove

  • Flush: write the current document back to file

    recordID: “Flush”

    fieldID: ignored

    var1: ignored

    If the file was opened in validate mode, then only a valid document is saved.

  • ChangeFileName: associate another file with the current document

    recordID: “ChangeFileName”

    fieldID: new file path

    var1: specifies how fieldID should be interpreted.
    If var1 is an empty string, fieldID contains a path relative to the user’s documents folder.
    ‘d’ means the file’s location is obtained from the user from a file dialog box (see open-mode flags in the section called “Opening an XML Document”).
    ‘l’ means the file is taken from the loaded libraries.
    ‘f’ means fieldID contains a full path.
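
The OUTPUT instructions above can be combined to build a new document from scratch. This is a sketch only: the file, element and attribute names are hypothetical, and because the text does not say whether NewElement moves the descriptor, the sketch moves it explicitly with MoveToNode.

```gdl
! 'w': create the file, or overwrite it with an empty document
ch = OPEN ("XML", "settings.xml", "w")

! Must be the first instruction in new-file or overwrite mode;
! "Settings" becomes the tag name of the root element
OUTPUT ch, "CreateDocument", "", "Settings"

posDesc = 0
n = INPUT (ch, "NewPositionDesc", "", posDesc)    ! refers to <Settings>

! Append <Wall> under the root, then step onto the new element
OUTPUT ch, "NewElement AsLastChild", posDesc, "Wall"
nodeName = ""
nodeVal = ""
nodeType = ""
n = INPUT (ch, "MoveToNode ToLastChild", posDesc, nodeName, nodeVal, nodeType)

! Give the new element an attribute and some text content
OUTPUT ch, "SetAttribute", posDesc, "height", "2.70"
OUTPUT ch, "NewText AsLastChild", posDesc, "Exterior wall"

! Write the document to the file
OUTPUT ch, "Flush", "", ""

dummy = 0
n = INPUT (ch, "ReturnPositionDesc", posDesc, dummy)
CLOSE ch
```

Under these assumptions the resulting file would hold a Settings root with a single Wall element carrying the height attribute and the text. Note that writing to a file in the loaded library is only enabled in the parameter, user interface and property scripts.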

Table 12.9. Error codes and messages

0 “Ok”
-1 “Add-on Initialization Failed”
-2 “Not Enough Memory”
-3 “Wrong Parameter String”
-4 “File Dialog Error”
-5 “File Does Not Exist”
-6 “XML Parse Error”
-7 “File Operation Error”
-8 “File Already Exists”
-9 “This channel is not open”
-10 “Syntax Error”
-11 “Open Error”
-12 “Invalid Position Descriptor”
-13 “Invalid Node Type for this Operation”
-14 “No Such Node Found”
-15 “Internal Error”
-16 “Parameter Error”
-17 “No Such Attribute Found”
-18 “Invalid XML Document”
-19 “Unhandled Exception”
-20 “Read-Only Document”
-21 “CreateDocument Not Allowed”
-22 “Document Creation Failed”
-23 “Setting NodeValue Failed”
-24 “Move Not Allowed”
-25 “Delete Not Allowed”
-26 “SetAttribute Not Allowed”
-27 “Format File Error”
-28 “Insertion (or Copy) Not Allowed”
-29 “Node Creation Failed”
-30 “Bad String”
-31 “Invalid Name”
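
The codes in the table can be tested after any OPEN, INPUT or OUTPUT call. The following sketch separates the expected "No Such Node Found" case (-14) from other failures; the file and element names are hypothetical, and it assumes that GetLastError returns the code as a number in var1.

```gdl
ch = OPEN ("XML", "catalog.xml", "r")    ! hypothetical file

posDesc = 0
n = INPUT (ch, "NewPositionDesc", "", posDesc)

! Try to find a <Door> element among the children of the root
nodeName = ""
nodeVal = ""
nodeType = ""
n = INPUT (ch, "MoveToNode FromFirstChild Door ELEM 1", posDesc, nodeName, nodeVal, nodeType)

! Retrieve the result of the MoveToNode call
errCode = 0
errText = ""
n = INPUT (ch, "GetLastError", "", errCode, errText)

IF errCode = 0 THEN
    PRINT "Found element: ", nodeName
ELSE
    IF errCode = -14 THEN
        ! Not an error for this script: the element is simply missing
        PRINT "No Door element under the root"
    ELSE
        PRINT "XML error ", errCode, ": ", errText
    ENDIF
ENDIF

dummy = 0
n = INPUT (ch, "ReturnPositionDesc", posDesc, dummy)
CLOSE ch
```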