Sketches of a Native XML Database
Prerequisites:
- XQuery 1.0 and XPath 2.0 processors which compile expressions into a sort of bytecode
- A "virtual machine" for processing compiled XPath, XQuery (and eventually XSLT) expressions.
Structure:
For a given namespace or schema, each valid element name and each valid attribute name will be assigned a unique integer. Each namespace will also be assigned a unique integer.
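The name-to-integer assignment could be sketched roughly as follows. This is only an illustration: the class names NameTable and SchemaTables, the choice to start numbering at 1, and the decision to keep separate element and attribute tables per namespace are all assumptions, not part of the design above.

```python
class NameTable:
    """Hypothetical helper: hands out one integer per distinct name."""

    def __init__(self):
        self._ids = {}          # name -> integer
        self._names = [None]    # integer -> name; slot 0 unused so IDs start at 1

    def intern(self, name):
        """Return the integer for `name`, assigning the next free one if new."""
        ident = self._ids.get(name)
        if ident is None:
            ident = len(self._names)
            self._ids[name] = ident
            self._names.append(name)
        return ident

    def name_of(self, ident):
        """Reverse lookup: integer -> name."""
        return self._names[ident]


class SchemaTables:
    """Hypothetical container: one table for namespaces, plus element and
    attribute tables kept separately for each namespace (an assumption)."""

    def __init__(self):
        self.namespaces = NameTable()
        self._elements = {}     # namespace id -> NameTable of element names
        self._attributes = {}   # namespace id -> NameTable of attribute names

    def namespace_id(self, uri):
        return self.namespaces.intern(uri)

    def element_id(self, ns_id, local_name):
        return self._elements.setdefault(ns_id, NameTable()).intern(local_name)

    def attribute_id(self, ns_id, local_name):
        return self._attributes.setdefault(ns_id, NameTable()).intern(local_name)
```

Asking for the same name twice returns the same integer, so records only ever need to store the integer and the tables can be persisted alongside the data file.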
A record for an element will be as follows:
Integer for Element Name | Current file location of parent element record (0 for root element) | Length of Attribute Segment | Attribute Segment | Length of Tag Index Segment | Tag Index Segment
The attribute segment will be a sequence of attribute tokens in the following format: Integer for Namespace | Integer for Attribute Name | Current file location of attribute value | Current length of attribute value.
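As a rough sketch, a fixed-width encoding of the attribute segment might look like this. The field widths (32-bit integers for the namespace and attribute name, a 64-bit file location, a 32-bit length) and the little-endian byte order are assumptions chosen for illustration; the notes above do not fix them.

```python
import struct

# Assumed layout per attribute token, little-endian, no padding:
#   uint32 namespace id | uint32 attribute name id |
#   uint64 file location of value | uint32 length of value
ATTR_TOKEN = struct.Struct("<IIQI")


def pack_attribute_segment(tokens):
    """Encode (ns_id, name_id, value_offset, value_length) tuples back to back."""
    return b"".join(ATTR_TOKEN.pack(*token) for token in tokens)


def unpack_attribute_segment(segment):
    """Decode a segment produced by pack_attribute_segment into a list of tuples."""
    return list(ATTR_TOKEN.iter_unpack(segment))
```

Because every token has the same size under these assumptions, the number of attributes is simply the segment length divided by ATTR_TOKEN.size, and the attribute values themselves stay elsewhere in the file.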
The tag index segment will be a sequence of tokens for child Element, Data, CData, Comment, and Processing Instruction nodes. The format will be as follows:
Element token: Token Type | Integer for Namespace | Integer for Element name | Current file location for start of element record.
Data, CData, Comment, Processing Instruction token: Token Type | Current file location of data | Length of data.
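Putting the pieces together, a whole element record could be encoded and decoded along these lines. Everything concrete here is an assumption made for the sketch: the numeric token type codes, the one-byte token type, the 32-bit integers and lengths, the 64-bit file locations, and the function names.

```python
import struct

# Hypothetical token type codes; the notes do not assign numeric values.
ELEMENT, DATA, CDATA, COMMENT, PI = range(1, 6)

ELEMENT_TOKEN = struct.Struct("<BIIQ")  # type | namespace id | element name id | record location
DATA_TOKEN = struct.Struct("<BQI")      # type | data location | data length
HEADER = struct.Struct("<IQ")           # element name id | parent record location (0 for root)
LENGTH = struct.Struct("<I")            # length prefix for each segment


def pack_tag_index(tokens):
    """Encode child tokens; element tokens have 4 fields, all others have 3."""
    out = bytearray()
    for token in tokens:
        fmt = ELEMENT_TOKEN if token[0] == ELEMENT else DATA_TOKEN
        out += fmt.pack(*token)
    return bytes(out)


def unpack_tag_index(segment):
    """Decode a tag index segment, using the leading type byte to pick the layout."""
    tokens, pos = [], 0
    while pos < len(segment):
        fmt = ELEMENT_TOKEN if segment[pos] == ELEMENT else DATA_TOKEN
        tokens.append(fmt.unpack_from(segment, pos))
        pos += fmt.size
    return tokens


def pack_element_record(name_id, parent_offset, attr_segment, tag_index):
    """name | parent location | attr length | attr segment | index length | index."""
    return (HEADER.pack(name_id, parent_offset)
            + LENGTH.pack(len(attr_segment)) + attr_segment
            + LENGTH.pack(len(tag_index)) + tag_index)


def unpack_element_record(buf, offset=0):
    """Read one element record starting at `offset` in `buf`."""
    name_id, parent_offset = HEADER.unpack_from(buf, offset)
    pos = offset + HEADER.size
    (attr_len,) = LENGTH.unpack_from(buf, pos)
    pos += LENGTH.size
    attr_segment = bytes(buf[pos:pos + attr_len])
    pos += attr_len
    (index_len,) = LENGTH.unpack_from(buf, pos)
    pos += LENGTH.size
    tag_index = bytes(buf[pos:pos + index_len])
    return name_id, parent_offset, attr_segment, tag_index
```

The length prefixes let a reader skip the attribute segment and jump straight to the tag index when a query only needs to navigate to child nodes.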
Element records may also contain three additional types of information:
- Triggers
- Permission and ownership information, which applies to all child nodes
- Version information indicating the last version in which this node was modified (previous versions can then be retrieved by applying diffs to the node)
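One possible way to make permission information apply to all child nodes is to resolve it by walking the parent locations stored in each element record. The sketch below assumes that the nearest ancestor carrying permission information wins, and it models records as an in-memory dictionary keyed by file location. Both choices, and the function name effective_permissions, are assumptions rather than part of the design.

```python
def effective_permissions(records, offset):
    """Return the permissions governing the record at `offset`.

    `records` maps a record's file location to a dict with a "parent" key
    (0 for the root element) and an optional "permissions" key. This
    in-memory shape is a stand-in for reading records from the file.
    """
    while offset != 0:
        record = records[offset]
        permissions = record.get("permissions")
        if permissions is not None:
            return permissions       # nearest record with its own permissions
        offset = record["parent"]    # otherwise inherit from the parent
    return None                      # nothing set anywhere up to the root


# Hypothetical example: root at location 16, a child at 64, a grandchild at 128.
records = {
    16: {"parent": 0, "permissions": {"owner": "admin", "mode": "rw"}},
    64: {"parent": 16},
    128: {"parent": 64, "permissions": {"owner": "alice", "mode": "r"}},
}
```

With this approach, setting permissions on one element implicitly covers its whole subtree until a descendant overrides them, at the cost of a walk up the tree on each check.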
Indexes (for faster searches) and stored procedures (essentially precompiled XQuery/XPath expressions) may also be created.
Updates can be done using the XQuery Update extension. Other extensions to XQuery can be created for assigning read/write permissions to certain nodes, for versioning, and for creating indexes, triggers, and stored procedures.
