Streaming Transformations for XML
Streaming Transformations for XML (STX) is an XML transformation language intended as a high-speed, low-memory alternative to XSLT versions 1.0 and 2.0. XSLT 3.0 also includes streaming capabilities.

Overview

STX is an XML standard for efficient processing of stream-based XML. XSLT 1.0 and 2.0 are not well suited to stream-based processing, and STX fills this niche.

Conventional XML processing involves loading the entire XML document into memory before use. This contrasts with SAX, which streams XML events such as "open element", "close element" and "text node" so that other software can begin interpreting them immediately, before the end of the file is reached. Unfortunately, some software cannot use XML fragments this way and must build up the whole document before it can begin processing. This is the case with XSLT: because an XPath expression can select any node in the document, the processor must have the entire document available in memory. For large documents this can become a bottleneck.
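The event stream described above can be illustrated with a minimal sketch that uses the SAX interface in Python's standard library (xml.sax). The class name EventRecorder and the sample document are assumptions made for this illustration; this is not STX code.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Collects SAX events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []  # character chunks seen since the last tag

    def _flush_text(self):
        # A parser may deliver one text node in several chunks, so join them.
        text = "".join(self._text).strip()
        self._text = []
        if text:
            self.events.append(("text node", text))

    def startElement(self, name, attrs):
        self._flush_text()
        self.events.append(("open element", name))

    def endElement(self, name):
        self._flush_text()
        self.events.append(("close element", name))

    def characters(self, content):
        self._text.append(content)


def record_events(xml_text):
    """Parse xml_text and return the list of (kind, value) events."""
    handler = EventRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


events = record_events("<order><item>pen</item><item>ink</item></order>")
for kind, value in events:
    print(kind, value)
```

The handler never holds a document tree. Each callback sees only the event it is given, which is what allows a consumer to act on an element before the rest of the input has been read.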

STX only allows queries on the immediate surroundings of the current node, so it can start transforming and emitting SAX events as soon as they arrive. Because it can discard nodes immediately after processing them, its memory use is significantly lower than that of XSLT. This limited query scope is a defining characteristic of STX.
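The idea of a limited query scope can be sketched in Python as a transformation that keeps only the stack of currently open ancestors and writes each event out as soon as it has been handled. This illustrates the principle only; it is neither STX nor STXPath, and the class name LocalContextTransform, the item/name rule and the sample document are hypothetical.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


class LocalContextTransform(xml.sax.ContentHandler):
    """Rewrites events on the fly using only the open-ancestor stack."""

    def __init__(self, out):
        super().__init__()
        self.out = XMLGenerator(out, encoding="utf-8")
        self.ancestors = []  # names of currently open elements, nothing else
        self.max_depth = 0   # largest amount of context ever held

    def startElement(self, name, attrs):
        self.ancestors.append(name)
        self.max_depth = max(self.max_depth, len(self.ancestors))
        self.out.startElement(name, attrs)  # emitted immediately

    def endElement(self, name):
        self.out.endElement(name)
        self.ancestors.pop()  # the closed element is forgotten

    def characters(self, content):
        # Hypothetical rule: uppercase text found directly in item/name.
        if self.ancestors[-2:] == ["item", "name"]:
            content = content.upper()
        self.out.characters(content)


def transform(xml_text):
    """Return the transformed text and the deepest context that was kept."""
    out = io.StringIO()
    handler = LocalContextTransform(out)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return out.getvalue(), handler.max_depth


result, max_depth = transform(
    "<order><item><name>pen</name></item><name>shop</name></order>"
)
print(result)
```

The state held by the handler grows with the nesting depth of the input rather than with its length, which is the property the paragraph above attributes to STX. A rule that needed to look at a later sibling or at an arbitrary node elsewhere in the document could not be written in this style.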

This architectural decision deliberately makes STX a niche language. It would be wrong to call STX a general-purpose transformation language; however, when a transformation can be expressed within its limits, STX is an efficient choice.

Specifications

STX's query language is called STXPath and is based on XPath 2.0.

Implementations of STX are available in Java and Perl.

Similar projects

Unlike STX, which is declared using an XML syntax, these two projects associate SAX events with callback functions:
  • Xineo OAX
  • SAX Adapter
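
The callback style mentioned above can be sketched generically in Python. The sketch does not reproduce the API of either project listed; CallbackDispatcher, on_start and the sample document are hypothetical names chosen for this illustration.

```python
import xml.sax


class CallbackDispatcher(xml.sax.ContentHandler):
    """Routes "open element" events to functions registered by element name."""

    def __init__(self):
        super().__init__()
        self._on_start = {}

    def on_start(self, element_name, callback):
        # Register callback(name, attributes) for one element name.
        self._on_start[element_name] = callback

    def startElement(self, name, attrs):
        callback = self._on_start.get(name)
        if callback is not None:
            callback(name, dict(attrs.items()))


seen_ids = []
dispatcher = CallbackDispatcher()
dispatcher.on_start("item", lambda name, attributes: seen_ids.append(attributes["id"]))

xml.sax.parseString(b'<order><item id="1"/><note/><item id="2"/></order>', dispatcher)
print(seen_ids)
```

Here the transformation logic lives in ordinary functions of the host language, whereas an STX stylesheet states it declaratively in XML.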

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 