Fast Infoset
Fast Infoset (FI) is an international standard that specifies a binary encoding format for the XML Information Set (XML Infoset) as an alternative to the XML document format. It aims to provide more efficient serialization than the text-based XML format.

One can think of FI as gzip for XML, though FI aims to optimize both document size and processing performance, whereas gzip optimizes only the size. While the original formatting is lost, no information is lost in the conversion from XML to FI and back to XML.
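
The difference between formatting and information can be illustrated without any Fast Infoset library. The following is a minimal sketch that uses the canonicalization function from Python's standard library (Python 3.8 or later) as a rough stand-in for infoset comparison. It is not part of Fast Infoset, and the sample documents are invented for illustration. Two documents that differ in quote style, attribute order, spacing, and empty-element syntax still carry the same content.

```python
from xml.etree.ElementTree import canonicalize

# Two serializations of the same information. They differ only in quote
# style, attribute order, spacing inside tags, and empty-element syntax.
# Both documents are made-up examples.
doc_a = "<order id='7' status=\"open\"><note>a &lt; b</note><item/></order>"
doc_b = '<order   status="open" id="7"><note>a &lt; b</note><item></item></order>'


def same_information(xml_1: str, xml_2: str) -> bool:
    """Compare two documents after canonicalization (C14N 2.0).

    Canonical XML is used here only as an approximation of
    "same XML Information Set"; it is not the Fast Infoset encoding.
    """
    return canonicalize(xml_1) == canonicalize(xml_2)


if __name__ == "__main__":
    print(doc_a == doc_b)                  # compares the raw text
    print(same_information(doc_a, doc_b))  # compares the canonical forms
```

A round trip through a binary infoset encoding is expected to preserve exactly the kind of content that survives this comparison, while details such as quote style are not recorded.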

The Fast Infoset specification is defined by both the ITU-T and the ISO standards bodies. FI is officially named ITU-T Rec. X.891 and ISO/IEC 24824-1 (Fast Infoset), respectively. However, it is commonly referred to by the name Fast Infoset. The standard was published by ITU-T on May 14, 2005, and by ISO on May 4, 2007.

The Fast Infoset standard can be downloaded from the ITU website at
http://www.itu.int/rec/T-REC-X.891-200505-I/en. There are no intellectual property restrictions on its implementation and use.

A common misconception is that FI requires ASN.1 tool support. Although the formal specification uses ASN.1 formalisms, ASN.1 tools are not required by implementations.

Structure

The underlying file format is ASN.1, with tag/length/value blocks. Text values of attributes and elements are therefore stored with length prefixes rather than end delimiters, so there is no need to escape special characters. There is also no need for any end tags, and binary data need not be base64 encoded.
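
The effect of length prefixes can be shown with a small sketch. It is not the actual Fast Infoset wire format: the fixed 4-byte big-endian length and the function names `encode_text` and `decode_text` are assumptions made for illustration. The point is only that a reader who knows the length in advance never has to scan for a delimiter, so characters such as `<` and `&` need no escaping.

```python
import struct
from typing import List, Tuple


def encode_text(text: str) -> bytes:
    """Encode a string as a 4-byte big-endian length plus UTF-8 bytes.

    The 4-byte prefix is an illustrative assumption, not the
    length encoding defined by the Fast Infoset standard.
    """
    payload = text.encode("utf-8")
    return struct.pack(">I", len(payload)) + payload


def decode_text(buffer: bytes, offset: int = 0) -> Tuple[str, int]:
    """Read one length-prefixed string; return it with the next offset."""
    (length,) = struct.unpack_from(">I", buffer, offset)
    start = offset + 4
    end = start + length
    return buffer[start:end].decode("utf-8"), end


def decode_all(buffer: bytes) -> List[str]:
    """Decode a stream made of consecutive length-prefixed strings."""
    values = []
    offset = 0
    while offset < len(buffer):
        value, offset = decode_text(buffer, offset)
        values.append(value)
    return values


if __name__ == "__main__":
    # Values that would need escaping in textual XML are stored verbatim.
    originals = ['a < b & "c"', "</end>", ""]
    stream = b"".join(encode_text(v) for v in originals)
    print(decode_all(stream) == originals)
```

Because each value announces its own size, the same approach carries arbitrary binary payloads without a text-safe encoding such as base64.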

Although ASN.1 is used for storage, Fast Infoset is a higher level protocol built upon it. In particular, element and attribute names are stored within the octet stream, unlike raw ASN.1. This means that it is possible to recover a conventional XML file from the binary stream without the need to reference any XML Schema. Fast Infoset does not attempt to convert an XML Schema directly into an ASN.1 definition. (ASN.1 "Tags" are just type names, e.g. String, Integer, or complex types.)

An index table is built for most strings, including element and attribute names and their values. As a result, the text of a repeated tag or value appears only once per document. The full encoding rules are considerably more involved.
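
The idea of indexing repeated strings can be sketched as follows. This is a simplified model, not the encoding defined by the standard: the class name `StringTable`, the token kinds `"lit"` and `"ref"`, and the use of a single table for all strings are hypothetical choices made for illustration. The first occurrence of a string is written literally and added to the table. Later occurrences are replaced by the table index.

```python
from typing import Dict, List, Tuple, Union

# A token is either ("lit", text) for a first occurrence
# or ("ref", index) for a repeat. Both names are hypothetical.
Token = Tuple[str, Union[str, int]]


class StringTable:
    """Assigns an index to each distinct string as it is first seen."""

    def __init__(self) -> None:
        self._index: Dict[str, int] = {}

    def encode(self, text: str) -> Token:
        if text in self._index:
            return ("ref", self._index[text])
        # First occurrence: remember its position and emit it literally.
        self._index[text] = len(self._index)
        return ("lit", text)


def encode_all(strings: List[str]) -> List[Token]:
    table = StringTable()
    return [table.encode(s) for s in strings]


def decode_all(tokens: List[Token]) -> List[str]:
    """Rebuild the original strings; the table is reconstructed on the fly."""
    seen: List[str] = []
    result: List[str] = []
    for kind, value in tokens:
        if kind == "lit":
            seen.append(value)
            result.append(value)
        else:
            result.append(seen[value])
    return result


if __name__ == "__main__":
    names = ["item", "price", "item", "price", "item"]
    tokens = encode_all(names)
    print(tokens)
    print(decode_all(tokens) == names)
```

The decoder needs no table sent in advance, because it rebuilds the same table from the literals in the order they arrive.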

Reference implementation

A Java implementation of the FI specification is available as part of the GlassFish project. The library is open source and is distributed under the terms of the Apache License 2.0. Several projects use this implementation, including the reference implementation for JAX-WS used in GlassFish Metro.

Performance

Because a Fast Infoset document is compacted as part of the XML generation process, producing it is much faster than running a Zip-style compression algorithm over an XML stream, although the resulting file can be slightly larger.

SAX-type parsing of Fast Infoset is also much faster than parsing XML 1.0, even without any Zip-style compression. Typical increases in parsing speed observed for the reference Java implementation are a factor of 10 compared to Java Xerces, and a factor of 4 compared to the Piccolo driver (one of the fastest Java-based XML parsers).

Typical applications

Portable Devices - Mobile devices typically have access only to low-bandwidth data connections and have slower CPUs. This can make Fast Infoset a better choice, lowering both data transmission and data processing times.

Persisting Large Volumes of Data - When persisting XML either to file or a database, the volume of data your system produces can often get out of hand. This has a number of detrimental effects; the access times go up as you're reading more data, CPU load goes up as XML data takes more effort to process, and your storage costs go up. By persisting your XML data in Fast Infoset format, it is possible to reduce the data volume by up to 80 percent.

Passing XML via the internet - As soon as an application starts passing information over the internet, one of the main bottlenecks is bandwidth. If you send sizeable chunks of data, this bottleneck can seriously degrade the performance of your client applications and limit your server's ability to process requests. Reducing the amount of data moving across the internet reduces the time it takes for a message to be sent or received, while increasing the number of transactions a server can process per hour.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 