Netstrings
In computer programming, a netstring is a formatting method for byte strings that uses a declarative notation to indicate the size of the string.

A netstring states the byte length of the data that follows, which makes it easier to pass text and byte data unambiguously between programs that might otherwise interpret certain values as delimiters or terminators (such as a null character).

The format consists of the string's length written in decimal using ASCII digits, followed by a colon, the byte data, and a comma. "Length" in this context means "number of 8-bit units", so if the string is, for example, encoded using UTF-8, this may or may not be identical to the number of textual characters that are present in the string.

For example, the text "hello world!" encodes as:
12:hello world!,
And an empty string as:
0:,
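
The two encodings above can be reproduced with a few lines of Python. This is a minimal sketch rather than a reference implementation; the function name encode_netstring is chosen here for illustration and does not come from any particular library. The last call shows that the length prefix counts bytes, not characters.

```python
def encode_netstring(data: bytes) -> bytes:
    """Frame data as a netstring: <length>:<data>,"""
    # The length counts bytes and is written in ASCII decimal digits.
    return str(len(data)).encode("ascii") + b":" + data + b","


print(encode_netstring(b"hello world!"))  # b'12:hello world!,'
print(encode_netstring(b""))              # b'0:,'

# "café" is 4 characters but 5 bytes in UTF-8, so the prefix is 5.
print(encode_netstring("café".encode("utf-8")))  # b'5:caf\xc3\xa9,'
```
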
The comma makes it slightly simpler for humans to read netstrings that are used as adjacent records, and provides weak verification of correct parsing.
Note that without the comma, the format mirrors how Bencode encodes strings.
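
The role of the comma as a weak parsing check can be seen in a small decoder. The following is a sketch under stated assumptions: the name decode_netstring is hypothetical, the whole input is assumed to be in memory already, and any run of ASCII digits is accepted as a length.

```python
from typing import Tuple


def decode_netstring(buf: bytes) -> Tuple[bytes, bytes]:
    """Parse one netstring from the front of buf; return (payload, rest)."""
    head, sep, tail = buf.partition(b":")
    if not sep or not head.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(head)
    if len(tail) < length + 1:
        raise ValueError("input is shorter than the declared length")
    # The byte right after the payload must be a comma; anything else
    # means the length prefix and the data disagree.
    if tail[length:length + 1] != b",":
        raise ValueError("missing trailing comma")
    return tail[:length], tail[length + 1:]


payload, rest = decode_netstring(b"12:hello world!,")
print(payload, rest)  # b'hello world!' b''
```

If the declared length is wrong, for instance b"3:abcd,", the byte found where the comma should be is "d", and the decoder rejects the input instead of silently returning truncated data.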

Since the format is easy to generate and to parse, it is easy to support in programs written in different programming languages. In practice, netstrings are often used to simplify the exchange of bytestrings, or lists of bytestrings.
For example, see its use in the Simple Common Gateway Interface (SCGI) and the Quick Mail Queuing Protocol (QMQP).
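
One simple way to exchange a list of bytestrings is to concatenate one netstring per item. The sketch below illustrates that idea only; it is not the SCGI or QMQP wire format, and the names encode_list and decode_list are hypothetical.

```python
from typing import Iterable, List


def encode_list(items: Iterable[bytes]) -> bytes:
    """Concatenate one netstring per item."""
    return b"".join(
        str(len(item)).encode("ascii") + b":" + item + b"," for item in items
    )


def decode_list(buf: bytes) -> List[bytes]:
    """Split a buffer of back-to-back netstrings into its payloads."""
    items = []
    pos = 0
    while pos < len(buf):
        colon = buf.find(b":", pos)
        if colon == -1 or not buf[pos:colon].isdigit():
            raise ValueError("malformed length prefix")
        start = colon + 1
        end = start + int(buf[pos:colon])
        if buf[end:end + 1] != b",":
            raise ValueError("missing trailing comma")
        items.append(buf[start:end])
        pos = end + 1  # continue after the comma
    return items


# Items may contain commas, colons, or even complete netstrings.
items = [b"a,b", b"", b"1:x,"]
wire = encode_list(items)
print(wire)               # b'3:a,b,0:,4:1:x,,'
print(decode_list(wire))  # [b'a,b', b'', b'1:x,']
```
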

Netstrings avoid complications that arise in trying to embed arbitrary data in delimited formats. For example, XML may not contain certain byte values and requires a nontrivial combination of escaping and delimiting, while generating multipart MIME messages involves choosing a delimiter that must not clash with the content of the data.

Note that since netstrings place no restrictions on the contents of the data they store, netstrings cannot be embedded verbatim in most delimited formats without the possibility of interfering with the delimiting of the containing format.
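
A short illustration of this interference, assuming a hypothetical container format that separates records with newline characters: a netstring whose payload contains a newline is split into two records by the container's own parser.

```python
# A payload that happens to contain the container's delimiter.
payload = b"line one\nline two"
netstring = str(len(payload)).encode("ascii") + b":" + payload + b","

# Hypothetical newline-delimited container holding three records.
container = b"\n".join([b"header", netstring, b"footer"])

# The container's parser sees four records instead of three, because
# it splits inside the netstring's payload.
records = container.split(b"\n")
print(len(records))  # 4
print(records[1])    # b'17:line one'
```
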

In the context of network programming it is potentially useful that the receiving program is informed of the size of the data that follows, as it can allocate exactly enough memory and avoid the need for reallocation to accommodate more data.
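
A sketch of how a receiver might use the length prefix to allocate its buffer once, up front. Several things here are assumptions made for illustration: the function name read_netstring, the MAX_LENGTH cap (a receiver would normally refuse unreasonably large declared lengths before allocating), and the use of io.BytesIO as a stand-in for a buffered binary stream such as one wrapping a network connection.

```python
import io

MAX_LENGTH = 1_000_000  # hypothetical cap, checked before allocating


def read_netstring(stream) -> bytes:
    """Read one netstring from a binary stream."""
    digits = bytearray()
    while True:
        ch = stream.read(1)
        if ch == b":":
            break
        if not ch or not ch.isdigit() or len(digits) >= 10:
            raise ValueError("malformed length prefix")
        digits += ch
    if not digits:
        raise ValueError("empty length prefix")
    length = int(digits)
    if length > MAX_LENGTH:
        raise ValueError("declared length exceeds limit")

    # The size is known in advance, so allocate exactly once.
    buf = bytearray(length)
    view = memoryview(buf)
    filled = 0
    while filled < length:
        n = stream.readinto(view[filled:])
        if not n:
            raise ValueError("stream ended before payload was complete")
        filled += n

    if stream.read(1) != b",":
        raise ValueError("missing trailing comma")
    return bytes(buf)


stream = io.BytesIO(b"12:hello world!,0:,")
print(read_netstring(stream))  # b'hello world!'
print(read_netstring(stream))  # b''
```
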
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 