MIX (Email)
MIX is a high-performance, indexed, on-disk email storage system designed for use with the IMAP protocol. MIX was designed by Mark Crispin, the author of the IMAP protocol. Server support for it has been included in UW IMAP releases since 2006, as well as in Panda IMAP and Messaging Architects Netmail. MIX is also supported directly by the Alpine e-mail client.

At Messaging Architects, Crispin is developing an extended version of MIX with additional capabilities (see below under "Extensions").

Design

MIX mailboxes are directories containing several types of files, including a metadata file, an index file, a dynamic status data file, a threading/sorting cache file, and a collection of files containing message content. MIX mailboxes can also contain subordinate mailboxes, which are implemented as subdirectories within the MIX directory.

The MIX format was designed with an emphasis on very high scalability, reliability, and performance, while efficiently supporting modern features of the IMAP protocol. MIX has been used successfully with mailboxes of 750,000 messages.

The base-level MIX format has four types of files: a metadata file, an index file, a status file, and a set of message data files. The metadata file contains base-level data applicable to the entire mailbox, namely the UID validity, the last assigned UID, and the list of keywords. The index file contains pointers to each unexpunged message in the message data files, along with flags, size, and IMAP internaldate data. The status file contains per-message flags and keywords.
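The division of data among these files can be pictured with a small in-memory model. This is only an illustrative sketch: the class and field names (MailboxMetadata, IndexEntry, StatusEntry, Mailbox) are hypothetical, the real on-disk record layout is not described here, and flag data is kept only in the status entries for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MailboxMetadata:
    """Mailbox-wide data: UID validity, last assigned UID, keyword list."""
    uid_validity: int
    last_uid: int
    keywords: List[str] = field(default_factory=list)


@dataclass
class IndexEntry:
    """Pointer to one unexpunged message inside a message data file."""
    data_file: str
    offset: int
    size: int
    internal_date: str


@dataclass
class StatusEntry:
    """Per-message flags and keywords."""
    flags: Set[str] = field(default_factory=set)
    keywords: Set[str] = field(default_factory=set)


@dataclass
class Mailbox:
    metadata: MailboxMetadata
    index: Dict[int, IndexEntry] = field(default_factory=dict)
    status: Dict[int, StatusEntry] = field(default_factory=dict)

    def append(self, data_file: str, offset: int, size: int,
               internal_date: str) -> int:
        # A new message consumes the next UID recorded in the metadata.
        self.metadata.last_uid += 1
        uid = self.metadata.last_uid
        self.index[uid] = IndexEntry(data_file, offset, size, internal_date)
        self.status[uid] = StatusEntry()
        return uid

    def expunge(self, uid: int) -> None:
        # The index lists only unexpunged messages; UIDs are never reused.
        del self.index[uid]
        self.status.pop(uid, None)
```

In this model an expunge removes the index and status entries but leaves the last assigned UID untouched, so later messages still receive fresh UIDs.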

By design, it is possible to recover the mailbox into a usable state if any of these files is lost or corrupted. For example, it is possible to rebuild the index file by reading each of the data files, with no consequence other than the possible "unexpunging" of an expunged message that had not yet had its space recovered.
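The index rebuild described above can be sketched as a scan over the data files. The record framing used here (a line of the form "MSG <uid> <size>" before each message body) is a hypothetical stand-in chosen for the example, not the actual MIX data file layout, and the function names are likewise assumptions.

```python
import io
from typing import BinaryIO, Dict, Tuple

# Maps UID -> (data file name, byte offset of message body, body size).
Index = Dict[int, Tuple[str, int, int]]


def scan_data_file(name: str, stream: BinaryIO) -> Index:
    """Walk one data file record by record and collect message pointers."""
    entries: Index = {}
    while True:
        header = stream.readline()
        if not header:
            break  # end of file
        tag, uid, size = header.split()
        if tag != b"MSG":
            raise ValueError("unexpected record header in " + name)
        offset = stream.tell()
        entries[int(uid)] = (name, offset, int(size))
        # Skip over the message body to the next record header.
        stream.seek(int(size), io.SEEK_CUR)
    return entries


def rebuild_index(data_files: Dict[str, BinaryIO]) -> Index:
    """Rebuild the whole index from every data file in the mailbox."""
    index: Index = {}
    for name, stream in data_files.items():
        index.update(scan_data_file(name, stream))
    return index
```

Because the scan records every message still physically present, a message that was expunged but whose space had not yet been reclaimed reappears in the rebuilt index, which is the "unexpunging" effect noted above.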

Another important part of the MIX design is that no file is modified unless the data specific to that file is altered; thus a flag change alters the status file but not the metadata or index files. This reduces the impact of any system event that corrupts a file write in progress.

Each file also has a "modification sequence" which is incremented each time the file is changed. When a MIX implementation updates from a file, if the modification sequence is unchanged it closes the file at once without reading it further. In addition, each status file entry also has a modification sequence, which permits lossless synchronization of message flag and keyword updates made by multiple consumers.
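A minimal sketch of how a reader might use these modification sequences follows. It assumes, purely for illustration, that each status entry is stamped with the file-level sequence value in force when it was written; the StatusFile and Reader classes are hypothetical and operate in memory rather than on real files.

```python
from typing import Dict, FrozenSet, Iterable, Tuple


class StatusFile:
    """In-memory stand-in for a status file with file and entry modseqs."""

    def __init__(self) -> None:
        self.modseq = 0
        # uid -> (entry modseq, flags)
        self.entries: Dict[int, Tuple[int, FrozenSet[str]]] = {}

    def set_flags(self, uid: int, flags: Iterable[str]) -> None:
        # Any change bumps the file modseq and stamps the changed entry.
        self.modseq += 1
        self.entries[uid] = (self.modseq, frozenset(flags))


class Reader:
    """One consumer that keeps a cached view of the flags."""

    def __init__(self) -> None:
        self.seen_modseq = 0
        self.flags: Dict[int, FrozenSet[str]] = {}

    def refresh(self, status: StatusFile) -> int:
        """Return how many entries were re-read from the status file."""
        if status.modseq == self.seen_modseq:
            return 0  # unchanged: stop without reading any entries
        changed = 0
        for uid, (entry_modseq, flags) in status.entries.items():
            if entry_modseq > self.seen_modseq:
                self.flags[uid] = flags
                changed += 1
        self.seen_modseq = status.modseq
        return changed
```

With this arrangement a reader that is already up to date does no per-entry work, and a reader that is behind picks up only the entries changed since its last refresh.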

Extensions

MIX allows for implementation-specific extensions. All MIX implementations must be interchangeable at the base level, but are not required to implement extensions and must tolerate the absence of extensions.

The UW IMAP and Panda IMAP implementations of MIX have a sort cache file that contains data used by the IMAP SORT and THREAD operators. This permits these operators to load most (if not all) of the data they need without having to parse it from message data.

The Messaging Architects implementation of MIX has extended mailbox metadata (currently used to hold the mailbox's display name), message metadata (used for multiple purposes including a JSON representation of the message structure), and a global modification sequence (thus permitting a fast check for mailbox update without having to check the modification sequence in multiple files). Messaging Architects' implementation also has a "virtual mailbox" or stubbing capability, in which a message in a mailbox is actually a pointer to a message in another mailbox.

Comparisons with other mail storage formats

MIX can be considered a hybrid between the maildir (single message per file) and mbox (single file per mailbox) types of email storage formats.

Versus maildir

MIX is similar to maildir in that MIX mailboxes are directories rather than single files.

Unlike maildir, however, MIX supports an index file for fast opens and mailbox scanning. Where maildir stores each message in its own file on disk, MIX can aggregate messages into message files, according to the configured size limit for a message file. Messages larger than the size limit are not aggregated. As a result, a MIX directory tends to have fewer files than a corresponding maildir mailbox, which can be advantageous on certain operating systems. MIX has support for efficient retrieval and modification of metadata and status information.

MIX also aggregates multiple smaller messages into single data files of up to 1 MB in size (larger messages get a data file to themselves). This reduces the number of nodes required in the directory, which is important for performance and scalability. For example, the ext3 filesystem is limited to 32,000 nodes (31,998 usable), which imposes a corresponding limit on a mail store that has a separate file for each message.
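The aggregation rule can be illustrated with a simple packing sketch. It assumes messages are packed in arrival order, that a data file is closed as soon as the next message would push it past the limit, and that 1 MB means 1024 * 1024 bytes; none of these details is confirmed by the text above, and the function name assign_data_files is hypothetical.

```python
from typing import List

DATA_FILE_LIMIT = 1024 * 1024  # assumed meaning of "1 MB"


def assign_data_files(sizes: List[int],
                      limit: int = DATA_FILE_LIMIT) -> List[List[int]]:
    """Group message numbers (positions in `sizes`) into data files."""
    files: List[List[int]] = []
    current: List[int] = []
    used = 0
    for number, size in enumerate(sizes):
        if size > limit:
            # An oversized message gets a data file to itself.
            files.append([number])
            continue
        if current and used + size > limit:
            # The open data file is full: close it and start a new one.
            files.append(current)
            current, used = [], 0
        current.append(number)
        used += size
    if current:
        files.append(current)
    return files
```

Five messages packed this way can occupy three data files instead of the five files a one-message-per-file layout would need, which is the reduction in directory entries described above.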

The MIX mailbox format requires more rigorous locking support from the operating system than maildir, and was explicitly not designed to support being written to over NFS.

Maildir, on the other hand, was designed to work in an NFS environment. Maildir enjoys wider client, server, and tool support than MIX.

Versus mbox

MIX enjoys considerable optimization versus the common mbox mail format. MIX has a binary index to accelerate scanning and retrieval of messages, whereas mbox requires full linear scans to extract messages. Like maildir, and unlike mbox, MIX supports mailboxes that contain both messages and subordinate mailboxes. MIX supports multiple clients concurrently reading and writing to individual mailboxes, which cannot be achieved with mbox.
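The difference between a linear scan and an indexed lookup can be sketched as follows. The mbox handling is deliberately simplified (any line beginning with "From " is treated as a separator), and the index is a plain dictionary of offsets and sizes rather than the MIX binary index format; both functions are hypothetical illustrations.

```python
from typing import BinaryIO, Dict, Tuple


def mbox_fetch(stream: BinaryIO, n: int) -> bytes:
    """Fetch message n (0-based) by scanning separator lines from the start."""
    stream.seek(0)
    current = -1
    body = []
    for line in stream:
        if line.startswith(b"From "):
            current += 1
            if current > n:
                break  # reached the message after the one requested
            continue
        if current == n:
            body.append(line)
    if current < n:
        raise IndexError(n)
    return b"".join(body)


def indexed_fetch(stream: BinaryIO,
                  index: Dict[int, Tuple[int, int]], n: int) -> bytes:
    """Fetch message n with one seek and one read, using (offset, size)."""
    offset, size = index[n]
    stream.seek(offset)
    return stream.read(size)
```

The scan has to read every byte that precedes the requested message, while the indexed fetch costs the same regardless of where the message sits in the file.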

On the other hand, the mbox format is far more widely supported than MIX. mbox is a ubiquitous mailbox file format, and is often used as a lowest-common-denominator exchange format.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 