OneFS distributed file system
The OneFS file system is a distributed networked file system designed by Isilon Systems for use in its Isilon IQ storage appliances. OneFS is a FreeBSD variant and uses zsh as its shell. The system is administered through a specialized OneFS command set; all of its commands start with "isi".

On-disk Structure

All data structures in the OneFS file system maintain their own protection information. This means that within the same file system, one file may be protected at +1 (basic parity protection), another at +4 (resilient to four failures), and yet another at 2x (mirroring); this feature is referred to as FlexProtect. FlexProtect is also responsible for automatically rebuilding data in the event of a failure. The protection levels available depend on the number of nodes in the cluster and follow the Reed-Solomon algorithm. Blocks for an individual file are spread across the nodes; for example, block 0 may be on Node 3, block 1 on Node 1, and the related parity block on Node 5. This allows entire nodes to fail without losing access to any data. File metadata, directories, snapshot structures, quota structures, and a logical inode mapping structure are all based on mirrored B+ trees. Block addresses are generalized 64-bit pointers that reference (node, drive, blknum) tuples. The native block size is 8192 bytes; inodes are 512 bytes on disk.
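
The block 0 / block 1 / parity example can be illustrated with a small sketch. It assumes plain XOR parity for the +1 level and a hypothetical node assignment; it is not the OneFS on-disk layout, and the higher protection levels that rely on Reed-Solomon coding are not modeled.

```python
from functools import reduce

BLOCK_SIZE = 8192  # native block size stated in the text


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def place_stripe(data_blocks, node_ids):
    """Put each data block and one parity block on a distinct node.

    The assignment is hypothetical; it only shows that no node holds
    more than one block of the stripe.
    """
    if len(node_ids) < len(data_blocks) + 1:
        raise ValueError("need one node per data block plus one for parity")
    layout = {node: block for node, block in zip(node_ids, data_blocks)}
    layout[node_ids[len(data_blocks)]] = xor_blocks(data_blocks)
    return layout


def rebuild_missing(layout, failed_node):
    """Recompute the block held by failed_node from the surviving blocks."""
    survivors = [block for node, block in layout.items() if node != failed_node]
    return xor_blocks(survivors)


# Block 0 on Node 3, block 1 on Node 1, parity on Node 5, as in the text.
block0 = bytes([1]) * BLOCK_SIZE
block1 = bytes([2]) * BLOCK_SIZE
layout = place_stripe([block0, block1], node_ids=[3, 1, 5])
recovered = rebuild_missing(layout, failed_node=3)
```

With Node 3 removed, XOR-ing the blocks on Nodes 1 and 5 yields block 0 again, which is why losing any single node of the stripe does not lose data at +1.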

One distinctive characteristic of OneFS is that metadata is spread homogeneously across the nodes. There are no dedicated metadata servers. The only piece of metadata that is replicated on every node is the address list of the root B+ tree blocks of the inode mapping structure. Everything else can be found from that starting point by following the generalized 64-bit pointers.
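
One way to picture a generalized 64-bit pointer is as three bit fields packed into a single integer. The sketch below is an illustration only: the field widths (16, 8, and 40 bits) and the names BlockAddress, pack, and unpack are assumptions, since the text states only that the pointer is 64 bits wide and references a (node, drive, blknum) tuple.

```python
from typing import NamedTuple

# Hypothetical field widths; only the 64-bit total comes from the text.
NODE_BITS, DRIVE_BITS, BLKNUM_BITS = 16, 8, 40
BLOCK_SIZE = 8192  # native block size stated in the text


class BlockAddress(NamedTuple):
    node: int
    drive: int
    blknum: int


def pack(addr):
    """Pack a (node, drive, blknum) tuple into one 64-bit integer."""
    for value, bits in zip(addr, (NODE_BITS, DRIVE_BITS, BLKNUM_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field does not fit in its assumed width")
    return (
        (addr.node << (DRIVE_BITS + BLKNUM_BITS))
        | (addr.drive << BLKNUM_BITS)
        | addr.blknum
    )


def unpack(pointer):
    """Split a 64-bit pointer back into its three fields."""
    blknum = pointer & ((1 << BLKNUM_BITS) - 1)
    drive = (pointer >> BLKNUM_BITS) & ((1 << DRIVE_BITS) - 1)
    node = pointer >> (DRIVE_BITS + BLKNUM_BITS)
    return BlockAddress(node, drive, blknum)


addr = BlockAddress(node=3, drive=7, blknum=123456)
pointer = pack(addr)
byte_offset = unpack(pointer).blknum * BLOCK_SIZE  # offset on the drive
```

Because the node is part of the address itself, any node can follow a pointer to a block held elsewhere in the cluster without consulting a dedicated metadata server.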

Clustering

Nodes running OneFS must be connected by a high-performance, low-latency back-end network for optimal performance. OneFS 1.0 through 3.0 used Gigabit Ethernet as that back-end network. Starting with OneFS 3.5, Isilon offered InfiniBand models. All nodes now sold use an InfiniBand back end.

Data, metadata, locking, transaction, group management, allocation, and event traffic go over the back-end RPC system. All data and metadata transfers are zero-copy. All modification operations to on-disk structures are transactional and journaled.
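
The general idea behind transactional, journaled updates can be sketched with a toy write-ahead journal. This is a generic illustration, not the OneFS journal: the class JournaledStore and its methods are hypothetical, and in-memory structures stand in for persistent storage.

```python
class JournaledStore:
    """Toy block store that logs each transaction before applying it."""

    def __init__(self):
        self.blocks = {}   # stands in for on-disk structures
        self.journal = []  # stands in for a persistent journal

    def begin(self, writes):
        """Record a transaction's writes in the journal and return its id."""
        self.journal.append({"writes": dict(writes), "committed": False})
        return len(self.journal) - 1

    def commit(self, txn_id):
        """Mark a journaled transaction as committed."""
        self.journal[txn_id]["committed"] = True

    def replay(self):
        """Apply committed transactions in order and drop uncommitted ones."""
        for entry in self.journal:
            if entry["committed"]:
                self.blocks.update(entry["writes"])
        self.journal.clear()


store = JournaledStore()
first = store.begin({"inode:10": "v1", "dir:2": "v1"})
store.commit(first)
store.begin({"inode:10": "v2"})  # never committed, as if a crash intervened
store.replay()
```

After replay, the store holds both writes from the committed transaction and none from the interrupted one, so a multi-block modification is applied either completely or not at all.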

Protocols

OneFS supports storage access via NFS, CIFS/SMB, FTP, and HTTP. It can authenticate against Active Directory, LDAP, and NIS. It can also interface with backup devices using NDMP, and it has iSCSI support.

Versions

  • 1.0
  • 2.0
  • 3.0
  • 3.5
  • 4.0
  • 4.1
  • 4.5
    • 4.5.4
  • 4.6
  • 4.7
    • 4.7.1
    • 4.7.7
    • 4.7.8
    • 4.7.9
    • 4.7.10
    • 4.7.11
  • 5.0
    • 5.0.0
    • 5.0.1
    • 5.0.2
    • 5.0.3
    • 5.0.4
    • 5.0.5
    • 5.0.6
    • 5.0.7
    • 5.0.8
  • 5.5
    • 5.5.1
    • 5.5.2
    • 5.5.3 - Adds ability to update OneFS with rolling reboots of individual nodes.
    • 5.5.4 - Adds iSCSI
    • 5.5.5
    • 5.5.6
    • 5.5.7 (based on FreeBSD 6.1)
  • 6.0 - Can scale to 10.4 PB of storage in a single file system
    • 6.0.1
    • 6.0.2
    • 6.0.3
    • 6.0.4
  • 6.5
    • 6.5.1
    • 6.5.2
    • 6.5.3
    • 6.5.4

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.