Log-structured file system
Encyclopedia
A log-structured filesystem is a file system
File system
A file system is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data, as well as manage the available space on the device which contain it. A file system organizes data in an efficient manner and is tuned to the...

 design first proposed in 1988 by John K. Ousterhout and Fred Douglis. Designed for high write throughput, all updates to data and metadata are written sequentially to a continuous stream, called a log. The design was first implemented by Ousterhout and Mendel Rosenblum
Mendel Rosenblum
Mendel Rosenblum is an associate professor of Computer Science at Stanford University, and one of the co-founders of VMware. Since 2008 he is a Fellow of the Association for Computing Machinery "for contributions to reinventing virtual machines", and had previously received the ACM SIGOPS Mark...

.

Rationale

Conventional file systems tend to lay out files with great care for spatial locality and make in-place changes to their data structures in order to perform well on optical and magnetic disks, which tend to seek relatively slowly.

The design of log-structured file systems is based on the hypothesis that this will no longer be effective because ever-increasing memory sizes on modern computers would lead to I/O becoming write-heavy because reads would be almost always satisfied from memory cache. A log-structured file system thus treats its storage as a circular log
Circular buffer
A circular buffer, cyclic buffer or ring buffer is a data structure that uses a single, fixed-size buffer as if it were connected end-to-end.This structure lends itself easily to buffering data streams.-Uses:...

 and writes sequentially to the head of the log.

This has several important side effects:
  • Write throughput on optical and magnetic disks is improved because they can be batched into large sequential runs and costly seeks are kept to a minimum.
  • Writes create multiple, chronologically-advancing versions of both file data and meta-data. Some implementations make these old file versions nameable and accessible, a feature sometimes called time-travel or snapshotting
    Snapshot (computer storage)
    In computer systems, a snapshot is the state of a system at a particular point in time. The term was coined as an analogy to that in photography. It can refer to an actual copy of the state of a system or to a capability provided by certain systems....

    . This is very similar to a versioning file system
    Versioning file system
    A versioning file system is any computer file system which allows a computer file to exist in several versions at the same time. Thus it is a form of revision control. Most common versioning file systems keep a number of old copies of the file. Some limit the number of changes per minute or per...

    .
  • Recovery from crashes is simpler. Upon its next mount, the file system does not need to walk all its data structures to fix any inconsistencies, but can reconstruct its state from the last consistent point in the log.


Log-structured file systems, however, must reclaim free space from the tail of the log to prevent the file system from becoming full when the head of the log wraps around to meet it. The tail can release space and move forward by skipping over data for which newer versions exist farther ahead in the log. If there are no newer versions, then the data is moved and appended to the head.

To reduce the overhead incurred by this garbage collection
Garbage collection (computer science)
In computer science, garbage collection is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program...

, most implementations avoid purely circular logs and divide up their storage into segments. The head of the log simply advances into non-adjacent segments which are already free. If space is needed, the least-full segments are reclaimed first. This decreases the I/O load of the garbage collector, but becomes increasingly ineffective as the file system fills up and nears capacity.

Implementations

  • John K. Ousterhout and Mendel Rosenblum
    Mendel Rosenblum
    Mendel Rosenblum is an associate professor of Computer Science at Stanford University, and one of the co-founders of VMware. Since 2008 he is a Fellow of the Association for Computing Machinery "for contributions to reinventing virtual machines", and had previously received the ACM SIGOPS Mark...

     implemented the first log-structured file system for the Sprite operating system
    Sprite operating system
    Sprite was an experimental Unix-like distributed operating system developed at the University of California, Berkeley by John Ousterhout's research group between 1984 and 1992. Its notable features included support for single system image on computer clusters and for the introduction of the...

     in 1992.
  • BSD-LFS, an implementation by Margo Seltzer
    Margo Seltzer
    Margo Ilene Seltzer is a professor and researcher in computer systems. Currently she is the Herchel Smith Professor of Computer Science and a Harvard College Professor in the School of Engineering and Applied Sciences at Harvard University, where she is active in the Systems Research Group.Dr...

     was added to 4.4BSD, and was later ported to 386BSD
    386BSD
    386BSD, sometimes called "Jolix", was a free Unix-like operating system based on BSD, first released in 1992. It ran on PC compatible computer systems based on the Intel 80386 microprocessor...

    . It lacked support for snapshots. It was removed from FreeBSD and OpenBSD, but still lives on in NetBSD
    NetBSD
    NetBSD is a freely available open source version of the Berkeley Software Distribution Unix operating system. It was the second open source BSD descendant to be formally released, after 386BSD, and continues to be actively developed. The NetBSD project is primarily focused on high quality design,...

    .
  • Plan 9
    Plan 9 from Bell Labs
    Plan 9 from Bell Labs is a distributed operating system. It was developed primarily for research purposes as the successor to Unix by the Computing Sciences Research Center at Bell Labs between the mid-1980s and 2002...

    's Fossil
    Fossil (file system)
    Fossil is the default file system in Plan 9 from Bell Labs. It serves the network protocol 9P and runs as a user space daemon, like most Plan 9 file servers. Fossil is different from most other file systems due to its snapshot/archival feature. It can take snapshots of the entire file system on...

     file system is also log-structured and supports snapshots.
  • NILFS
    NILFS
    NILFS is a log-structured file system implementation for Linux. It is being developed by Nippon Telegraph and Telephone Corporation CyberSpace Laboratories and released under the terms of the GNU General Public License .Version 2 of the filesystem, known as NILFS2, is included in Linux kernel...

     is a log-structured file system implementation for Linux
    Linux
    Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

     by NTT/Verio which supports snapshots.
  • LinLogFS (formerly dtfs) and LFS (http://logfs.sourceforge.net/) are log-structured file system implementations for Linux. The latter was part of Google Summer of Code 2005
    Google Summer of Code
    The Google Summer of Code is an annual program, first held from May to August 2005, in which Google awards stipends to hundreds of students who successfully complete a requested free or open-source software coding project during the summer...

    . Both projects have been abandoned.
  • LFS is another log-structured file system for Linux developed by Charles University, Prague. It was to include support for snapshots and indexed directories, but development has since ceased.
  • ULFS is a User-Level Log-structured File System (http://ulfs.sf.net) using FUSE (http://fuse.sf.net).
  • CASL is a proprietary log-structured filesystem that uses Solid State Devices to cache traditional hard drives (http://www.nimblestorage.com/products/architecture/).
  • SISL is a log-structured filesystem with deduplication and was designed by DataDomain (http://www.datadomain.com/products/SISL.html).


Some kinds of storage media, such as flash memory
Flash memory
Flash memory is a non-volatile computer storage chip that can be electrically erased and reprogrammed. It was developed from EEPROM and must be erased in fairly large blocks before these can be rewritten with new data...

 and CD-RW
CD-RW
A CD-RW is a rewritable optical disc. It was introduced in 1997, and was known as "CD-Writable" during development. It was preceded by the CD-MO, which was never commercially released....

, slowly degrade as they are written to and have a limited number of erase/write cycles at any one location. Log-structured file systems are sometimes used on these media because they make fewer in-place writes and thus prolong the life of the device by wear leveling
Wear leveling
Wear leveling is a technique for prolonging the service life of some kinds of erasable computer storage media, such as Flash memory used in solid-state drives and USB Flash drives...

. The more common such file systems include:
  • UDF
    Universal Disk Format
    Universal Disk Format is an implementation of the specification known as ISO/IEC 13346 and ECMA-167 and is an open vendor-neutral file system for computer data storage for a broad range of media. In practice, it has been most widely used for DVDs and newer optical disc formats, supplanting ISO 9660...

     is a file system commonly used on optical disc
    Optical disc
    In computing and optical disc recording technologies, an optical disc is a flat, usually circular disc which encodes binary data in the form of pits and lands on a special material on one of its flat surfaces...

    s.
  • JFFS
    JFFS
    The Journaling Flash File System is a log-structured file system for use on NOR flash memory devices on the Linux operating system. It has been superseded by JFFS2.- Design :...

     and its successor JFFS2
    JFFS2
    Journalling Flash File System version 2 or JFFS2 is a log-structured file system for use with flash memory devices. It is the successor to JFFS. JFFS2 has been included in the Linux kernel since the 2.4.10 release. JFFS2 is also available for a couple of bootloaders like Das U-Boot, Open...

     are simple Linux
    Linux
    Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

     file systems intended for raw flash-based devices.
  • UBIFS
    UBIFS
    The Unsorted Block Image File System is a successor to JFFS2, and competitor to LogFS, as a file system for use with raw flash memory media. Development began in earnest in 2007, with the first stable release made to Linux kernel 2.6.27 in October 2008.Note that UBIFS works on top of an Unsorted...

     is a filesystem for raw NAND flash media and also intended to replace JFFS2
    JFFS2
    Journalling Flash File System version 2 or JFFS2 is a log-structured file system for use with flash memory devices. It is the successor to JFFS. JFFS2 has been included in the Linux kernel since the 2.4.10 release. JFFS2 is also available for a couple of bootloaders like Das U-Boot, Open...

    .
  • LogFS
    LogFS
    LogFS is a Linux log-structured and scalable flash file system, intended for use on large devices of flash memory. It is written by Jörn Engel and in part sponsored by the CE Linux Forum....

     is a scalable flash filesystem for Linux
    Linux
    Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

     that works on both raw flash media and block devices, intended to replace JFFS2
    JFFS2
    Journalling Flash File System version 2 or JFFS2 is a log-structured file system for use with flash memory devices. It is the successor to JFFS. JFFS2 has been included in the Linux kernel since the 2.4.10 release. JFFS2 is also available for a couple of bootloaders like Das U-Boot, Open...

    .
  • YAFFS
    YAFFS
    YAFFS was designed and written by Charles Manning, of Whitecliffs, New Zealand, for the company .Yaffs1 is the first version of this file system and works on NAND chips that have 512 byte pages + 16 byte spare areas. These older chips also generally allow 2 or 3 write cycles per page, which...

     is a raw NAND flash-specific file system for many operating systems (including Linux).

Disadvantages

The design rationale
Design Rationale
A Design Rationale is an explicit documentation of the reasons behind decisions made when designing a system or artifact. As initially developed by W.R...

 for log-structured file systems assumes that most reads will be optimized away by ever-enlarging memory caches. This assumption does not always hold:
  • On magnetic media—where seeks are relatively expensive—the log structure may actually make reads much slower, since it fragments files that conventional file systems normally keep contiguous with in-place writes.
  • On flash memory—where seek times are usually negligible—the log structure may not confer a worthwhile performance gain because write fragmentation has much less of an impact on write throughput. However many flash based devices cannot rewrite part of a block, and they must first perform a (slow) erase cycle of each block before being able to re-write, so by putting all the writes in one block, this can help performance as opposed to writes scattered into various blocks, each one of which must be copied into a buffer, erased, and written back.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK