Snapshot (computer storage)



 
 
In computer file systems, a snapshot is a copy of a set of files and directories as they were at a particular point in the past. The term was coined by analogy with the snapshot in photography.

Rationale

A full backup of a large data set may take a long time to complete. On multi-tasking or multi-user systems, there may be writes to that data while it is being backed up. This prevents the backup from being atomic and introduces a version skew that may result in data corruption. For example, if a user moves a file from a directory that has not yet been backed up into a directory that has already been backed up, then that file will be completely missing from the backup media. Version skew may also corrupt the backed-up copy of any file whose size or contents change while it is being read.
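The moved-file case can be traced with a small simulation. This is only a sketch: the directory and file names are hypothetical, and an in-memory dictionary stands in for the live file system and the backup media.

```python
def backup_with_concurrent_move():
    # Hypothetical live data set: directory name -> set of file names.
    live = {"a": {"notes.txt"}, "b": {"report.txt"}}
    backup = {}

    # The backup copies directory "a" first.
    backup["a"] = set(live["a"])

    # Meanwhile a user moves report.txt from "b" (not yet backed up)
    # into "a" (already backed up).
    live["b"].remove("report.txt")
    live["a"].add("report.txt")

    # The backup then reaches directory "b", which is now empty.
    backup["b"] = set(live["b"])
    return live, backup


live, backup = backup_with_concurrent_move()
backed_up_files = set().union(*backup.values())
print("report.txt" in live["a"])        # the file still exists on the live system
print("report.txt" in backed_up_files)  # but it never reached the backup
```

The file was never deleted, yet no pass of the backup saw it, which is the version skew described above.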

One approach to safely backing up live data is to temporarily disable write access to the data during the backup, either by stopping the accessing applications or by using the locking API provided by the operating system to enforce exclusive read access. This is tolerable for low-availability systems, such as desktop computers and small workgroup servers, on which regular downtime is acceptable. High-availability 24/7 systems, however, cannot bear service stoppages.

To avoid downtime, high-availability systems may instead perform the backup on a snapshot (a read-only copy of the data set frozen at a point in time) and allow applications to continue writing to their data. Most snapshot implementations are efficient and can create snapshots in O(1) time. In other words, the time and I/O needed to create the snapshot do not increase with the size of the data set, whereas the time and I/O needed for a direct backup are proportional to the size of the data set.
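One way to see why snapshot creation can be independent of data size is a store that never copies data when a snapshot is taken and only advances a counter. The following is a minimal sketch of that idea, not the design of any particular file system; the class `VersionedStore` and its methods are hypothetical.

```python
class VersionedStore:
    """Toy key-value store with constant-time snapshots (hypothetical design)."""

    def __init__(self):
        self.epoch = 0
        self.history = {}  # key -> list of (epoch, value), oldest first

    def write(self, key, value):
        versions = self.history.setdefault(key, [])
        if versions and versions[-1][0] == self.epoch:
            # No snapshot separates the two writes, so overwrite in place.
            versions[-1] = (self.epoch, value)
        else:
            versions.append((self.epoch, value))

    def snapshot(self):
        # O(1): nothing is copied, only the epoch counter advances.
        snap = self.epoch
        self.epoch += 1
        return snap

    def read(self, key, snap=None):
        # Newest version visible at the snapshot, or the live value if snap is None.
        for epoch, value in reversed(self.history.get(key, [])):
            if snap is None or epoch <= snap:
                return value
        return None


store = VersionedStore()
store.write("file", "v1")
snap = store.snapshot()
store.write("file", "v2")
print(store.read("file", snap), store.read("file"))
```

The cost is shifted to later writes and reads, which must keep and search old versions; creating the snapshot itself touches no data.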

Read-write snapshots are sometimes called branching snapshots, because they implicitly create diverging versions of their data. Aside from backups and data recovery, read-write snapshots are frequently used in virtualization, sandboxing and virtual hosting setups because of their usefulness in managing changes to large sets of files.

Implementations


Volume managers

Some Unix systems have snapshot-capable logical volume managers. These implement copy-on-write on entire block devices by copying changed blocks, just before they are to be overwritten, to other storage, thus preserving a self-consistent past image of the block device. File systems on this image can later be mounted as if they were on read-only media. Block-level snapshotting is almost always less space-efficient than direct file system support for snapshots.
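The block-level scheme described above can be sketched in a few lines. This is an illustration under simplifying assumptions (a single snapshot, blocks held in a Python list); the class `CowVolume` and its method names are hypothetical and do not correspond to any volume manager's API.

```python
class CowVolume:
    """Toy block device with one copy-on-write snapshot (hypothetical)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live block contents
        self.saved = None           # snapshot store: block index -> original data

    def create_snapshot(self):
        # Nothing is copied yet; the snapshot store starts out empty.
        self.saved = {}

    def write_block(self, index, data):
        # Preserve the old contents just before the first overwrite of a block.
        if self.saved is not None and index not in self.saved:
            self.saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        if self.saved is None:
            raise RuntimeError("no snapshot has been created")
        # Changed blocks come from the snapshot store, unchanged ones from the live device.
        return self.saved.get(index, self.blocks[index])


vol = CowVolume([b"A", b"B", b"C"])
vol.create_snapshot()
vol.write_block(1, b"X")
vol.write_block(1, b"Y")  # second overwrite: the original is already preserved
print([vol.read_snapshot(i) for i in range(3)], vol.blocks)
```

Only blocks that change after the snapshot consume extra storage, but a whole block is preserved even when a single byte of a file changes, which is one reason block-level snapshots tend to use more space than snapshots implemented in the file system.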

File systems

Some file systems, such as WAFL, fossil for Plan 9 from Bell Labs or ODS-5, internally track old versions of files and make snapshots available through a special namespace. Others, like UFS2, provide an operating system API for accessing file histories. In NTFS, access to snapshots is provided by the Volume Shadow Copy Service (VSS) in Windows XP and Windows Server 2003 and by Shadow Copy in Windows Vista. Snapshots have also been available in the NSS (Novell Storage Services) file system on NetWare since version 4.11, and more recently on Linux platforms in the Open Enterprise Server product.

Sun Microsystems' ZFS has a hybrid implementation which tracks read-write snapshots at the block level, but makes branched file sets nameable to user applications as "clones". ZFS is open source, and available for free download for OpenSolaris, Linux, BSD and Mac OS.

Time Machine, included in Apple's Mac OS X v10.5 operating system, is not a snapshotting scheme but a system-level incremental backup service: it merely watches mounted volumes for changes and periodically copies changed files to a specially designated volume. Apple may be working on integrating ZFS into Time Machine.

In databases

The SQL specification defines four levels of transaction isolation. In the highest, SERIALIZABLE, implementations based on multiversion concurrency control implicitly create a snapshot at the start of every transaction. The backup utilities for many popular SQL databases use this feature to generate self-consistent dumps of table data.
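The effect can be observed with SQLite, used here only as a convenient stand-in for a larger database. This sketch assumes write-ahead logging (WAL) mode, in which a read transaction keeps seeing the database as it was when its first read ran, even while another connection commits changes. The table name `t` and the function name are hypothetical.

```python
import os
import sqlite3
import tempfile


def row_count(conn):
    # fetchall() exhausts the cursor, so no statement is left half-read.
    return conn.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]


def demo_snapshot_read():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None puts the connections in autocommit mode,
    # so transactions start only where BEGIN is issued explicitly.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("PRAGMA journal_mode=WAL")
    writer.execute("CREATE TABLE t (v INTEGER)")
    writer.execute("INSERT INTO t VALUES (1)")

    reader = sqlite3.connect(path, isolation_level=None)
    reader.execute("BEGIN")
    before = row_count(reader)   # the first read fixes the reader's view

    writer.execute("INSERT INTO t VALUES (2)")  # committed immediately

    during = row_count(reader)   # same transaction, same view
    reader.execute("COMMIT")
    after = row_count(reader)    # a new read sees the committed row

    reader.close()
    writer.close()
    return before, during, after


result = demo_snapshot_read()
print(result)
```

A dump taken inside the reader's transaction would therefore be self-consistent even though the writer was never paused.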

In virtualization

System emulators host a guest operating system in a virtual machine; some (including VMware, QEMU and Virtual PC) can perform whole-system snapshots by dumping the entire machine state to a backing file and redirecting future guest writes to a second file, which then acts as a copy-on-write table.
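The disk half of this arrangement can be sketched as a frozen backing image plus an overlay that receives every later write. The sketch ignores CPU and memory state, and the class `OverlayDisk` is hypothetical rather than the on-disk format of any of the products named above.

```python
class OverlayDisk:
    """Toy guest disk: read-only backing image plus a write overlay (hypothetical)."""

    def __init__(self, base):
        self.base = tuple(base)  # backing image, never modified after the snapshot
        self.overlay = {}        # sector index -> data written after the snapshot

    def write(self, index, data):
        # Guest writes are redirected to the overlay.
        self.overlay[index] = data

    def read(self, index):
        # Prefer the overlay; fall back to the backing image.
        return self.overlay.get(index, self.base[index])

    def revert(self):
        # Discarding the overlay returns the guest disk to the snapshot.
        self.overlay.clear()


disk = OverlayDisk([b"boot", b"data"])
disk.write(1, b"changed")
current = [disk.read(i) for i in range(2)]
disk.revert()
restored = [disk.read(i) for i in range(2)]
print(current, restored)
```

Unlike the volume-manager approach, which copies old blocks aside before overwriting them, this scheme leaves the old data in place and redirects new data elsewhere, so reverting to the snapshot only requires discarding the second file.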

Other applications

Software transactional memory is a scheme which applies the same concepts to data structures held only in memory.
