Dd (Unix)
In computing, dd is a common Unix program whose primary purpose is the low-level copying and conversion of raw data. According to the manual page for Version 7 Unix, it will "convert and copy a file". It is used to copy a specified number of bytes or blocks, performing on-the-fly byte order conversions, as well as more esoteric EBCDIC to ASCII conversions. It can also be used to copy regions of raw device files, for example backing up the boot sector of a hard disk, or to read fixed amounts of data from special files like /dev/zero or /dev/random.
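The conversions mentioned above can be tried safely on short strings. The following is a minimal sketch, assuming a POSIX shell and a dd that implements the swab, ebcdic and ascii conversions (GNU dd does; some minimal implementations may not). The variable names are only illustrative.

```shell
# Swap each pair of adjacent bytes (a byte order conversion).
swapped=$(printf 'abcd' | dd conv=swab 2>/dev/null)
echo "$swapped"

# Convert one ASCII character to EBCDIC and show the resulting byte in hex.
ebcdic_hex=$(printf 'A' | dd conv=ebcdic 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "$ebcdic_hex"

# Convert to EBCDIC and back again; the round trip should return the original text.
roundtrip=$(printf 'HELLO' | dd conv=ebcdic 2>/dev/null | dd conv=ascii 2>/dev/null)
echo "$roundtrip"
```

The statistics that dd prints are discarded with 2>/dev/null so that only the converted data is shown.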

The name dd may stand for "data" or "disk duplication". It is jokingly said to stand for "disk destroyer", "data destroyer", "death and destruction", or "delete data", since when used for low-level operations on hard disks, a small mistake, such as reversing the if and of (input and output) parameters, could result in the loss of some or all data on a disk.

The syntax of dd was likely inspired by the DD statement found in IBM's Job Control Language (JCL), where "DD" stands for Data Definition. The Jargon File states that dd is rumored to have been based on IBM's JCL, and that the syntax may have been a joke.

Usage

The command line syntax of dd is significantly different from most other Unix programs, and because of its ubiquity it is resistant to recent attempts to enforce a common syntax for all command line tools. Generally, dd uses an option=value format, whereas most Unix programs use either -option value or --option=value format. Also, the input is specified using the "if" (from input file) option, while most programs simply take the name by itself.
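As an illustration of the difference, the sketch below copies the same small file once with dd and once with cp. It assumes a POSIX shell with mktemp available; the directory and file names are hypothetical.

```shell
workdir=$(mktemp -d)
printf 'hello, dd\n' > "$workdir/in.txt"

# dd names its input and output with the if= and of= operands ...
dd if="$workdir/in.txt" of="$workdir/out_dd.txt" 2>/dev/null
# ... whereas cp simply takes the two file names as arguments.
cp "$workdir/in.txt" "$workdir/out_cp.txt"

# Both copies should be byte-for-byte identical to the original.
if cmp -s "$workdir/in.txt" "$workdir/out_dd.txt" && cmp -s "$workdir/in.txt" "$workdir/out_cp.txt"; then
    copy_result="identical"
else
    copy_result="different"
fi
echo "$copy_result"
```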

Usage varies across different operating systems. Also, certain features of dd depend on the capabilities of the computer system, such as dd's ability to implement an option for direct memory access. Sending a SIGINFO signal (or a USR1 signal on Linux) to a running dd process makes it print I/O statistics to standard error and then continue copying. dd can read standard input from the keyboard. When EOF (end of file) is read, dd will exit. Signals and EOF are determined by the software. For example, Unix tools ported to Windows vary as to the EOF: Cygwin uses Ctrl+D (the usual Unix EOF) and MKS Toolkit uses Ctrl+Z (the usual Windows EOF).

In compliance with the Unix philosophy, dd does one thing well. Unlike a sophisticated and highly abstracted utility, dd has no algorithm other than in the low-level decisions of the user concerning how to vary the run options. Often the options are changed for each run of dd in a multi-step process to solve a computer problem.

Output messages

The documentation of the GNU variant of dd, as supplied with Linux, does not describe the format of the messages printed to standard error on completion. Other implementations, such as the one in BSD, do describe them.

Each of the "records in" and "records out" lines shows the number of complete blocks transferred, followed by a plus sign and the number of partial blocks. A partial block occurs, for example, when the physical medium ends before a complete block is read.
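A partial block can be produced on purpose by copying a file whose length is not a multiple of the block size. This is a minimal sketch, assuming a POSIX shell; the temporary file is hypothetical, and LC_ALL=C is set so that the messages are not translated.

```shell
tmp=$(mktemp)
# 1300 bytes = two full 512-byte blocks plus one partial block of 276 bytes.
dd if=/dev/zero of="$tmp" bs=1300 count=1 2>/dev/null

# The statistics go to standard error, so redirect it to capture them.
stats=$(LC_ALL=C dd if="$tmp" of=/dev/null bs=512 2>&1)
records_in=$(printf '%s\n' "$stats" | sed -n '1p')
records_out=$(printf '%s\n' "$stats" | sed -n '2p')
echo "$records_in"     # 2+1 records in
echo "$records_out"    # 2+1 records out
rm -f "$tmp"
```

Any further lines, such as the byte count and the transfer rate, differ between implementations and are left out here.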

Block size

Block size is a crucial operating factor. Each run of dd uses one set of block sizes: one for input and one for output. Block sizes can adapt dd to the realm of its application, and to the phase of an operation involving many runs of dd. The input block size is set with ibs and the output block size with obs; bs sets both and overrides ibs and obs. cbs sets the conversion block size used by the block and unblock conversions, and conv=sync pads each input block to the input block size.
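The effect of separate input and output block sizes, and of padding with conv=sync, can be observed on a small file. The sketch below assumes a POSIX shell and a POSIX-conforming dd; the temporary files are hypothetical.

```shell
src=$(mktemp)
padded=$(mktemp)
# A 1300-byte source: not a multiple of 512 or 1024.
dd if=/dev/zero of="$src" bs=1300 count=1 2>/dev/null

# Read in 512-byte blocks, write in 1024-byte blocks.
reblock=$(LC_ALL=C dd if="$src" of=/dev/null ibs=512 obs=1024 2>&1 | sed -n '1,2p')
echo "$reblock"

# conv=sync pads the short final input block with NUL bytes up to ibs,
# so the output grows from 1300 to 3 x 512 bytes.
dd if="$src" of="$padded" ibs=512 conv=sync 2>/dev/null
padded_size=$(wc -c < "$padded" | tr -d ' ')
echo "$padded_size"    # 1536
rm -f "$src"
```

In the first run the input is counted as two full blocks and one partial block, while the output is counted as one full 1024-byte block and one partial block.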

For example, when recovering data from an area of a hard drive that contains errors, a small block size recovers the most bytes. For the greatest speed, a large block size is chosen, up to the point of diminishing returns for the system dd runs on. If the transfer uses a network, a block size can be chosen to suit the congestion level.

Some implementations understand the letter x as a multiplication operator in the block size and count parameters:
dd bs=2x80x18b if=/dev/fd0 of=floppy.image

where the "b" suffix indicates that the units are 512-byte blocks. Unix block devices use this as their allocation unit by default.

In the value of bs, a decimal number may be followed by one of these suffixes:
w multiplies by 2
b multiplies by 512
k multiplies by 1024
M multiplies by 1024^2 (1048576)
G multiplies by 1024^3 (1073741824)


Hence bs=2x80x18b means 2 × 80 × 18 × 512 = 1474560 bytes, which is the exact size of a 1440 KiB floppy disk.
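The arithmetic can be checked with shell arithmetic expansion. This is only a sketch of the calculation, assuming a POSIX shell; it does not run dd.

```shell
# 2 sides x 80 tracks x 18 sectors, with the b suffix standing for 512 bytes.
floppy_bytes=$((2 * 80 * 18 * 512))
floppy_kib=$((floppy_bytes / 1024))
echo "$floppy_bytes"   # 1474560
echo "$floppy_kib"     # 1440
```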

Progress information

dd is a silent tool, which makes it very useful for scripting. To see its progress on a GNU/Linux machine, first obtain the PID of the dd process from a different terminal:

ps -a

The output may look like this:

18255 pts/5 00:00:00 ssh
24084 pts/2 00:00:04 dd
24334 pts/4 00:00:00 ps

Then send a USR1 signal to the dd process:

sudo kill -USR1 24084

In the terminal where dd is running you will see its output, something like:

349389+0 records in
349389+0 records out
1431097344 bytes (1.4 GB) copied, 935.624 s, 1.5 MB/s

This can be repeated as many times as required to follow the progress of the copy.
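When dd is started from a script, the same steps can be carried out without ps, because the shell records the PID of a background job in $!. The following is a rough sketch that assumes GNU dd on Linux; on systems where dd does not handle USR1, the signal would terminate the copy instead. The copy from /dev/zero to /dev/null is only a harmless stand-in for a long transfer.

```shell
log=$(mktemp)
# Start a long-running copy in the background; its statistics go to standard error.
LC_ALL=C dd if=/dev/zero of=/dev/null bs=1M count=10000000 2>"$log" &
dd_pid=$!                # PID of the background dd

sleep 1                  # give dd time to start and install its signal handler
kill -USR1 "$dd_pid"     # ask dd to print its I/O statistics
sleep 1                  # give dd time to write them
progress=$(cat "$log")
echo "$progress"

# Stop the demonstration copy and clean up.
kill "$dd_pid" 2>/dev/null || true
wait "$dd_pid" 2>/dev/null || true
rm -f "$log"
```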

Data transfer

dd can duplicate data across files, devices, partitions and volumes. Any of these can serve as input or output, but there are important differences concerning the output when it goes to a partition. During the transfer, the data can also be modified using the conv options to suit the medium.

An attempt to copy an entire disk using cp may omit the final block if it is of an unexpected length, whereas dd may succeed. The source and destination disks should have the same size.
Data transfer forms of dd

To create an ISO disk image from a CD-ROM:
dd if=/dev/sr0 of=myCD.iso bs=2048 conv=noerror,sync

To clone one partition to another:
dd if=/dev/sda2 of=/dev/sdb2 bs=4096 conv=noerror

To clone a hard disk "ad0" to "ad1":
dd if=/dev/ad0 of=/dev/ad1 bs=1M conv=noerror


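The noerror option means to keep going if there is an error. The sync option pads every input block to the input block size, so that a short or failed read does not shift the position of the data that follows.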
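The cloning form can be rehearsed on ordinary files before it is pointed at real devices. This is a minimal sketch, assuming a POSIX shell; source.img and clone.img are hypothetical files standing in for the two partitions.

```shell
workdir=$(mktemp -d)
# A 64 KiB file of random data stands in for the source partition.
dd if=/dev/urandom of="$workdir/source.img" bs=4096 count=16 2>/dev/null

# The same operands as a partition clone, pointed at ordinary files.
dd if="$workdir/source.img" of="$workdir/clone.img" bs=4096 conv=noerror 2>/dev/null

# cmp exits with status 0 only if the two files are byte-for-byte identical.
if cmp -s "$workdir/source.img" "$workdir/clone.img"; then
    clone_status="identical"
else
    clone_status="different"
fi
clone_size=$(wc -c < "$workdir/clone.img" | tr -d ' ')
echo "$clone_status $clone_size"
```

Comparing the result with cmp is a cheap way to confirm that a copy is complete.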

Master boot record

A master boot record can be repaired by saving it to a repair file and later restoring it from that file.
To duplicate the first two sectors of a floppy drive:
dd if=/dev/fd0 of=MBRboot.img bs=512 count=2

To create an image of the entire master boot record (including the partition table):
dd if=/dev/sda of=MBR.img bs=512 count=1

To create an image of only the boot code of the master boot record (without the partition table):
dd if=/dev/sda of=MBR_boot.img bs=446 count=1
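The same commands can be tried on a disk image file, which avoids any risk to a real drive. The sketch below assumes a POSIX shell; disk.img is a hypothetical stand-in for /dev/sda, filled with random data rather than a real boot record.

```shell
workdir=$(mktemp -d)
disk="$workdir/disk.img"     # hypothetical stand-in for /dev/sda
dd if=/dev/urandom of="$disk" bs=512 count=8 2>/dev/null

# Whole first sector: boot code plus partition table (512 bytes).
dd if="$disk" of="$workdir/MBR.img" bs=512 count=1 2>/dev/null
# Boot code only: the first 446 bytes, leaving out the partition table.
dd if="$disk" of="$workdir/MBR_boot.img" bs=446 count=1 2>/dev/null

mbr_size=$(wc -c < "$workdir/MBR.img" | tr -d ' ')
boot_size=$(wc -c < "$workdir/MBR_boot.img" | tr -d ' ')
echo "$mbr_size $boot_size"    # 512 446

# Restore the boot code from the repair file.
# conv=notrunc keeps the rest of the image intact.
dd if="$workdir/MBR_boot.img" of="$disk" bs=446 count=1 conv=notrunc 2>/dev/null
disk_size=$(wc -c < "$disk" | tr -d ' ')
echo "$disk_size"              # 4096
```

When the output is a regular file, leaving out conv=notrunc in the restore step would cut the image down to 446 bytes.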

Data modification

dd can modify data in place.

Overwrite the first 512 bytes of a file with null bytes:

dd if=/dev/zero of=path/to/file bs=512 count=1 conv=notrunc

The notrunc conversion option means do not truncate the output file — that is, if the output file already exists, just replace the specified bytes and leave the rest of the output file alone. Without this option, dd would create an output file 512 bytes long.
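The difference is easy to see by running the command with and without the option on two copies of the same file. This is a minimal sketch, assuming a POSIX shell; a.bin and b.bin are hypothetical file names.

```shell
workdir=$(mktemp -d)
# Two identical 2048-byte files of random data.
dd if=/dev/urandom of="$workdir/a.bin" bs=512 count=4 2>/dev/null
cp "$workdir/a.bin" "$workdir/b.bin"

# With conv=notrunc only the first 512 bytes are replaced.
dd if=/dev/zero of="$workdir/a.bin" bs=512 count=1 conv=notrunc 2>/dev/null
# Without it the output file is truncated first, so only 512 bytes remain.
dd if=/dev/zero of="$workdir/b.bin" bs=512 count=1 2>/dev/null

size_notrunc=$(wc -c < "$workdir/a.bin" | tr -d ' ')
size_trunc=$(wc -c < "$workdir/b.bin" | tr -d ' ')
echo "$size_notrunc $size_trunc"    # 2048 512
```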

To duplicate a disk partition as a disk image file on a different partition:

dd if=/dev/sdb2 of=partition.image bs=4096 conv=noerror

Disk wipe

For security reasons, a device should be wiped before it is discarded.

To check to see if a drive has data on it, send the output to standard out.

dd if=/dev/sda

To wipe a disk, first, consider the operation that would create a 1 GiB file containing only zeros (bs specifies block size, count the number of blocks):

dd if=/dev/zero of=file1G.tmp bs=1M count=1024

Here count is the number of blocks that dd copies. Multiplying 1 MiB by 1024 gives 1 GiB.

Now here are ways to use dd to wipe a disk:


dd if=/dev/urandom of=/dev/hda   # wipe an entire disk with random data
dd if=/dev/zero of=/dev/sda      # zero out a drive

The output of dd may be piped to other Unix utilities to make it easier to inspect.
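One way to combine these steps is to pipe the output of dd through tr and wc to count the bytes that are not zero, before and after a wipe. The sketch below assumes a POSIX shell; disk.img is a hypothetical file standing in for a device, so the wipe needs an explicit count, whereas on a real device the copy ends when the device is full.

```shell
workdir=$(mktemp -d)
disk="$workdir/disk.img"      # hypothetical stand-in for a device such as /dev/sda
dd if=/dev/urandom of="$disk" bs=1024 count=64 2>/dev/null

# Read the image with dd, delete every NUL byte, and count what is left.
before=$(dd if="$disk" 2>/dev/null | tr -d '\000' | wc -c | tr -d ' ')

# Zero the image, then count the non-zero bytes again.
dd if=/dev/zero of="$disk" bs=1024 count=64 conv=notrunc 2>/dev/null
after=$(dd if="$disk" 2>/dev/null | tr -d '\000' | wc -c | tr -d ' ')
echo "non-zero bytes before: $before, after: $after"
```

A count of zero after the wipe indicates that every byte of the image was overwritten with zeros.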

Data recovery

The history of open-source software (OSS) for data recovery and restoration of files, drives, and partitions started with GNU dd in 1984, with one block size per dd process, and no recovery algorithm other than the user's interactive session running one form of dd after another. Then, in October 1999, a C program called dd_rescue was written. It has two block sizes in its algorithm. However, the author of the 2003 shell script dd_rhelp, which enhances dd_rescue's data recovery algorithm, now recommends GNU ddrescue, a C++ program that was published in 2004 and is now in most Linux distributions. GNU ddrescue has the most sophisticated block-size-changing algorithm available in OSS. (The names ddrescue and dd_rescue are similar, yet they are very different programs. Still, the Debian Linux distribution packages dd_rescue as "ddrescue", and packages GNU ddrescue as "gdrescue" or as "gddrescue".)

GNU ddrescue is stable and safe. Here is an untested rescue using 3 of ddrescue's 24 options:

admin$> ddrescue -n /dev/old_disk /dev/new_disk # quickly grab large error-free areas, then stop
admin$> ddrescue -d -r1 /dev/old_disk /dev/new_disk # work with direct disk access on error areas


Another open source program called savehd7 uses a sophisticated algorithm, but it also requires the installation of its own programming-language interpreter.

Miscellaneous uses

To benchmark a drive by writing a file of about 1 GB in 1024-byte blocks and then reading it back sequentially:

dd if=/dev/zero bs=1024 count=1000000 of=file_1GB
dd if=file_1GB of=/dev/null bs=64k

To make a file of 100 random bytes:
dd if=/dev/urandom of=myrandom bs=100 count=1

To convert a file to uppercase:
dd if=filename of=filename1 conv=ucase
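The case conversions also work on standard input and output, which makes them easy to try on a short string. This is a minimal sketch, assuming a POSIX shell; lcase is the counterpart of ucase.

```shell
# With no if= or of= operand, dd reads standard input and writes standard output.
upper=$(printf 'hello world\n' | dd conv=ucase 2>/dev/null)
lower=$(printf 'HELLO WORLD\n' | dd conv=lcase 2>/dev/null)
echo "$upper"    # HELLO WORLD
echo "$lower"    # hello world
```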

To create a 1 GiB sparse file, or to resize an existing file to 1 GiB without overwriting it:
dd if=/dev/zero of=mytestfile.out bs=1 count=0 seek=1G
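A scaled-down version of the same technique shows that the file reaches the requested length although no data is copied. The sketch below assumes a POSIX shell and uses 1 MiB instead of 1 GiB, written as a plain number so that it does not depend on suffix support; the temporary file is hypothetical.

```shell
tmp=$(mktemp)
# Seek 1048576 one-byte blocks into the output and copy nothing:
# the file is extended to that length without any data being written.
dd if=/dev/zero of="$tmp" bs=1 count=0 seek=1048576 2>/dev/null
apparent=$(wc -c < "$tmp" | tr -d ' ')
echo "$apparent"     # apparent size in bytes: 1048576
du -k "$tmp"         # space actually allocated, in KiB
rm -f "$tmp"
```

On file systems that support sparse files, du reports far less than the apparent size.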

Limitations

Seagate documentation warns, "Certain disc utilities, such as DD, which depend on low-level disc access may not support 48-bit LBAs until they are updated." Using ATA hard drives over 128 GiB requires 48-bit LBA. However, in Linux, dd uses the kernel to read or write to raw device files. Support for 48-bit LBA has been present since version 2.4.23 of the kernel.

See also

  • List of Unix programs
  • Backup
  • Disk cloning
  • Disk image
  • RaWrite
  • Disk Copy
  • Forensics (DD) Dcfldd


The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 