Universally Unique Identifier
A universally unique identifier (UUID) is an identifier standard used in software construction, standardized by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE).

The intent of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. In this context the word unique should be taken to mean "practically unique" rather than "guaranteed unique". Since the identifiers have a finite size it is possible for two differing items to share the same identifier. The identifier size and generation process need to be selected so as to make this sufficiently improbable in practice. Anyone can create a UUID and use it to identify something with reasonable confidence that the same identifier will never be unintentionally created by anyone to identify something else. Information labeled with UUIDs can therefore be later combined into a single database without needing to resolve name conflicts.

One widespread use of this standard is in Microsoft's globally unique identifiers (GUIDs). Other significant uses include Linux's ext2/ext3 filesystem, LUKS encrypted partitions, GNOME, KDE, and Mac OS X, all of which use implementations derived from the uuid library found in the e2fsprogs (Ext2 Filesystems Utilities) package.

UUIDs are documented as part of ISO/IEC 11578:1996 "Information technology – Open Systems Interconnection – Remote Procedure Call (RPC)" and more recently in ITU-T Rec. X.667 | ISO/IEC 9834-8:2005. The IETF has published Standards Track RFC 4122, which is technically equivalent to ITU-T Rec. X.667 | ISO/IEC 9834-8.

Definition

A UUID is a 16-byte (128-bit) number. The number of theoretically possible UUIDs is therefore about 3 × 10^38. In its canonical form, a UUID is represented by 32 hexadecimal digits, displayed in 5 groups separated by hyphens, in the form 8-4-4-4-12 for a total of 36 characters (32 digits and 4 hyphens). For example:
550e8400-e29b-41d4-a716-446655440000


There are 340,282,366,920,938,463,463,374,607,431,768,211,456 possible UUIDs (16 to the 32nd power).
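
As a rough illustration of the canonical form, the following Python sketch uses the standard library uuid module to parse the example above and to check the group lengths and the size of the identifier space. The variable names are illustrative only and are not part of any specification.

```python
import uuid

# Example UUID from the text, in canonical 8-4-4-4-12 form.
text = "550e8400-e29b-41d4-a716-446655440000"

u = uuid.UUID(text)                          # parse the canonical string
groups = [len(g) for g in text.split("-")]   # lengths of the hyphen-separated groups

print(groups)                 # group lengths of the canonical form
print(len(u.bytes))           # size of the identifier in bytes
print(str(u) == text)         # the canonical form round-trips through the parser
print(2 ** 128 == 16 ** 32)   # 128 bits is the same space as 32 hexadecimal digits
print(2 ** 128)               # total number of possible UUIDs
```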

A UUID may also be assigned once and then intentionally reused to identify the same thing in different contexts. For example, in Microsoft's Component Object Model, every component must implement the IUnknown interface, which is itself identified by a UUID. Wherever IUnknown is used, whether by a process trying to access the IUnknown interface in a component or by a component implementing it, the interface is always referenced by the same identifier: 00000000-0000-0000-C000-000000000046.

Variants and versions

The variant indicates the layout of the UUID. The UUID specification covers one particular variant. Other variants are reserved or exist for backward compatibility reasons (e.g. for values assigned before the UUID specification was produced). An example of a UUID that is a different variant is the nil UUID, which is a UUID that has all 128 bits set to zero.

In the canonical representation, xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx, the most significant bits of N indicate the variant (depending on the variant, one, two or three bits are used). The variant covered by the UUID specification is indicated by the two most significant bits of N being 1 0 (i.e. the hexadecimal N will always be 8, 9, a, or b).

In the variant covered by the UUID specification, there are five versions. For this variant, the four bits of M indicate the UUID version (i.e. the hexadecimal M will be 1, 2, 3, 4, or 5).
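
The reading of M and N described above can be sketched in Python as follows. The helper version_and_variant is a hypothetical name introduced here for illustration, and the labels it returns are informal descriptions rather than official terms. The results are cross-checked against the standard library uuid module.

```python
import uuid

def version_and_variant(text: str):
    """Read M and N directly from a canonical UUID string (hypothetical helper)."""
    m = int(text[14], 16)   # M: first hexadecimal digit of the third group
    n = int(text[19], 16)   # N: first hexadecimal digit of the fourth group
    if n & 0b1000 == 0:     # 0xxx: one bit decides
        variant = "reserved (NCS backward compatibility)"
    elif n & 0b0100 == 0:   # 10xx: two bits decide
        variant = "UUID specification"
    elif n & 0b0010 == 0:   # 110x: three bits decide
        variant = "reserved (Microsoft backward compatibility)"
    else:                   # 111x
        variant = "reserved (future use)"
    # M is only meaningful as a version number for the "UUID specification" variant.
    return m, variant

example = "550e8400-e29b-41d4-a716-446655440000"
nil_uuid = "00000000-0000-0000-0000-000000000000"
iunknown = "00000000-0000-0000-C000-000000000046"

for text in (example, nil_uuid, iunknown):
    print(text, version_and_variant(text), uuid.UUID(text).variant)
```

In this sketch the nil UUID and the IUnknown identifier both fall outside the variant covered by the UUID specification, so their M digit should not be read as a version.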

Version 1 (MAC address)

Conceptually, the original (version 1) generation scheme for UUIDs was to concatenate the UUID version with the MAC address of the computer that is generating the UUID, and with the number of 100-nanosecond intervals since the adoption of the Gregorian calendar in the West (15 October 1582). This scheme has been criticized in that it is not sufficiently "opaque"; it reveals both the identity of the computer that generated the UUID and the time at which it did so.
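
The lack of opacity can be seen by decoding a version 1 UUID. The Python sketch below is only an illustration: it takes the namespace UUID quoted in the version 3 section of this article, which happens to be a version 1 UUID, and recovers the embedded timestamp and node (MAC address) fields using the standard library uuid module. The names GREGORIAN_EPOCH, ticks, created and mac are chosen here for the example.

```python
import uuid
from datetime import datetime, timedelta

# A version 1 UUID: the namespace UUID for domain names quoted elsewhere in this article.
u = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")

# Start of the version 1 time scale: adoption of the Gregorian calendar.
GREGORIAN_EPOCH = datetime(1582, 10, 15)

ticks = u.time   # 60-bit count of 100-nanosecond intervals since the epoch
created = GREGORIAN_EPOCH + timedelta(microseconds=ticks // 10)

# Format the 48-bit node field as a colon-separated MAC address.
mac = ":".join(f"{(u.node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))

print(u.version)   # version number stored in the M digit
print(created)     # approximate creation time encoded in the UUID
print(mac)         # node field, normally the generating machine's MAC address
```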

Version 2 (DCE Security)

Version 2 UUIDs are similar to version 1 UUIDs, with the low byte of the clock sequence replaced by the identifier for a "local domain" (typically either the "POSIX UID domain" or the "POSIX GID domain") and the first 4 bytes of the timestamp replaced by the user's POSIX UID or GID (with the "local domain" identifier indicating which it is).

Version 3 (MD5 hash)

Version 3 UUIDs use a scheme deriving a UUID via MD5 from a URL, a fully qualified domain name, an object identifier, a distinguished name (DN as used in Lightweight Directory Access Protocol), or on names in unspecified namespaces. Version 3 UUIDs have the form xxxxxxxx-xxxx-3xxx-xxxx-xxxxxxxxxxxx with hexadecimal digits x.

To determine the version 3 UUID of a given name, the UUID of the namespace (e.g. 6ba7b810-9dad-11d1-80b4-00c04fd430c8 for a domain) is converted to the 16 bytes represented by its hexadecimal digits, concatenated with the input name, and hashed with MD5, yielding 128 bits. Six bits are then replaced by fixed values: four of these bits indicate the version (0011 for version 3) and the other two indicate the variant. Finally, the adjusted hash is transformed back into the hexadecimal form, with hyphens separating the parts that are relevant in other UUID versions.
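
The derivation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: uuid3_by_hand is a hypothetical helper, and encoding the name as UTF-8 is an assumption of this sketch. The result is compared with uuid.uuid3 from the standard library.

```python
import hashlib
import uuid

def uuid3_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Derive a version 3 UUID step by step (hypothetical helper)."""
    # Namespace UUID as 16 raw bytes, followed by the name (UTF-8 assumed here).
    data = namespace.bytes + name.encode("utf-8")
    raw = bytearray(hashlib.md5(data).digest())   # 128-bit MD5 hash
    raw[6] = (raw[6] & 0x0F) | 0x30               # four version bits: 0011
    raw[8] = (raw[8] & 0x3F) | 0x80               # two variant bits: 1 0
    return uuid.UUID(bytes=bytes(raw))

# The namespace UUID for domain names quoted in the text.
domain_namespace = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")

by_hand = uuid3_by_hand(domain_namespace, "python.org")
print(by_hand)
print(by_hand == uuid.uuid3(uuid.NAMESPACE_DNS, "python.org"))
```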

Version 4 (random)

Version 4 UUIDs use a scheme relying only on random numbers. This algorithm sets the version number as well as two reserved bits. All other bits are set using a random or pseudorandom data source. Version 4 UUIDs have the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx, where x is any hexadecimal digit and y is one of 8, 9, A, or B, e.g. f47ac10b-58cc-4372-a567-0e02b2c3d479.
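
A minimal Python sketch of this scheme follows. It assumes os.urandom as the random source, and uuid4_by_hand and V4_PATTERN are names invented for this example. In practice the standard library function uuid.uuid4 does the same job.

```python
import os
import re
import uuid

def uuid4_by_hand() -> uuid.UUID:
    """Build a version 4 UUID from 16 random bytes (hypothetical helper)."""
    raw = bytearray(os.urandom(16))    # random source assumed for this sketch
    raw[6] = (raw[6] & 0x0F) | 0x40    # version number: 4
    raw[8] = (raw[8] & 0x3F) | 0x80    # two reserved (variant) bits: 1 0
    return uuid.UUID(bytes=bytes(raw))

# Pattern for the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx in lower case.
V4_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
)

sample = uuid4_by_hand()
print(sample)
print(bool(V4_PATTERN.match(str(sample))))
```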

Version 5 (SHA-1 hash)

Version 5 UUIDs use a scheme with SHA-1 hashing; otherwise the idea is the same as in version 3. RFC 4122 states that version 5 is preferred over version 3 for name-based UUIDs, without giving a reason. Note that the 160-bit SHA-1 hash is truncated to 128 bits so that it fits the UUID length. An erratum addresses the example in appendix B of RFC 4122.
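
The truncation step can be made concrete with a short Python sketch. As before, this is an illustration under assumptions: uuid5_by_hand is a hypothetical helper and the name is assumed to be encoded as UTF-8. The result is compared with uuid.uuid5 from the standard library.

```python
import hashlib
import uuid

def uuid5_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Derive a version 5 UUID step by step (hypothetical helper)."""
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])       # keep 128 of the 160 hash bits
    raw[6] = (raw[6] & 0x0F) | 0x50    # four version bits: 0101
    raw[8] = (raw[8] & 0x3F) | 0x80    # two variant bits: 1 0
    return uuid.UUID(bytes=bytes(raw))

sha1_length = len(hashlib.sha1(b"").digest())   # SHA-1 output size in bytes
result = uuid5_by_hand(uuid.NAMESPACE_DNS, "python.org")

print(sha1_length * 8)   # hash size in bits before truncation
print(result)
print(result == uuid.uuid5(uuid.NAMESPACE_DNS, "python.org"))
```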

Implementations

ActionScript : CASA Lib provides a Version 4 UUID function as part of the StringUtil class. Adobe Flex also provides a UUID implementation with the UIDUtil class.
C : libuuid is part of the e2fsprogs package. The OSSP project provides a UUID library.
C++ : ooid implements a C++ UUID class. QUuid is part of the C++ Qt framework. Boost.Uuid is a header-only implementation under a non-reciprocal Open Source license.
Caché ObjectScript : UUID Version 4 implementation for Caché ObjectScript.
CakePHP : CakePHP will automatically generate UUIDs for new records if you specify a table's primary key as data type CHAR(36).
Cocoa/Carbon (Mac OS X) : The Core Foundation class CFUUIDRef is used to produce and store UUIDs, as well as to convert them to and from CFString/NSString representations.
CodeGear RAD Studio (Delphi/C++ Builder) : A new GUID can be generated by pressing Ctrl+Shift+G. For runtime functions see the "Free Pascal & Lazarus IDE" section.
ColdFusion : The createUUID function provides a UUID in all versions; however, the generated format has 4 segments instead of 5: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxx (8-4-4-16).
Common Lisp : Two libraries are available to create UUIDs according to RFC 4122: a generalized library for creation of UUIDs (v1, v3, v4 and v5), and Unicly, which is optimized for creation of UUIDs (v3, v4, and v5) and offers an extended interface for converting among different representations and interrogating UUID equivalence across these different representations.
CouchDB : If not provided, CouchDB sets the document ID for each document to be a UUID.
D : The Tango standard library includes a module to create UUIDs (v3, v4, and v5) according to RFC 4122.
Eiffel : A library is available that generates UUIDs according to RFC 4122, variant 1 0, version 4. The source is available from the Eiffel UUID library.
Firebird Server : Firebird has gen_uuid from version 2.1 and uuid_to_char and char_to_uuid from version 2.5 as built-in functions.
Free Pascal & Lazarus IDE : In Free Pascal there is a class called TGUID that holds the structure of a UUID. The SysUtils.pas unit also provides methods to create, compare and convert UUIDs: CreateGUID, GUIDToString and IsEqualGUID. In the Lazarus IDE you can also generate a UUID by pressing Ctrl+Shift+G.
Haskell : The package uuid directly implements most of RFC 4122. The package supports generation (v1, v3, v4 and v5) as well as serialization to and from string and binary formats. The package system-uuid provides bindings to the native UUID generators on Windows, Linux and Mac OS X.
Java : The J2SE 5.0 release of Java provides a class that will produce 128-bit UUIDs, although it only implements version 3 and 4 generation methods, not the original method (due to lack of means to access MAC addresses using pure Java before version 6). The API documentation for the class refers to ISO/IEC 11578:1996. Open source implementations supporting MAC addresses on several common operating systems are UUID – generate UUIDs (or GUIDs) in Java and Java Uuid Generator (JUG).
JavaScript : Broofa.com has implemented a JavaScript function which generates version 4 UUIDs as defined in the RFC 4122 specification. An open source library, UUID.js, which is available under the MIT license, generates version 4 and version 1 UUIDs according to RFC 4122.
Lasso : A custom tag for Lassoscript by Douglas Burchard, and an LJAPI-module by Steffan A. Cline.
Lua : There is a Lua module by Luiz Henrique de Figueiredo.
Linux : The command-line utility uuidgen is commonly available from the e2fsprogs package (Red Hat). There is also a tool called simply "uuid", which has the same functionality.
Mac OS X : Command line utility uuidgen is available. In the Terminal application, type: uuidgen
MySQL : MySQL provides a UUID function.
.NET Framework : The .NET Framework provides a structure, System.Guid, to generate and manipulate 128-bit UUIDs.
Objective Caml (OCaml) : The uuidm library implements universally unique identifiers version 3, 5 (name based with MD5, SHA-1 hashing) and 4 (random based) according to RFC 4122.
Oracle Database : The Oracle Database provides a function SYS_GUID to generate unique identifiers.
Perl : The Data::UUID and Data::GUID modules from CPAN can be used to create UUIDs. The UUID::Tiny module is a lightweight, low-dependency pure Perl module for UUID creation and testing.
PHP : In PHP there are several modules for creating UUIDs.
KohanaPHP : The Kohana PHP Framework supports the generation of version 3, 4, and 5 UUIDs according to RFC 4122 specifications using the UUID module.
PostgreSQL : PostgreSQL contains a uuid data type. Various generation functions are also available as part of the uuid-ossp contrib module.
Progress OpenEdge ABL : The GENERATE-UUID function in OpenEdge 10 provides a UUID which can be made printable using the GUID or BASE64-ENCODE functions.
Python : The uuid module (included in the standard library since Python 2.5) creates UUIDs to RFC 4122.
Revolution/RunRev : The libUUID library (version 1.0, by Mark Smith, OSL 3.0) generates UUIDs of type 1 (time based), type 3 (name-based) and type 4 (random-based).
Ruby : There are several RFC 4122 implementations for Ruby, the most updated ones being Ruby-UUID, UUID and UUIDTools. Ruby 1.9 includes a built-in version 4 UUID generator (SecureRandom.uuid).
SQL Server : Transact-SQL (2000 and 2005) provides a function called NEWID to generate unique identifiers. SQL Server 2005 provides an additional function called NEWSEQUENTIALID, which generates a new GUID that is greater than any GUID previously created by the NEWSEQUENTIALID function on a given computer.
Apache Solr : Solr contains a uuid data type.
Tcl : A Tcl implementation is provided in the TclLib package.

Random UUID probability of duplicates

Randomly generated UUIDs have 122 random bits. There are 128 bits altogether, with 4 bits being used for the version ('Randomly generated UUID') and 2 bits for the variant ('Leach-Salz'). With random UUIDs, the chance of two having the same value can be calculated using probability theory (birthday paradox). Using the approximation

$$p(n) \approx 1 - e^{-n^{2}/(2x)}$$

these are the probabilities of an accidental clash after calculating n UUIDs, with x = 2^122:





n                              probability
68,719,476,736 = 2^36          0.0000000000000004 (4 × 10^−16)
2,199,023,255,552 = 2^41       0.0000000000004 (4 × 10^−13)
70,368,744,177,664 = 2^46      0.0000000004 (4 × 10^−10)


To put these numbers into perspective, one's annual risk of being hit by a meteorite is estimated to be one chance in 17 billion, which means the probability is about 0.00000000006 (6 × 10^−11), equivalent to the odds of creating a few tens of trillions of UUIDs in a year and having one duplicate. In other words, only after generating 1 billion UUIDs every second for the next 100 years would the probability of creating just one duplicate be about 50%. The probability of one duplicate would also be about 50% if every person on earth owned 600 million UUIDs.
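
These figures can be checked numerically. The Python sketch below assumes the usual birthday-problem approximation p(n) ≈ 1 − e^(−n²/(2x)) with x = 2^122. The function name collision_probability and the constants are chosen here for illustration.

```python
import math

RANDOM_BITS = 122          # random bits in a version 4 UUID
X = 2 ** RANDOM_BITS       # number of equally likely values

def collision_probability(n: int, x: int = X) -> float:
    """Approximate chance of at least one duplicate among n random values."""
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-(n * n) / (2 * x))

for exponent in (36, 41, 46):
    p = collision_probability(2 ** exponent)
    print(f"n = 2^{exponent}: p is roughly {p:.1e}")

# One billion UUIDs per second for 100 years (365-day years assumed).
n_century = 10 ** 9 * 60 * 60 * 24 * 365 * 100
print(f"after 100 years: p is roughly {collision_probability(n_century):.2f}")
```

Under this approximation the 100-year figure comes out somewhat above one half, so the "about 50%" in the text should be read as an order-of-magnitude statement.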

However, these probabilities only hold when the UUIDs are generated using sufficient entropy. Otherwise the probability of duplicates may be significantly higher, since the statistical dispersion may be lower.

History

The initial design of DCE UUIDs was based on UUIDs as defined in the Network Computing System, whose design was in turn inspired by the (64-bit) unique identifiers defined and used pervasively in Domain/OS, the operating system designed by Apollo Computer, Inc.


The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 