UUHash
UUHash is a hash algorithm employed by clients on the FastTrack network. It is used because it can hash very large files in a very short period of time, even on older computers. However, this speed is achieved by hashing only a fraction of the file. This weakness makes it trivial to create a hash collision: large sections of a file can be completely altered without altering the checksum.

This method is used by Kazaa and is exploited by anti-P2P agencies to corrupt downloads.

How it works

UUHash hashes the first 300 kilobytes of the file using MD5 and then applies a smallhash function (identical to the CRC32 checksum used by PNG) to 300 KB blocks at file offsets of 2^n MB, with n being an integer incremented from 0 until the offset reaches the end of the file. Finally, the last 300 KB of the file are hashed. If the last 300 KB of the file overlap with the last block of the 2^n sequence, that block is ignored in favor of the file end block.

So, for example:
offset 1 MB, 300 KB hashed
offset 2 MB, 300 KB hashed
offset 4 MB, 300 KB hashed
offset 8 MB, 300 KB hashed
...
last 300 KB of file hashed
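
The block layout described above can be sketched in a few lines of Python. This is only an illustration of the text, not a reference implementation: the function name smallhash_ranges is hypothetical, KB and MB are assumed to be binary units (1024 and 1024 * 1024 bytes), and the clamping for files shorter than 300 KB is an assumption, since the description does not cover that case.

```python
KB = 1024            # assumption: binary kilobyte
MB = 1024 * KB       # assumption: binary megabyte
BLOCK = 300 * KB     # size of every sampled block


def smallhash_ranges(file_size):
    """Return the (offset, length) pairs read by the sparse smallhash pass.

    The MD5 pass over the first 300 KB is not included here.
    """
    ranges = []
    offset = MB                      # 2^0 MB
    while offset < file_size:
        # A block that starts near the end of the file may be short.
        ranges.append((offset, min(BLOCK, file_size - offset)))
        offset *= 2                  # next power-of-two offset

    # The file end block: the last 300 KB (clamped for tiny files).
    end_start = max(0, file_size - BLOCK)

    # If the last 2^n block overlaps the end block, it is dropped.
    if ranges and ranges[-1][0] + ranges[-1][1] > end_start:
        ranges.pop()

    ranges.append((end_start, file_size - end_start))
    return ranges


if __name__ == "__main__":
    size = 700 * MB
    sampled = BLOCK + sum(length for _, length in smallhash_ranges(size))
    print(f"{sampled} of {size} bytes are read")
```

Running the sketch on a large file size shows how small the sampled portion is compared with the whole file, which is the source of both the speed and the weakness.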


The 128-bit MD5 hash and the 32-bit smallhash are then concatenated, yielding the 160-bit hash used to identify files on the FastTrack network.

In other words, the hash used on the FastTrack network is the 128-bit MD5 of the first 300 KB of the file concatenated with the sparse 32-bit smallhash calculated in the way described above. The resulting 160 bits, when encoded using Base64, become the UUHash.
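
To make the concatenation and the collision weakness concrete, here is a rough Python sketch. It follows the description above but is not expected to reproduce hashes seen on the real network: the name uuhash_sketch is hypothetical, and the use of zlib.crc32 chained across blocks with a zero seed, the big-endian byte order of the smallhash, and the binary KB/MB units are all assumptions that the text does not specify.

```python
import base64
import hashlib
import zlib

KB = 1024            # assumption: binary kilobyte
MB = 1024 * KB       # assumption: binary megabyte
BLOCK = 300 * KB


def uuhash_sketch(data: bytes) -> str:
    """Approximate the described scheme for an in-memory file."""
    size = len(data)

    # 128-bit MD5 of the first 300 KB.
    md5 = hashlib.md5(data[:BLOCK]).digest()

    # Offsets for the sparse pass: 2^n MB blocks, then the end block.
    offsets = []
    off = MB
    while off < size:
        offsets.append(off)
        off *= 2
    end_start = max(0, size - BLOCK)
    if offsets and offsets[-1] + BLOCK > end_start:
        offsets.pop()                # overlapping block is ignored
    offsets.append(end_start)

    # Assumption: one CRC32 chained over all sampled blocks, seed 0.
    crc = 0
    for off in offsets:
        crc = zlib.crc32(data[off:off + BLOCK], crc)

    # Assumption: smallhash stored big-endian after the MD5 digest.
    raw = md5 + crc.to_bytes(4, "big")   # 16 + 4 bytes = 160 bits
    return base64.b64encode(raw).decode("ascii")


# A 3 MB file whose bytes between 400 KB and 900 KB are overwritten.
original = bytes(range(256)) * (3 * MB // 256)
tampered = bytearray(original)
tampered[400 * KB:900 * KB] = b"\xff" * (500 * KB)
tampered = bytes(tampered)

print(uuhash_sketch(original) == uuhash_sketch(tampered))
```

Because the overwritten region lies outside every sampled block, the two files compare equal under the sketch even though their contents differ.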

Sig2Dat

The name UUHash derives from the sig2dat utility, which creates URIs referencing files on Kazaa. These URIs are of the form:
sig2dat://|File: surprise.mp3|Length:5845871Bytes|UUHash:=1LDYkHDl65OprVz37xN1VSo9b00=

Setting aside the fact that this URI format is not RFC compliant, UUHash here refers to the Base64 encoding of the hash and not the hash itself.
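
As an illustration, the example URI can be taken apart and its UUHash field decoded back into raw hash bytes. This is a minimal sketch: the function name parse_sig2dat is hypothetical, and it assumes that fields are separated by "|", that the "=" directly after "UUHash:" is a separator and not part of the Base64 payload, and that the MD5 part comes first in the decoded bytes.

```python
import base64


def parse_sig2dat(uri: str) -> dict:
    """Split a sig2dat URI into file name, length and raw hash parts."""
    fields = {}
    for part in uri.split("|")[1:]:          # skip the "sig2dat://" prefix
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()

    length_text = fields["Length"]
    if length_text.endswith("Bytes"):
        length_text = length_text[:-len("Bytes")]

    # Assumption: the leading "=" is not part of the Base64 text.
    encoded = fields["UUHash"]
    if encoded.startswith("="):
        encoded = encoded[1:]
    raw = base64.b64decode(encoded)          # 20 bytes = 160 bits

    return {
        "file": fields["File"],
        "length": int(length_text),
        "md5": raw[:16],                     # assumed order: MD5 first
        "smallhash": raw[16:],               # then the 32-bit smallhash
    }


example = ("sig2dat://|File: surprise.mp3|Length:5845871Bytes"
           "|UUHash:=1LDYkHDl65OprVz37xN1VSo9b00=")
info = parse_sig2dat(example)
print(info["file"], info["length"], info["md5"].hex(), info["smallhash"].hex())
```

The 28 Base64 characters decode to 20 bytes, which matches the 160-bit hash described earlier.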

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.