Watermark (data file)
A watermark stored in a data file is a method for ensuring data integrity that combines aspects of data hashing and digital watermarking. Both are useful for tamper detection, though each has its own advantages and disadvantages.

Data hashing

A typical data hash processes an input file to produce an alphanumeric string that is, for practical purposes, unique to the data file. Should the file be modified, even by a single bit, the same hash process applied to the modified file will produce a different string. Through this method, a trusted source can calculate the hash of an original data file, and subscribers can verify the integrity of the data. The subscriber simply compares a hash of the received data file with the known hash from the trusted source. There are two possible outcomes: the hashes are the same, or they are different.

If the hash results are the same, the systems involved can have an appropriate degree of confidence in the integrity of the received data. If the hash results are different, they can conclude that the received data file has been altered.
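The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the choice of SHA-256 and the function names `sha256_hex` and `is_intact` are assumptions made for the example, since the text does not name a particular hash algorithm.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(received: bytes, trusted_digest: str) -> bool:
    """Compare the hash of the received data with the trusted source's hash.

    hmac.compare_digest performs a constant-time comparison of the two
    hexadecimal strings.
    """
    return hmac.compare_digest(sha256_hex(received), trusted_digest.lower())


# The trusted source publishes the digest of the original file.
original = b"original file contents"
published = sha256_hex(original)

# The subscriber hashes what was received and compares.
print(is_intact(original, published))                    # same hash
print(is_intact(b"original file c0ntents", published))   # different hash
```

For large files, the same digest can be computed incrementally by calling `update()` on the hash object for each chunk read from disk instead of loading the whole file into memory.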

This process is common in P2P networks, for example in the BitTorrent protocol. Once a part of the file is downloaded, the data is checked against the expected hash (a step known as a hash check). Depending on the result, the data is kept or discarded.
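A per-piece hash check of this kind might look like the following sketch. The original BitTorrent protocol hashes fixed-size pieces with SHA-1; everything else here (the tiny piece length, the function names `piece_hashes` and `accept_piece`) is hypothetical and chosen only to keep the example short.

```python
import hashlib

PIECE_LENGTH = 4  # unrealistically small, for illustration only


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> list[bytes]:
    """Split the data into fixed-size pieces and hash each one with SHA-1."""
    return [
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    ]


def accept_piece(index: int, piece: bytes, expected: list[bytes]) -> bool:
    """Hash check: keep the piece only if its hash matches the expected one."""
    return hashlib.sha1(piece).digest() == expected[index]


# The publisher computes the expected hashes from the original file.
expected = piece_hashes(b"abcdefghij")

# A downloader checks each piece as it arrives.
print(accept_piece(0, b"abcd", expected))  # matches, piece is kept
print(accept_piece(1, b"efgX", expected))  # mismatch, piece is discarded
```

Checking each piece separately means a corrupted piece can be discarded and fetched again without throwing away the rest of the download.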

Digital watermarking

Digital watermarking is distinctly different from data hashing. It is the process of altering the original data file so that embedded auxiliary data, referred to as a watermark, can subsequently be recovered.
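To make the idea of altering a file so that auxiliary data can be recovered concrete, here is a minimal sketch using least-significant-bit (LSB) embedding. LSB embedding is only one simple technique, swapped in here for illustration; the text does not specify a method, and the function names are hypothetical. The cover data is treated as a sequence of 8-bit samples, such as greyscale pixel values.

```python
def embed_watermark(cover: bytes, mark: bytes) -> bytes:
    """Write the bits of `mark` into the lowest bit of successive samples."""
    bits = [(byte >> shift) & 1 for byte in mark for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover data is too small to hold the watermark")
    marked = bytearray(cover)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return bytes(marked)


def extract_watermark(marked: bytes, mark_length: int) -> bytes:
    """Recover `mark_length` bytes from the lowest bits of the samples."""
    recovered = bytearray()
    for byte_index in range(mark_length):
        value = 0
        for sample in marked[byte_index * 8:(byte_index + 1) * 8]:
            value = (value << 1) | (sample & 1)
        recovered.append(value)
    return bytes(recovered)


cover = bytes(range(100, 164))           # 64 stand-in sample values
marked = embed_watermark(cover, b"WM")
print(extract_watermark(marked, 2))      # the embedded watermark bytes
```

Each sample changes by at most one unit, so the alteration is small, but a watermark embedded this way is fragile: it would not be expected to survive post-processing such as lossy compression.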

A subscriber, with knowledge of the watermark and how it is recovered, can determine (to a certain extent) whether significant changes have occurred within the data file. Depending on the specific method used, recovery of the embedded auxiliary data can be robust to post-processing (such as lossy compression).

If the data file to be retrieved is an image, the provider can embed a watermark for protection purposes. The process tolerates some change while still maintaining an association with the original image file. Researchers have also developed techniques that embed components of the image within the image itself. This can help identify portions of the image that may contain unauthorized changes and even help recover some of the lost data.
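One way to picture how embedded data can point to the altered portions of an image is the simplified sketch below. It is not the technique from the research mentioned above, and it does not attempt recovery of lost data. It assumes a hypothetical scheme in which the image is a flat sequence of 8-bit samples, split into blocks of 32, and each block carries a hash of its own upper seven bits in its lowest bits. The block size and function names are invented for the example.

```python
import hashlib

BLOCK = 32  # samples per block (hypothetical choice)


def _check_bits(block: bytes) -> list[int]:
    """Derive one check bit per sample from a hash of the block's upper 7 bits."""
    content = bytes(sample & 0xFE for sample in block)
    digest = hashlib.sha256(content).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(len(block))]


def protect(samples: bytes) -> bytes:
    """Embed each block's check bits into the lowest bit of its samples."""
    out = bytearray()
    for start in range(0, len(samples), BLOCK):
        block = samples[start:start + BLOCK]
        out.extend((s & 0xFE) | bit for s, bit in zip(block, _check_bits(block)))
    return bytes(out)


def tampered_blocks(samples: bytes) -> list[int]:
    """Return the indices of blocks whose embedded bits no longer match."""
    bad = []
    for index, start in enumerate(range(0, len(samples), BLOCK)):
        block = samples[start:start + BLOCK]
        if [s & 1 for s in block] != _check_bits(block):
            bad.append(index)
    return bad


image = bytes((i * 7) % 256 for i in range(128))  # stand-in for pixel data
protected = protect(image)

altered = bytearray(protected)
altered[40] ^= 0x80                      # modify one sample in the second block
print(tampered_blocks(protected))        # no blocks flagged
print(tampered_blocks(bytes(altered)))   # the modified block is flagged
```

Because the check is made per block, a mismatch narrows the suspected change down to one region instead of merely reporting that the file as a whole differs.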

A disadvantage of digital watermarking is that some files cannot be significantly altered without sacrificing the quality or utility of the data. This can be true of various files, including image data, audio data, and computer code.

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.