Data

The term data refers to qualitative or quantitative attributes of a variable or set of variables. Data (plural of "datum") are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which information and then knowledge are derived. Raw data, i.e. unprocessed data, refers to a collection of numbers, characters, images or other outputs from devices that collect information to convert physical quantities into symbols.

The word data (ˈdeɪtə, ˈdætə, or ˈdɑːtə) is the Latin plural of datum, the neuter past participle of dare, "to give", hence "something given". In discussions of problems in geometry, mathematics, engineering, and so on, the terms givens and data are used interchangeably. Data can also be understood as a representation of a fact, figure, or idea. Such usage is the origin of data as a concept in computer science: data are numbers, words, images, etc., accepted as they stand.

Usage in English


In English, the word datum is still used in the general sense of "an item given". In cartography, geography, nuclear magnetic resonance and technical drawing it is often used to refer to a single specific reference datum from which distances to all other data are measured. Any measurement or result is a datum, but data point is more usual, albeit tautological. Both datums (see usage in datum article) and the originally Latin plural data are used as the plural of datum in English, but data is commonly treated as a mass noun and used with a verb in the singular form, especially in day-to-day usage. For example, "This is all the data from the experiment." This usage is inconsistent with the rules of Latin grammar and traditional English ("These are all the data from the experiment"). Even when a very small quantity of data is referenced (one number, for example) the phrase piece of data is often used, as opposed to datum. The debate over appropriate usage is ongoing.

The IEEE Computer Society allows usage of data as either a mass noun or plural based on author preference. Other professional organizations and style guides require that authors treat data as a plural noun. For example, the Air Force Flight Test Center specifically states that the word data is always plural, never singular.

Data is accepted as a singular mass noun in everyday educated usage. Some major newspapers such as The New York Times use it either in the singular or plural. In The New York Times the phrases "the survey data are still being analyzed" and "the first year for which data is available" have appeared within one day.

In scientific writing data is often treated as a plural, as in "These data do not support the conclusions", but it is also used as a singular mass entity like information. British usage now widely accepts treating data as singular in standard English, including everyday newspaper usage, at least in non-scientific use. UK scientific publishing still prefers treating it as a plural. Some UK university style guides recommend using data for both singular and plural use and some recommend treating it only as a singular in connection with computers.

Meaning of data, information and knowledge


The terms data, information and knowledge are frequently used for overlapping concepts. The main difference is in the level of abstraction being considered. Data is the lowest level of abstraction, information is the next level, and knowledge is the highest of the three. Data on its own carries no meaning. For data to become information, it must be interpreted and take on a meaning. For example, the height of Mt. Everest is generally considered "data", a book on Mt. Everest's geological characteristics may be considered "information", and a report containing practical information on the best way to reach Mt. Everest's peak may be considered "knowledge".

Information as a concept bears a diversity of meanings, from everyday usage to technical settings. Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, and representation.

Beynon-Davies uses the concept of a sign to distinguish between data and information; data are symbols while information occurs when symbols are used to refer to something.
It is people and computers who collect data and impose patterns on it. These patterns are seen as information which can be used to enhance knowledge. These patterns can be interpreted as truth, and are authorized as aesthetic and ethical criteria. Events that leave behind perceivable physical or virtual remains can be traced back through data. Marks are no longer considered data once the link between the mark and observation is broken.
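
The distinction between symbols and what they refer to can be illustrated with a short Python sketch. The four byte values are arbitrary and chosen only for illustration; they are not from any source. The same symbols yield different information depending on the interpretation imposed on them.

```python
import struct

# Four raw bytes (arbitrary illustrative values): on their own, just symbols.
raw = bytes([0x42, 0x28, 0x00, 0x00])

# Interpretation 1: a big-endian unsigned 32-bit integer.
as_int = int.from_bytes(raw, byteorder="big")

# Interpretation 2: a big-endian IEEE 754 single-precision float.
as_float = struct.unpack(">f", raw)[0]

# Interpretation 3: the first two bytes read as ASCII characters.
as_text = raw[:2].decode("ascii")

print(as_int, as_float, as_text)
```

Nothing in the bytes themselves says which reading is intended; that link is supplied by whoever collects or consumes the data.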

Raw data refers to a collection of unprocessed numbers, characters, images or other outputs from devices that convert physical quantities into symbols. Such data is typically further processed by a human or input into a computer, stored and processed there, or transmitted (output) to another human or computer (possibly through a data cable). Raw data is a relative term; data processing commonly occurs by stages, and the "processed data" from one stage may be considered the "raw data" of the next.
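
The idea that processing occurs by stages can be sketched in Python. The sensor readings and the function names `parse_stage` and `summarise_stage` are hypothetical, introduced only for this illustration.

```python
from statistics import mean

# Hypothetical raw output of a temperature sensor: unvalidated text readings.
raw_readings = ["21.5", "22.0", "", "err", "23.5"]


def parse_stage(lines):
    """Stage 1: turn raw text into numbers, dropping unreadable entries."""
    parsed = []
    for line in lines:
        try:
            parsed.append(float(line))
        except ValueError:
            continue  # not a number; skip it
    return parsed


def summarise_stage(values):
    """Stage 2: treats the output of stage 1 as its own raw data."""
    return {"count": len(values), "mean": mean(values)}


parsed = parse_stage(raw_readings)   # "processed" from stage 1's point of view
summary = summarise_stage(parsed)    # but "raw" input for stage 2
print(parsed, summary)
```

Whether `parsed` counts as raw or processed depends only on which stage is looking at it.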

Mechanical computing devices are classified according to the means by which they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a datum as a sequence of symbols drawn from a fixed alphabet. The most common digital computers use a binary alphabet, that is, an alphabet of two characters, typically denoted "0" and "1". More familiar representations, such as numbers or letters, are then constructed from the binary alphabet.
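
As a minimal sketch of how familiar representations can be built from a two-character alphabet, the following Python helpers write integers and ASCII letters as strings of "0" and "1" and read them back. The helper names and the 8-symbol width are assumptions chosen for the example, not a description of any particular machine.

```python
def to_bits(number, width=8):
    """Write a non-negative integer as a string over the alphabet {0, 1}."""
    return format(number, "b").zfill(width)


def from_bits(bits):
    """Read a string of 0s and 1s back as an integer."""
    return int(bits, 2)


def text_to_bits(text):
    """Encode ASCII text as one 8-symbol group per character."""
    return [to_bits(code) for code in text.encode("ascii")]


def bits_to_text(groups):
    """Decode 8-symbol groups back into ASCII text."""
    return bytes(from_bits(group) for group in groups).decode("ascii")


print(to_bits(5))
print(text_to_bits("Hi"))
print(bits_to_text(text_to_bits("Hi")))
```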

Some special forms of data are distinguished. A computer program is a collection of data, which can be interpreted as instructions. Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata, that is, a description of other data. A similar yet earlier term for metadata is "ancillary data." The prototypical example of metadata is the library catalog, which is a description of the contents of books.
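
The idea of a program as data can be loosely imitated in Python by storing an arithmetic expression as nested tuples, in the spirit of Lisp. This is only an illustrative sketch: the tuple notation, the `OPS` table, and the `evaluate` function are assumptions made for the example and do not reproduce how Lisp itself works.

```python
import operator

# Hypothetical operator table for this sketch.
OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Interpret a nested tuple such as ("+", 1, 2) as instructions."""
    if isinstance(expr, (int, float)):
        return expr  # a plain number evaluates to itself
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result


# The "program" is an ordinary tuple: it can be stored, inspected, or extended.
program = ("+", 1, ("*", 2, 3))

# Building a new program out of an existing one, purely by manipulating data.
doubled = ("*", 2, program)

print(evaluate(program), evaluate(doubled))
```

The tuple is just data until `evaluate` interprets it as instructions.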

Experimental data refers to data generated within the context of a scientific investigation by observation and recording. Field data refers to raw data collected in an uncontrolled in situ environment.

See also


  • Biological data
  • Data acquisition
  • Data analysis
  • Data cable
  • Data domain
  • Data element
  • Data farming
  • Data governance
  • Data integrity
  • Data maintenance
  • Data management
  • Data mining
  • Data modeling
  • Computer data processing
  • Data remanence
  • Data set
  • Data warehouse
  • Database
  • Datasheet
  • Environmental data rescue
  • Fieldwork
  • Metadata
  • Scientific data archiving
  • Statistics
  • Data structure

