Data element definition
In metadata, a data element definition is a human-readable phrase or sentence associated with a data element within a data dictionary that describes the meaning or semantics of that data element.

Data element definitions are critical for external users of any data system. Good definitions can dramatically ease the process of mapping one set of data into another set of data. Such mapping is a core task in distributed computing and intelligent agent development.

There are several guidelines that should be followed when creating high-quality data element definitions.

Properties of Clear Definitions

A good definition is:
  1. Precise - The definition should use words that have a precise meaning. Try to avoid words that have multiple meanings or multiple word senses.
  2. Concise - The definition should use the shortest description possible that is still clear.
  3. Non-circular - The definition should not use the term being defined in the definition itself. A definition that does so is known as a circular definition.
  4. Distinct - The definition should differentiate a data element from other data elements. This process is called disambiguation.
  5. Unencumbered - The definition should be free of embedded rationale, functional usage, domain information, or procedural information.
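Two of these properties, non-circularity and concision, lend themselves to simple automated checks. The following is a minimal sketch, not a standard tool: the function name `lint_definition` and the `MAX_WORDS` threshold are assumptions chosen for illustration. The circularity test is a crude heuristic that flags any word shared between the element name and its definition, so it can produce false positives for multi-word element names.

```python
import re

# Hypothetical threshold for "concise"; not taken from any standard.
MAX_WORDS = 25


def _words(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def lint_definition(name, definition):
    """Return a list of problems found in a data element definition."""
    problems = []
    # Circular: a word from the element name reappears in the definition.
    if set(_words(name)) & set(_words(definition)):
        problems.append("circular")
    # Not concise: the definition exceeds the word limit.
    if len(_words(definition)) > MAX_WORDS:
        problems.append("not concise")
    return problems
```

Under these assumptions, `lint_definition("Person", "A person.")` reports a circular definition, while `lint_definition("Person", "An individual human being.")` reports no problems. Precision, distinctness, and freedom from embedded rationale still require human review.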


A data element definition is a required property when adding data elements to a metadata registry.
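As a rough illustration of a definition being a required property, a registry can refuse any element whose definition is missing or blank. The `DataElement` and `MetadataRegistry` classes below are hypothetical and do not reflect the API of any real metadata registry product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """A data element with its name and human-readable definition."""
    name: str
    definition: str


class MetadataRegistry:
    """A toy in-memory registry that requires a definition on every element."""

    def __init__(self):
        self._elements = {}

    def register(self, element):
        # Reject elements whose definition is empty or only whitespace.
        if not element.definition.strip():
            raise ValueError(f"data element {element.name!r} has no definition")
        self._elements[element.name] = element

    def lookup(self, name):
        return self._elements[name]
```

Registering `DataElement("Person", "An individual human being.")` succeeds, whereas registering an element with an empty definition raises `ValueError`.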

Definitions should not refer to terms or concepts that might be misinterpreted by others or that have different meanings based on the context of a situation. Definitions should not contain acronyms that are not clearly defined or linked to other precise definitions.

If you are creating a large number of data elements, all the definitions should be consistent with related concepts.

Critical Data Element -- Not all data elements are of equal importance or value to an organization. A key metadata property of an element is whether it is categorized as a Critical Data Element (CDE). This categorization provides focus for data governance and data quality. An organization often has various sub-categories of CDEs, based on use of the data, e.g.:
  1. Security Coverage -- data elements that are categorized as personal health information (PHI) warrant particular attention for security and access
  2. Marketing Department Usage -- the Marketing department could have a particular set of CDEs for identifying a Unique Customer or for Campaign Management
  3. Finance Department Usage -- the Finance department could have a different set of CDEs from Marketing, focused on data elements that provide measures and metrics for fiscal reporting
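One way to record such sub-categories is as a set of labels attached to each data element. This is only a sketch under assumed names: the `cde_categories` field, the category labels, and the helper `elements_in_category` are hypothetical and not drawn from any standard.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """A data element with optional Critical Data Element (CDE) categories."""
    name: str
    definition: str
    # Labels such as "security", "marketing", or "finance" (hypothetical).
    cde_categories: set = field(default_factory=set)

    @property
    def is_critical(self):
        # An element counts as a CDE if it belongs to at least one category.
        return bool(self.cde_categories)


def elements_in_category(elements, category):
    """Return the names of the elements tagged with the given CDE category."""
    return [e.name for e in elements if category in e.cde_categories]
```

An element may belong to several categories at once, which is why a set is used rather than a single value. An element with no categories is simply not treated as critical.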

Standards such as the ISO/IEC 11179 Metadata Registry specification give guidelines for creating precise data element definitions. Specifically, Part 4 of the ISO/IEC 11179 metadata registry standard covers data element definition quality standards: http://standards.iso.org/ittf/PubliclyAvailableStandards/c035346_ISO_IEC_11179-4_2004(E).zip.

Using precise words

Common words such as play or run frequently have many meanings. For example, the WordNet database documents over 57 distinct meanings for the word "play" but only a single definition for the term dramatic play. Words with fewer definitions in their dictionary entry are preferable, because they minimize misinterpretation related to a reader's context and background. The process of identifying which meaning of a word is intended is called word sense disambiguation.

Examples of definitions that could be improved

Here is the definition of the "person" data element as defined in the www.w3c.org Friend of a Friend specification:

Person: A person.

Although most people have an intuitive understanding of what a person is, the definition has much room for improvement. The main problem is that the definition is circular: it does not help most readers and needs to be clarified.

Here is the definition of the "Person" data element in the Global Justice XML Data Model 3.0:

person: Describes inherent and frequently associated characteristics of a person.

Note that this definition is also circular. The definition of person should not reference the term itself; it should use terms other than person to describe what a person is.

Here is a more precise and shorter definition of a person:

Person: An individual human being.

Note that it uses the word individual to state that this is an instance of a class of things called human being. Technically you might use "Homo sapiens" in your definition, but more people are familiar with the term "human being" than with "Homo sapiens", so commonly used terms are preferred as long as they are still precise.

Sometimes the definitions in your system may embed cultural norms and assumptions. For example, if your "Person" data element tracked characters in a science fiction series that included aliens, you might need a more general term than human being.

Person: An individual of a sentient species.

See also

  • Data dictionary
  • Data element
  • Global Justice XML Data Model
  • NIEM
  • ISO/IEC 11179
  • Metadata
  • Metadata registry

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 