Dimension table
In data warehousing, a dimension table is one of the set of companion tables to a fact table.

The fact table contains business facts (or measures) and foreign keys that refer to candidate keys (normally primary keys) in the dimension tables.

In contrast to fact tables, dimension tables contain descriptive attributes (or fields), which are typically textual fields or discrete numbers that behave like text. These attributes are designed to serve two critical purposes: query constraining/filtering and query result set labeling.
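The two purposes can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names (dim_product, fact_sales), the columns, and the sample rows are all hypothetical and chosen only for illustration. The dimension attribute category constrains the query, and product_name labels the result rows.

```python
import sqlite3

# In-memory database holding one dimension table and one fact table.
# All table names, column names, and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,  -- meaningless integer key
    product_name TEXT NOT NULL,        -- descriptive, textual attribute
    category     TEXT NOT NULL
);
CREATE TABLE fact_sales (
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    quantity    INTEGER NOT NULL,      -- measure
    amount      REAL NOT NULL          -- measure
);
INSERT INTO dim_product VALUES
    (1, 'Espresso Beans 1kg', 'Coffee'),
    (2, 'Green Tea 100 bags', 'Tea'),
    (3, 'Decaf Beans 1kg', 'Coffee');
INSERT INTO fact_sales VALUES
    (1, 2, 40.0), (1, 1, 20.0), (2, 5, 25.0), (3, 1, 18.0);
""")

# The dimension attribute `category` filters the facts;
# the dimension attribute `product_name` labels each result row.
rows = conn.execute("""
    SELECT d.product_name, SUM(f.quantity), SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    WHERE d.category = 'Coffee'
    GROUP BY d.product_name
    ORDER BY d.product_name
""").fetchall()

for name, quantity, amount in rows:
    print(name, quantity, amount)
```

The fact table itself stores only keys and numbers; every human-readable value in the output comes from the dimension table.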

Dimension attributes should be:
  • Verbose - labels consisting of full words,
  • Descriptive,
  • Complete - no missing values,
  • Discretely valued - only one value per row in the dimension table,
  • Quality assured - no misspellings, no impossible values.
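Some of these properties can be checked mechanically. The following is a rough sketch of such a check, assuming dimension rows are held as Python dictionaries; the function name audit_dimension, its parameters, and the sample rows are hypothetical. It covers only completeness and a simple allowed-values test, not verbosity or descriptiveness.

```python
def audit_dimension(rows, required, allowed):
    """Report attribute problems as (row_index, attribute, problem) tuples.

    rows     -- list of dicts, one per dimension row
    required -- attribute names that must not be missing or blank
    allowed  -- dict mapping an attribute name to its set of valid values
    """
    problems = []
    for index, row in enumerate(rows):
        # Completeness: no missing or blank values.
        for attr in required:
            value = row.get(attr)
            if value is None or str(value).strip() == "":
                problems.append((index, attr, "missing"))
        # Quality: value must belong to the known domain.
        for attr, domain in allowed.items():
            value = row.get(attr)
            if value is not None and value not in domain:
                problems.append((index, attr, "not in allowed values"))
    return problems


sample_rows = [
    {"product_name": "Espresso Beans 1kg", "category": "Coffee"},
    {"product_name": "", "category": "Cofee"},  # blank name, misspelled category
]
problems = audit_dimension(
    sample_rows,
    required=["product_name", "category"],
    allowed={"category": {"Coffee", "Tea"}},
)
print(problems)
```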


Each dimension table row is uniquely identified by a single key field. It is recommended that the key field be a simple integer, because the key value is meaningless and is used only to join the fact and dimension tables.

Using surrogate dimension keys brings several advantages, including:
  • Performance - join processing is much more efficient when a single-field surrogate key is used,
  • Buffer from operational key management practices - prevents situations in which removed data rows appear to return because their natural keys were reused or reassigned after a long period of dormancy,
  • Mapping to integrate disparate sources,
  • Handling of unknown or not applicable connections,
  • Tracking of changes in dimension attribute values.
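Two of these advantages, mapping disparate sources and handling unknown connections, can be sketched in a few lines of Python. The class SurrogateKeyMap, the reserved key 0 for the unknown member, and the source system names are assumptions made for this illustration, not a standard API.

```python
from itertools import count

# Assumption: key 0 is reserved for a dimension row meaning "unknown / not applicable".
UNKNOWN_KEY = 0


class SurrogateKeyMap:
    """Maps (source_system, natural_key) pairs to integer surrogate keys."""

    def __init__(self):
        self._next = count(1)      # surrogate keys 1, 2, 3, ...
        self._by_natural = {}      # (source_system, natural_key) -> surrogate key

    def assign(self, source_system, natural_key, same_as=None):
        """Return the surrogate key for a natural key, creating one if needed.

        same_as -- an existing surrogate key to reuse when another source
                   system describes the same real-world entity.
        """
        ident = (source_system, natural_key)
        if ident not in self._by_natural:
            new_key = same_as if same_as is not None else next(self._next)
            self._by_natural[ident] = new_key
        return self._by_natural[ident]

    def lookup(self, source_system, natural_key):
        """Return the surrogate key, or UNKNOWN_KEY when there is no match."""
        if natural_key is None:
            return UNKNOWN_KEY
        return self._by_natural.get((source_system, natural_key), UNKNOWN_KEY)


keys = SurrogateKeyMap()
crm_key = keys.assign("crm", "C-1001")
# The ERP system uses a different natural key for the same customer.
erp_key = keys.assign("erp", "884213", same_as=crm_key)
other_key = keys.assign("crm", "C-1002")
missing_key = keys.lookup("crm", None)  # fact row arrived without a customer
```

Because fact rows store only the surrogate key, a fact with no usable natural key can still join to the dimension through the reserved unknown row.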


Using surrogate keys also brings additional costs because of the burden placed on the ETL (extract, transform, load) system. Still, pipeline processing can be improved, and ETL tools offer built-in, improved surrogate key processing.

The goal of a dimension table is to create standardized, conformed dimensions that can be shared across the enterprise's data warehouse environment and joined to multiple fact tables representing various business processes.

Conformed dimensions are highly important to the enterprise nature of a DW/BI system for the following reasons:
  • Consistency - every fact table is filtered consistently, and query results are labeled consistently,
  • Integration - queries can drill into the fact tables of different processes separately, one fact table at a time, and then join the results on common dimension attributes,
  • Reduced development time to market - the common dimensions are available without reinventing the wheel.
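The integration point can be sketched as follows, again with Python's sqlite3 module. Two hypothetical fact tables, fact_sales and fact_returns, share one conformed dimension, dim_product. Each fact table is queried on its own, and the two result sets are then combined on the common attribute category. All names and data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT NOT NULL);
CREATE TABLE fact_sales   (product_key INTEGER NOT NULL, amount REAL NOT NULL);
CREATE TABLE fact_returns (product_key INTEGER NOT NULL, amount REAL NOT NULL);
INSERT INTO dim_product VALUES (1, 'Coffee'), (2, 'Tea');
INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 30.0);
INSERT INTO fact_returns VALUES (1, 10.0);
""")

KNOWN_FACT_TABLES = {"fact_sales", "fact_returns"}


def total_by_category(fact_table):
    """Aggregate one fact table by the shared dimension attribute."""
    # The table name is interpolated into SQL, so accept only known names.
    if fact_table not in KNOWN_FACT_TABLES:
        raise ValueError(f"unknown fact table: {fact_table}")
    sql = f"""
        SELECT d.category, SUM(f.amount)
        FROM {fact_table} AS f
        JOIN dim_product AS d ON d.product_key = f.product_key
        GROUP BY d.category
    """
    return dict(conn.execute(sql).fetchall())


# Step 1: query each business process separately.
sales = total_by_category("fact_sales")
returns = total_by_category("fact_returns")

# Step 2: join the two result sets on the common dimension attribute.
report = [
    (category, sales.get(category, 0.0), returns.get(category, 0.0))
    for category in sorted(sales.keys() | returns.keys())
]
for line in report:
    print(line)
```

Joining the aggregated results, rather than joining the two fact tables directly, avoids multiplying rows when one product has several sales and several returns.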


Over time, the attributes of a given row in a dimension table may change. For example, the shipping address for a company may change. Kimball refers to this phenomenon as Slowly Changing Dimensions. Strategies for dealing with this kind of change are divided into three categories:
  • Type One - Simply overwrite the old value(s).
  • Type Two - Add a new row containing the new value(s), and distinguish between the rows using tuple-versioning techniques.
  • Type Three - Add a new attribute to the existing row.
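The three strategies can be compared in a small Python sketch that applies the same address change to a one-row customer dimension. The row layout is hypothetical. In particular, the valid_from and valid_to columns are just one common way to version rows for Type Two, and previous_address is an assumed name for the added Type Three attribute.

```python
from datetime import date


def scd_type1(rows, customer_id, new_address):
    """Type One: overwrite the old value; history is lost."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["address"] = new_address


def scd_type2(rows, customer_id, new_address, change_date):
    """Type Two: close the current row and add a new row with a new surrogate key."""
    for row in rows:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            row["valid_to"] = change_date
    next_key = max((r["customer_key"] for r in rows), default=0) + 1
    rows.append({
        "customer_key": next_key,
        "customer_id": customer_id,
        "address": new_address,
        "valid_from": change_date,
        "valid_to": None,          # None marks the current version
    })


def scd_type3(rows, customer_id, new_address):
    """Type Three: keep the prior value in an extra attribute on the same row."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["previous_address"] = row["address"]
            row["address"] = new_address


def new_dimension():
    """Return a fresh one-row dimension so each strategy starts from the same state."""
    return [{
        "customer_key": 1,
        "customer_id": "C-1001",
        "address": "1 Old Road",
        "valid_from": date(2020, 1, 1),
        "valid_to": None,
    }]


type1_rows = new_dimension()
scd_type1(type1_rows, "C-1001", "9 New Street")

type2_rows = new_dimension()
scd_type2(type2_rows, "C-1001", "9 New Street", date(2024, 6, 1))

type3_rows = new_dimension()
scd_type3(type3_rows, "C-1001", "9 New Street")
```

Only Type Two changes the number of rows: facts recorded before the change keep pointing at surrogate key 1, while later facts would use key 2.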
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 