Database model
Encyclopedia
A database model is the theoretical foundation of a database
Database
A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

 and fundamentally determines in which manner data
Data
The term data refers to qualitative or quantitative attributes of a variable or set of variables. Data are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which...

 can be stored, organized, and manipulated in a database system. It thereby defines the infrastructure offered by a particular database system. The most popular example of a database model is the relational model
Relational model
The relational model for database management is a database model based on first-order predicate logic, first formulated and proposed in 1969 by Edgar F...

.

Overview

A database model is a theory or specification describing how a database
Database
A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

 is structured and used. Several such models have been suggested.

Common models include:
  • Hierarchical model
  • Network model
    Network model
    The network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or lattice.The...

  • Relational model
    Relational model
    The relational model for database management is a database model based on first-order predicate logic, first formulated and proposed in 1969 by Edgar F...

  • Entity-relationship
  • Object-relational model
  • Object model
    Object model
    In computing, object model has two related but distinct meanings:# The properties of objects in general in a specific computer programming language, technology, notation or methodology that uses them. For example, the Java objects model, the COM object model, or the object model of OMT...



A data model
Data model
A data model in software engineering is an abstract model, that documents and organizes the business data for communication between team members and is used as a plan for developing applications, specifically how data is stored and accessed....

 is not just a way of structuring data: it also defines a set of operations that can be performed on the data. The relational model, for example, defines operations such as select
Select (SQL)
The SQL SELECT statement returns a result set of records from one or more tables.A SELECT statement retrieves zero or more rows from one or more database tables or database views. In most applications, SELECT is the most commonly used Data Manipulation Language command...

 (project) and join
Join (SQL)
An SQL join clause combines records from two or more tables in a database. It creates a set that can be saved as a table or used as is. A JOIN is a means for combining fields from two tables by using values common to each. ANSI standard SQL specifies four types of JOINs: INNER, OUTER, LEFT, and RIGHT...

. Although these operations may not be explicit in a particular query language
Query language
Query languages are computer languages used to make queries into databases and information systems.Broadly, query languages can be classified according to whether they are database query languages or information retrieval query languages...

, they provide the foundation on which a query language is built.

Models

Various techniques are used to model data structure. Most database systems are built around one particular data model, although it is possible for products to offer support for more than one model. For any logical model
Logical data model
A logical data model in systems engineering is a representation of an organization's data, organized in terms of entities and relationships and is independent of any particular data management technology.- Overview :...

 various physical implementations may be possible, and most products will offer the user some level of control in tuning the physical implementation, since the choices that are made have a significant effect on performance.

Flat model


The flat (or table) model
Flat file database
A flat file database describes any of various means to encode a database model as a single file .- Overview :...

 consists of a single, two-dimensional array of data
Data
The term data refers to qualitative or quantitative attributes of a variable or set of variables. Data are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which...

 elements, where all members of a given column are assumed to be similar values, and all members of a row are assumed to be related to one another. For instance, columns for name and password that might be used as a part of a system security database. Each row would have the specific password associated with an individual user. Columns of the table often have a type associated with them, defining them as character data, date or time information, integers, or floating point numbers. This may not strictly qualify as a data model, as defined above.

Hierarchical model


In a hierarchical model, data is organized into a tree-like structure, implying a single upward link in each record to describe the nesting, and a sort field to keep the records in a particular order in each same-level list. Hierarchical structures were widely used in the early mainframe database management systems, such as the Information Management System
Information Management System
IBM Information Management System is a joint hierarchical database and information management system with extensive transaction processing capabilities.- History :...

 (IMS) by IBM
IBM
International Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...

, and now describe the structure of XML
XML
Extensible Markup Language is a set of rules for encoding documents in machine-readable form. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards....

 documents. This structure allows one 1:M relationship between two types of data. This structure is very efficient to describe many relationships in the real world; recipes, table of contents, ordering of paragraphs/verses, any nested and sorted information. However, the hierarchical structure is inefficient for certain database operations when a full path (as opposed to upward link and sort field) is not also included for each record.

Mother–child relationship: Child may only have one mother but a mother can have multiple children. Mothers and children are tied together by links called "pointers". A mother will have a list of pointers to each of her children.

Network model


The network model (defined by the CODASYL
CODASYL
CODASYL is an acronym for "Conference on Data Systems Languages". This was a consortium formed in 1959 to guide the development of a standard programming language that could be used on many computers...

 specification) organizes data using two fundamental concepts, called records and sets. Records contain fields (which may be organized hierarchically, as in the programming language COBOL
COBOL
COBOL is one of the oldest programming languages. Its name is an acronym for COmmon Business-Oriented Language, defining its primary domain in business, finance, and administrative systems for companies and governments....

). Sets (not to be confused with mathematical sets) define one-to-many
One-to-many
One-to-many may refer to:* Multivalued function, a one-to-many function in mathematics* Fat link, a one-to-many link in hypertext* Point-to-multipoint communication, communication which has a one-to-many relation-See also:*One-to-one...

 relationships between records: one owner, many members. A record may be an owner in any number of sets, and a member in any number of sets.

The network model is a variation on the hierarchical model, to the extent that it is built on the concept of multiple branches (lower-level structures) emanating from one or more nodes (higher-level structures), while the model differs from the hierarchical model in that branches can be connected to multiple nodes. The network model is able to represent redundancy in data more efficiently than in the hierarchical model.

The operations of the network model are navigational in style: a program maintains a current position, and navigates from one record to another by following the relationships in which the record participates. Records can also be located by supplying key values.

Although it is not an essential feature of the model, network databases generally implement the set relationships by means of pointers that directly address the location of a record on disk. This gives excellent retrieval performance, at the expense of operations such as database loading and reorganization.

Most object database
Object database
An object database is a database management system in which information is represented in the form of objects as used in object-oriented programming...

s use the navigational concept to provide fast navigation across networks of objects, generally using object identifiers as "smart" pointers to related objects. Objectivity/DB
Objectivity/DB
Objectivity/DB is a commercial object database produced by Objectivity, Inc. It allows applications to make standard C++, Java, Python or Smalltalk objects persistent without having to convert the data objects into the rows and columns used by a relational database management system ....

, for instance, implements named 1:1, 1:many, many:1 and many:many named relationships that can cross databases. Many object databases also support SQL
SQL
SQL is a programming language designed for managing data in relational database management systems ....

, combining the strengths of both models.

Relational model

The relational model
Relational model
The relational model for database management is a database model based on first-order predicate logic, first formulated and proposed in 1969 by Edgar F...

 was introduced by E.F. Codd in 1970 as a way to make database management systems more independent of any particular application. It is a mathematical model defined in terms of predicate logic
Predicate logic
In mathematical logic, predicate logic is the generic term for symbolic formal systems like first-order logic, second-order logic, many-sorted logic or infinitary logic. This formal system is distinguished from other systems in that its formulae contain variables which can be quantified...

 and set theory
Set theory
Set theory is the branch of mathematics that studies sets, which are collections of objects. Although any type of object can be collected into a set, set theory is applied most often to objects that are relevant to mathematics...

.

The products that are generally referred to as relational database
Relational database
A relational database is a database that conforms to relational model theory. The software used in a relational database is called a relational database management system . Colloquial use of the term "relational database" may refer to the RDBMS software, or the relational database itself...

s in fact implement a model that is only an approximation to the mathematical model defined by Codd. Three key terms are used extensively in relational database models: relation
Relation (database)
In relational model:A relation value, which is assigned to a certain relation variable, is time-varying. By using a Data Definition Language , it is able to define relation variables.The following is an example of a heading which consists of three attributes....

s
, attributes, and domain
Data domain
In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values....

s
. A relation is a table with columns and rows. The named columns of the relation are called attributes, and the domain is the set of values the attributes are allowed to take.

The basic data structure of the relational model is the table, where information about a particular entity (say, an employee) is represented in rows (also called tuple
Tuple
In mathematics and computer science, a tuple is an ordered list of elements. In set theory, an n-tuple is a sequence of n elements, where n is a positive integer. There is also one 0-tuple, an empty sequence. An n-tuple is defined inductively using the construction of an ordered pair...

s) and columns. Thus, the "relation
Relation (database)
In relational model:A relation value, which is assigned to a certain relation variable, is time-varying. By using a Data Definition Language , it is able to define relation variables.The following is an example of a heading which consists of three attributes....

" in "relational database" refers to the various tables in the database; a relation is a set of tuples. The columns enumerate the various attributes of the entity (the employee's name, address or phone number, for example), and a row is an actual instance of the entity (a specific employee) that is represented by the relation. As a result, each tuple of the employee table represents various attributes of a single employee.

All relations (and, thus, tables) in a relational database have to adhere to some basic rules to qualify as relations. First, the ordering of columns is immaterial in a table. Second, there can't be identical tuples or rows in a table. And third, each tuple will contain a single value for each of its attributes.

A relational database contains multiple tables, each similar to the one in the "flat" database model. One of the strengths of the relational model is that, in principle, any value occurring in two different records (belonging to the same table or to different tables), implies a relationship among those two records. Yet, in order to enforce explicit integrity constraints
Integrity constraints
Integrity constraints are used to ensure accuracy and consistency of data in a relational database. Data integrity is handled in a relational database through the concept of referential integrity...

,
relationships between records in tables can also be defined explicitly, by identifying or non-identifying parent-child relationships characterized by assigning cardinality (1:1, (0)1:M, M:M). Tables can also have a designated single attribute or a set of attributes that can act as a "key", which can be used to uniquely identify each tuple in the table.

A key that can be used to uniquely identify a row in a table is called a primary key. Keys are commonly used to join or combine data from two or more tables. For example, an Employee table may contain a column named Location which contains a value that matches the key of a Location table. Keys are also critical in the creation of indexes, which facilitate fast retrieval of data from large tables. Any column can be a key, or multiple columns can be grouped together into a compound key. It is not necessary to define all the keys in advance; a column can be used as a key even if it was not originally intended to be one.

A key that has an external, real-world meaning (such as a person's name, a book's ISBN, or a car's serial number) is sometimes called a "natural" key. If no natural key is suitable (think of the many people named Brown), an arbitrary or surrogate key can be assigned (such as by giving employees ID numbers). In practice, most databases have both generated and natural keys, because generated keys can be used internally to create links between rows that cannot break, while natural keys can be used, less reliably, for searches and for integration with other databases. (For example, records in two independently developed databases could be matched up by social security number
Social Security number
In the United States, a Social Security number is a nine-digit number issued to U.S. citizens, permanent residents, and temporary residents under section 205 of the Social Security Act, codified as . The number is issued to an individual by the Social Security Administration, an independent...

, except when the social security numbers are incorrect, missing, or have changed.)

Dimensional model

The dimensional model is a specialized adaptation of the relational model used to represent data in data warehouse
Data warehouse
In computing, a data warehouse is a database used for reporting and analysis. The data stored in the warehouse is uploaded from the operational systems. The data may pass through an operational data store for additional operations before it is used in the DW for reporting.A data warehouse...

s in a way that data can be easily summarized using OLAP
OLAP
In computing, online analytical processing, or OLAP , is an approach to swiftly answer multi-dimensional analytical queries. OLAP is part of the broader category of business intelligence, which also encompasses relational reporting and data mining...

 queries. In the dimensional model, a database schema consists of a single large table of facts that are described using dimensions and measures. A dimension provides the context of a fact (such as who participated, when and where it happened, and its type) and is used in queries to group related facts together. Dimensions tend to be discrete and are often hierarchical; for example, the location might include the building, state, and country. A measure is a quantity describing the fact, such as revenue. It's important that measures can be meaningfully aggregated--for example, the revenue from different locations can be added together.

In an OLAP query, dimensions are chosen and the facts are grouped and aggregated together to create a summary.

The dimensional model is often implemented on top of the relational model using a star schema
Star schema
In computing, the star schema is the simplest style of data warehouse schema. The star schema consists of one or more fact tables referencing any number of dimension tables...

, consisting of one highly normalized table containing the facts, and surrounding denormalized tables containing each dimension. An alternative physical implementation, called a snowflake schema
Snowflake schema
In computing, a snowflake schema is a logical arrangement of tables in a multidimensional database such that the entity relationship diagram resembles a snowflake in shape. The snowflake schema is represented by centralized fact tables which are connected to multiple dimensions.The snowflake schema...

, normalizes multi-level hierarchies within a dimension into multiple tables.

A data warehouse can contain multiple dimensional schemas that share dimension tables, allowing them to be used together. Coming up with a standard set of dimensions is an important part of dimensional modeling
Dimensional modeling
Dimensional modeling is the name of a set of techniques and concepts used in data warehouse design. It is considered to be different from entity-relationship modeling . Dimensional Modeling does not necessarily involve a relational database. The same modeling approach, at the logical level, can be...

.

Objectional database models

In recent years, the object-oriented paradigm has been applied to database technology, creating a new programming model known as object database
Object database
An object database is a database management system in which information is represented in the form of objects as used in object-oriented programming...

s. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system
Type system
A type system associates a type with each computed value. By examining the flow of these values, a type system attempts to ensure or prove that no type errors can occur...

 as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch
Object-Relational impedance mismatch
The object-relational impedance mismatch is a set of conceptual and technical difficulties that are often encountered when a relational database management system is being used by a program written in an object-oriented programming language or style; particularly when objects or class definitions...

) of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time, object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases.

A variety of these ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects manipulated by the program persistent
Persistence (computer science)
Persistence in computer science refers to the characteristic of state that outlives the process that created it. Without this capability, state would only exist in RAM, and would be lost when this RAM loses power, such as a computer shutdown....

. This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities.

Object databases suffered because of a lack of standardization: although standards were defined by ODMG, they were never implemented well enough to ensure interoperability between products. Nevertheless, object databases have been used successfully in many applications: usually specialized applications such as engineering databases or molecular biology databases rather than mainstream commercial data processing. However, object database ideas were picked up by the relational vendors and influenced extensions made to these products and indeed to the SQL
SQL
SQL is a programming language designed for managing data in relational database management systems ....

language.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK