Relational database



 
 


A relational database is a database that groups data using common attributes found in the data set. The resulting "clumps" of organized data are much easier for people to understand.

For example, a data set containing all the real estate transactions in a town can be grouped by the year the transaction occurred; or it can be grouped by the sale price of the transaction; or it can be grouped by the buyer's last name; and so on.

Such a grouping uses the relational model (a technical term for this is schema). Hence such a database is called a "relational database".

The software used to do this grouping is called a relational database management system. The term "relational database" often refers to this type of software.

Contents

Strictly, a relational database is a collection of relations (frequently called tables). Other items are frequently considered part of the database, as they help to organize and structure the data, in addition to forcing the database to conform to a set of requirements.

Terminology


The term relational database was coined and originally defined by Edgar Codd at IBM's San Jose Research Laboratory in 1970.

Relational database theory uses a different set of mathematically based terms, which are equivalent, or roughly equivalent, to SQL database terminology. The table below summarizes some of the most important relational database terms and their SQL database equivalents.

Relational term           SQL equivalent
relation, base relvar     table
derived relvar            view, query result, result set
tuple                     row
attribute                 column


Relations or Tables


A relation is defined as a set of tuples that have the same attributes. A tuple usually represents an object and information about that object. Objects are typically physical objects or concepts. A relation is usually described as a table, which is organized into rows and columns. All the data referenced by an attribute are in the same domain and conform to the same constraints.

The relational model specifies that the tuples of a relation have no specific order and that the tuples, in turn, impose no order on the attributes. Applications access data by specifying queries, which use operations such as select to identify tuples, project to identify attributes, and join to combine relations. Relations can be modified using the insert, delete, and update operators. New tuples can supply explicit values or be derived from a query. Similarly, queries identify tuples for updating or deleting.
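
The operations named above can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite stands in for a relational database management system, and the buyer and sale tables, their columns, and the sample values are hypothetical, loosely based on the real estate example from the introduction.

```python
import sqlite3

# Hypothetical schema for illustration: buyers and real estate sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE buyer (buyer_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE sale  (sale_id INTEGER PRIMARY KEY, buyer_id INTEGER,
                        year INTEGER, price INTEGER);
""")

# Insert: new tuples supplied as explicit values.
conn.executemany("INSERT INTO buyer VALUES (?, ?)", [(1, "Ng"), (2, "Ruiz")])
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [(10, 1, 2007, 250000), (11, 2, 2008, 310000), (12, 1, 2008, 180000)],
)

# Select: identify the tuples that satisfy a condition.
sales_2008 = conn.execute(
    "SELECT * FROM sale WHERE year = 2008 ORDER BY sale_id"
).fetchall()

# Project: identify attributes (DISTINCT removes duplicate tuples).
years = conn.execute("SELECT DISTINCT year FROM sale ORDER BY year").fetchall()

# Join: combine two relations on a common attribute.
joined = conn.execute("""
    SELECT buyer.last_name, sale.price
    FROM sale JOIN buyer ON sale.buyer_id = buyer.buyer_id
    ORDER BY sale.sale_id
""").fetchall()

# Update and delete: a condition identifies the tuples to change or remove.
conn.execute("UPDATE sale SET price = price + 5000 WHERE sale_id = 10")
conn.execute("DELETE FROM sale WHERE year = 2008")
remaining = conn.execute("SELECT sale_id, price FROM sale").fetchall()
```

The ORDER BY clauses are there only to make the output predictable; as the paragraph above notes, the tuples of a relation have no inherent order.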

Base and derived relations


In a relational database, all data are stored and accessed via relations. Relations that store data are called "base relations", and in implementations are called "tables". Other relations do not store data, but are computed by applying relational operations to other relations. These relations are sometimes called "derived relations". In implementations these are called "views" or "queries". Derived relations are convenient in that, though they may draw information from several relations, they act as a single relation. Also, derived relations can be used as an abstraction layer.

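
The difference between base and derived relations can be illustrated with a small sketch using Python's sqlite3 module. The table and view names (buyer, sale, buyer_total) and the data are assumptions made for the example. The view stores nothing itself, so its contents change as soon as a base relation changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base relations (tables): these store data. Names are hypothetical.
    CREATE TABLE buyer (buyer_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE sale  (sale_id INTEGER PRIMARY KEY, buyer_id INTEGER, price INTEGER);
    INSERT INTO buyer VALUES (1, 'Ng'), (2, 'Ruiz');
    INSERT INTO sale  VALUES (10, 1, 250000), (11, 2, 310000), (12, 1, 180000);

    -- Derived relation (view): stores no data, computed from the base relations.
    CREATE VIEW buyer_total AS
        SELECT buyer.last_name AS last_name, SUM(sale.price) AS total
        FROM sale JOIN buyer ON sale.buyer_id = buyer.buyer_id
        GROUP BY buyer.buyer_id, buyer.last_name;
""")

view_query = "SELECT last_name, total FROM buyer_total ORDER BY last_name"

# The view is queried like a single relation, although it draws on two.
before = conn.execute(view_query).fetchall()

# Changing a base relation is reflected the next time the view is read.
conn.execute("INSERT INTO sale VALUES (13, 2, 90000)")
after = conn.execute(view_query).fetchall()
```
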
Domain

A domain describes the set of possible values for a given attribute. Because a domain constrains the attribute's values, it can itself be considered a constraint. Mathematically, attaching a domain to an attribute means that "all values for this attribute must be an element of the specified set."

The character data value 'ABC', for instance, is not in the integer domain. The integer value 123 satisfies the domain constraint.

Constraints

Constraints allow you to further restrict the domain of an attribute. For instance, a constraint can restrict a given integer attribute to values between 1 and 10. Constraints provide one method of implementing business rules in the database. SQL implements constraint functionality in the form of check constraints.

Constraints restrict the data that can be stored in relations. These are usually defined using expressions that result in a boolean value, indicating whether or not the data satisfies the constraint. Constraints can apply to single attributes, to a tuple (restricting combinations of attributes) or to an entire relation.

Since every attribute has an associated domain, every attribute is subject to at least one constraint (its domain constraint). The two principal rules for the relational model are known as entity integrity and referential integrity.
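
A minimal sketch of domain and check constraints, using Python's sqlite3 module, is given here. The review table and its rating column are hypothetical. One SQLite-specific assumption matters: SQLite does not by default reject non-integer values in an INTEGER column, so the domain constraint is written out as an explicit check on typeof(); many other SQL systems enforce the column type directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE review (
        review_id INTEGER PRIMARY KEY,
        rating    INTEGER NOT NULL
                  CHECK (typeof(rating) = 'integer')  -- domain: integers only
                  CHECK (rating BETWEEN 1 AND 10)     -- further restriction
    )
""")

def try_insert(value):
    """Return True if the value is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO review (rating) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False

# 'ABC' is outside the integer domain; 123 is an integer but outside 1..10;
# 7 satisfies both the domain and the range constraint.
results = {value: try_insert(value) for value in ("ABC", 123, 7)}
```

Each check is an expression with a boolean result, evaluated for every tuple that is inserted or updated.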

Foreign keys

A foreign key is a reference to a key in another relation, meaning that the referencing tuple has, as one of its attributes, the values of a key in the referenced tuple. Foreign keys need not have unique values in the referencing relation. Foreign keys effectively use the values of attributes in the referenced relation to restrict the domain of one or more attributes in the referencing relation.

A foreign key could be described formally as: "For all tuples in the referencing relation projected over the referencing attributes, there must exist a tuple in the referenced relation projected over those same attributes such that the values in each of the referencing attributes match the corresponding values in the referenced attributes."
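
The following sketch, again using Python's sqlite3 module with hypothetical buyer and sale tables, shows a foreign key in practice. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection. The final query is one possible way to express the formal statement above: it looks for referencing values that have no match in the referenced relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
conn.executescript("""
    CREATE TABLE buyer (buyer_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE sale (
        sale_id  INTEGER PRIMARY KEY,
        buyer_id INTEGER NOT NULL REFERENCES buyer (buyer_id),
        price    INTEGER
    );
    INSERT INTO buyer VALUES (1, 'Ng');
    -- The foreign key value 1 appears twice: it need not be unique here.
    INSERT INTO sale VALUES (10, 1, 250000);
    INSERT INTO sale VALUES (11, 1, 180000);
""")

# A sale that refers to a buyer that does not exist is rejected.
try:
    conn.execute("INSERT INTO sale VALUES (12, 99, 90000)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# Referencing values with no matching referenced value; empty when the
# foreign key constraint holds.
unmatched = conn.execute("""
    SELECT buyer_id FROM sale
    EXCEPT
    SELECT buyer_id FROM buyer
""").fetchall()

sale_count = conn.execute("SELECT COUNT(*) FROM sale").fetchone()[0]
```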

Stored procedures


A stored procedure is executable code that is associated with, and generally stored in, the database. Stored procedures usually collect and customize common operations, like inserting a tuple into a relation, gathering statistical information about usage patterns, or encapsulating complex business logic and calculations. Frequently they are used as an application programming interface (API) for security or simplicity. Implementations of stored procedures on SQL DBMSs often allow developers to take advantage of procedural extensions (often vendor-specific) to the standard declarative SQL syntax.

Stored procedures are not part of the relational database model, but all commercial implementations include them.

Indices


An index is one way of providing quicker access to data. Indices can be created on any combination of attributes on a relation. Queries that filter using those attributes can find matching tuples directly, by random access through the index, without having to check each tuple in turn. Relational databases typically supply multiple indexing techniques, each of which is optimal for some combination of data distribution, relation size, and typical access pattern. Common techniques include B+ trees, R-trees, and bitmaps.

Indices are usually not considered part of the database, as they are considered an implementation detail, though indices are usually maintained by the same group that maintains the other parts of the database.
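
As a rough illustration, the sketch below creates an index with Python's sqlite3 module and compares the query plan before and after. The sale table, the index name idx_sale_year, and the generated data are assumptions for the example. The exact wording of SQLite's EXPLAIN QUERY PLAN output varies between versions, so the plans are captured as text rather than shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, year INTEGER, price INTEGER)"
)
# 1000 hypothetical sales spread evenly over the years 2000 to 2009.
conn.executemany(
    "INSERT INTO sale (year, price) VALUES (?, ?)",
    [(2000 + i % 10, 100000 + i) for i in range(1000)],
)

query = "SELECT price FROM sale WHERE year = 2008"

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # last column is the description

plan_before = plan(query)  # no index yet: every tuple is checked in turn
conn.execute("CREATE INDEX idx_sale_year ON sale (year)")
plan_after = plan(query)   # the planner can now look up year = 2008 directly

rows = conn.execute(query).fetchall()
```

The query text and its result are the same with or without the index, which is why indices are treated as an implementation detail.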

Relational operations


Queries made against the relational database, and the derived relvars in the database, are expressed in a relational calculus or a relational algebra. In his original relational algebra, Codd introduced eight relational operators in two groups of four operators each. The first four operators were based on the traditional mathematical set operations:

  • The union operator combines the tuples of two relations and removes all duplicate tuples from the result. The relational union operator is equivalent to the SQL UNION operator.
  • The intersection operator produces the set of tuples that two relations share in common. Intersection is implemented in SQL in the form of the INTERSECT operator.
  • The difference operator acts on two relations and produces the set of tuples from the first relation that do not exist in the second relation. Difference is implemented in SQL in the form of the EXCEPT or MINUS operator.
  • The Cartesian product of two relations is a join that is not restricted by any criteria, resulting in every tuple of the first relation being matched with every tuple of the second relation. The Cartesian product is implemented in SQL as the CROSS JOIN operator.
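
The four set-based operators listed above can be tried out with a short sketch using Python's sqlite3 module. The relations a and b and their single attribute x are hypothetical. SQLite spells the difference operator EXCEPT; some other systems use MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

def run(sql):
    """Execute a query and return all result tuples as a list."""
    return conn.execute(sql).fetchall()

# Union: tuples in either relation, duplicates removed.
union = run("SELECT x FROM a UNION SELECT x FROM b ORDER BY x")

# Intersection: tuples the two relations share.
intersection = run("SELECT x FROM a INTERSECT SELECT x FROM b ORDER BY x")

# Difference: tuples of the first relation that are not in the second.
difference = run("SELECT x FROM a EXCEPT SELECT x FROM b ORDER BY x")

# Cartesian product: every tuple of a paired with every tuple of b.
product = run("SELECT a.x, b.x FROM a CROSS JOIN b")
```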


The remaining operators proposed by Codd involve special operations specific to relational databases:

  • The selection, or restriction, operation retrieves tuples from a relation, limiting the results to only those that meet specific criteria, i.e. a subset in terms of set theory. The SQL equivalent of selection is the SELECT query statement with a WHERE clause.
  • The projection operation extracts only the specified attributes from a relation, removing duplicate tuples from the result. In SQL, the column list of a SELECT statement performs the projection, and the DISTINCT keyword or a GROUP BY clause can be used to remove duplicates from the result set.
  • The join operation defined for relational databases is often referred to as a natural join. In this type of join, two relations are connected by their common attributes. SQL's approximation of a natural join is the INNER JOIN operator.
  • The relational division operation is a slightly more complex operation, which involves essentially using the tuples of one relation (the divisor) to partition a second relation (the dividend). The relational division operator is effectively the opposite of the Cartesian product operator (hence the name).
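
A sketch of these four operators using Python's sqlite3 module follows. The purchase and target_year relations are hypothetical. SQL has no division operator, so division is written here with a common workaround, a doubly nested NOT EXISTS; this is one possible formulation, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase (last_name TEXT, year INTEGER);  -- dividend
    CREATE TABLE target_year (year INTEGER);               -- divisor
    INSERT INTO purchase VALUES ('Ng', 2007), ('Ng', 2008), ('Ruiz', 2008);
    INSERT INTO target_year VALUES (2007), (2008);
""")

def run(sql):
    """Execute a query and return all result tuples as a list."""
    return conn.execute(sql).fetchall()

# Selection: only the tuples that meet the criteria.
selection = run("SELECT * FROM purchase WHERE year = 2008 ORDER BY last_name")

# Projection: only the named attribute, with duplicates removed.
projection = run("SELECT DISTINCT last_name FROM purchase ORDER BY last_name")

# Natural join: relations connected by their common attribute (year).
natural_join = run("""
    SELECT last_name, year FROM purchase NATURAL JOIN target_year
    ORDER BY last_name, year
""")

# Division: buyers paired with EVERY year in target_year, i.e. buyers for
# whom no target year exists that lacks a matching purchase.
division = run("""
    SELECT DISTINCT p.last_name FROM purchase AS p
    WHERE NOT EXISTS (
        SELECT 1 FROM target_year AS t
        WHERE NOT EXISTS (
            SELECT 1 FROM purchase AS p2
            WHERE p2.last_name = p.last_name AND p2.year = t.year
        )
    )
""")
```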


Other operators have been introduced or proposed since Codd's introduction of the original eight, including relational comparison operators and extensions that offer support for nesting and hierarchical data, among others.

Normalization

Normalization was first proposed by Codd as an integral part of the relational model. It encompasses a set of best practices designed to eliminate the duplication of data, which in turn prevents data manipulation anomalies and loss of data integrity. The most common forms of normalization applied to databases are called the normal forms. Normalization reduces redundancy at the cost of increased information entropy. Normalization is criticised because it increases the complexity and processing overhead required to join multiple tables representing what is conceptually a single item.
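
The trade-off can be seen in a small sketch using Python's sqlite3 module. The tables sale_flat, buyer, and sale, and their contents, are hypothetical. The flat table repeats the buyer's city on every sale, so a partial update leaves it inconsistent; the normalized design stores the city once, but reading it alongside a sale requires a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized: the buyer's city is duplicated on every sale.
    CREATE TABLE sale_flat (sale_id INTEGER PRIMARY KEY, last_name TEXT,
                            city TEXT, price INTEGER);
    INSERT INTO sale_flat VALUES (10, 'Ng', 'Leeds', 250000),
                                 (11, 'Ng', 'Leeds', 180000);

    -- Normalized: each fact about a buyer is stored exactly once.
    CREATE TABLE buyer (buyer_id INTEGER PRIMARY KEY, last_name TEXT, city TEXT);
    CREATE TABLE sale  (sale_id INTEGER PRIMARY KEY,
                        buyer_id INTEGER REFERENCES buyer (buyer_id),
                        price INTEGER);
    INSERT INTO buyer VALUES (1, 'Ng', 'Leeds');
    INSERT INTO sale  VALUES (10, 1, 250000), (11, 1, 180000);
""")

# Update anomaly: changing one copy leaves the flat table contradicting itself.
conn.execute("UPDATE sale_flat SET city = 'York' WHERE sale_id = 10")
flat_cities = conn.execute(
    "SELECT DISTINCT city FROM sale_flat WHERE last_name = 'Ng' ORDER BY city"
).fetchall()

# Normalized design: one update, no inconsistency, but reads need a join.
conn.execute("UPDATE buyer SET city = 'York' WHERE buyer_id = 1")
normalized_cities = conn.execute("""
    SELECT DISTINCT buyer.city
    FROM sale JOIN buyer ON sale.buyer_id = buyer.buyer_id
""").fetchall()
```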

Relational database management systems

Relational databases, as implemented in relational database management systems, have become a predominant choice for the storage of information in new databases used for financial records, manufacturing and logistical information, personnel data and much more. Relational databases have often replaced legacy hierarchical databases and network databases because they are easier to understand and use, even though they are much less efficient. As computer power has increased, the inefficiencies of relational databases, which made them impractical in earlier times, have been outweighed by their ease of use. However, relational databases have been challenged by object databases, which were introduced in an attempt to address the object-relational impedance mismatch in relational databases, and by XML databases.

The three leading commercial relational database vendors are Oracle, Microsoft, and IBM. The leading open source implementations are MySQL and PostgreSQL.