Concept-oriented model
The concept-oriented model (COM) is a data model based on the following three principles:
  • Duality principle postulates that any element consists of two parts, called identity and entity. Accordingly, data modelling is divided into two orthogonal branches: identity modelling and entity modelling.
  • Inclusion principle postulates that elements exist within a hierarchy in which each element has a super-element specified via the inclusion relation. All elements are identified by hierarchical domain-specific addresses.
  • Order principle postulates that elements exist within a partially ordered set in which each element has a number of greater and lesser elements. A reference stored in an element is assumed to represent a greater element.


A data element in the concept-oriented model is defined as a pair consisting of one identity and one entity, both having a domain-specific structure. The model uses two orthogonal relations for data organization and manipulation: inclusion and partial order. Thus any element participates in two structures simultaneously: it is a member of a hierarchy (tree) and a member of a partially ordered set. The main purpose of the hierarchical structure is to model hierarchical address spaces in which each element has a unique domain-specific identity. The main purpose of the partial order is to describe data semantics.
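As a rough illustration of how one element can sit in both structures at once, the following Python sketch pairs an identity with an entity, uses a parent link for inclusion, and treats references stored in the entity as greater elements. The class name Element, its fields, and the sample values are assumptions made for this sketch and are not part of COM itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Element:
    """Hypothetical data element: an identity paired with an entity."""
    identity: str                                   # local, domain-specific address segment
    entity: Dict[str, object] = field(default_factory=dict)
    parent: Optional["Element"] = None              # inclusion: the super-element, if any

    def address(self) -> Tuple[str, ...]:
        """Hierarchical address: the parent's address followed by the own identity."""
        prefix = self.parent.address() if self.parent else ()
        return prefix + (self.identity,)

    def greater(self) -> List["Element"]:
        """Partial order: every element referenced from the entity is a greater element."""
        return [v for v in self.entity.values() if isinstance(v, Element)]


# Made-up sample data: an account included in a bank and referencing its owner.
bank = Element("11122233", {"name": "Example Bank"})
person = Element("p1", {"name": "Ann"})
account = Element("111111111", {"owner": person, "balance": 50.0}, parent=bank)

account_address = account.address()     # two segments: bank, then account
account_greater = account.greater()     # the owner is a greater element
```

The parent link and the stored references are deliberately kept separate here, mirroring the claim that inclusion and partial order are orthogonal relations.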

The name of this model originates from the main data modelling construct, concept, which generalizes conventional classes. This new approach to data modelling has been developed by Alexandr Savinov since 2004 along with a novel approach to programming, called concept-oriented programming.

Concepts

In COM, types of elements are described by a novel data modelling construct called a concept. A concept is defined as a pair of two classes: an identity class and an entity class. Concept fields are referred to as dimensions to emphasize their role in describing a multi-dimensional structure. For example, a customer could be described by the following concept:

CONCEPT Customer
  IDENTITY
    CHAR(10) SSN
  ENTITY
    CHAR(64) name
    DATE dob

A concept instance consists of one identity and one entity. The identity is a domain-specific address or reference representing the entity. It is important that identities are not analogous to primary keys. Rather, they can be thought of as domain-specific surrogates.

Concepts generalize conventional classes and are used where classes are normally used to declare the type of elements. In particular, concepts are used to declare the type of elements of collections. For example, we could create a table for storing customers using the following SQL-like query:

CREATE TABLE Customers CONCEPT Customer

This table will contain only the entity part of customers, while the identity part defines its address space. In terms of tables, the entity class describes the horizontal structure (columns) and the identity class describes the vertical structure (row addresses).
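One way to picture this split is a mapping whose keys are identities and whose values are entities. The Python sketch below follows the Customer concept above; the class names CustomerIdentity and CustomerEntity, the use of a dictionary as the table, and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict


@dataclass(frozen=True)          # frozen so that identities are hashable addresses
class CustomerIdentity:
    ssn: str


@dataclass
class CustomerEntity:
    name: str
    dob: date


# The "table": identities form the address space (rows),
# entity fields form the columns stored at each address.
customers: Dict[CustomerIdentity, CustomerEntity] = {}
customers[CustomerIdentity("123456789")] = CustomerEntity("Ann Example", date(1990, 5, 17))

# Access goes through the identity, much like dereferencing an address.
row = customers[CustomerIdentity("123456789")]
```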

Concept Inclusion

Each concept has a super-concept, declared via the inclusion relation, which generalizes inheritance. Concept instances are identified relative to their parent instance, so that elements have unique hierarchical addresses analogous to conventional postal addresses. For example, if streets are identified relative to cities then we use inclusion to describe this hierarchical relationship:

CONCEPT City
  IDENTITY
    CHAR(10) name
  ENTITY
    DOUBLE population

CONCEPT Street IN City
  IDENTITY
    CHAR(10) name
  ENTITY
    DOUBLE length

Parent and child elements are stored in different tables, and one parent may have many children. With this declaration, each street is identified by an address of two segments: the first segment is the city and the second is the street itself.
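The two-segment addressing can be sketched in Python by using tuples as addresses. The dictionaries cities and streets, the helper names parent_address and children, and all sample values are hypothetical; the sketch shows only that the same street name can occur in different cities without conflict.

```python
# Separate tables for parents and children; keys are full hierarchical addresses.
cities = {
    ("Berlin",): {"population": 3_600_000.0},
    ("Bonn",): {"population": 330_000.0},
}
streets = {
    ("Berlin", "Hauptstr"): {"length": 2.5},
    ("Bonn", "Hauptstr"): {"length": 0.8},      # same local name, different parent
    ("Bonn", "Marktplatz"): {"length": 0.2},
}


def parent_address(address):
    """Drop the last segment to obtain the address of the parent element."""
    return address[:-1]


def children(table, parent):
    """All addresses in `table` that are included in the given parent address."""
    return sorted(a for a in table if parent_address(a) == parent)


bonn_streets = children(streets, ("Bonn",))
```

Here bonn_streets holds the two Bonn addresses, and the parent address of every street is a key of the cities table.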

Concept Ordering

Each concept has a number of greater concepts specified via its dimension (field) types, and the resulting structure must be a partially ordered set. For example, assume that the bank account concept has a dimension referencing its owner:

CONCEPT Customer // It is a greater concept
  IDENTITY
    CHAR(10) SSN
  ENTITY
    CHAR(64) name
    DATE dob

CONCEPT Account // It is a lesser concept
  IDENTITY
    CHAR(10) accNo
  ENTITY
    Customer owner // Dimension type is a greater concept
    DOUBLE balance

Here the fact that an account stores a reference to a customer entails the fact that accounts are less than customers in the partially ordered set (in a diagram Account is positioned under Customer).
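The rule that dimension types determine greater concepts can be sketched as a small schema check in Python. The dictionary layout and the function names greater_concepts, is_less_than, and is_acyclic are assumptions of this sketch; it treats the order as strict, so a valid schema is one in which no concept is transitively less than itself.

```python
# Entity dimensions per concept: dimension name -> type name.
schema = {
    "Customer": {"name": "CHAR(64)", "dob": "DATE"},
    "Account": {"owner": "Customer", "balance": "DOUBLE"},
}


def greater_concepts(schema, concept):
    """Concepts used as dimension types are the direct greater concepts."""
    return {t for t in schema[concept].values() if t in schema}


def is_less_than(schema, lesser, greater):
    """True if `greater` is reachable from `lesser` by following dimension types."""
    seen, stack = set(), [lesser]
    while stack:
        current = stack.pop()
        for g in greater_concepts(schema, current):
            if g == greater:
                return True
            if g not in seen:
                seen.add(g)
                stack.append(g)
    return False


def is_acyclic(schema):
    """A schema forms a valid (strict) partial order only if it has no cycles."""
    return not any(is_less_than(schema, c, c) for c in schema)


account_below_customer = is_less_than(schema, "Account", "Customer")   # True
cyclic = {"A": {"x": "B"}, "B": {"y": "A"}}                            # invalid schema
```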

Projection and De-Projection

The projection operation, denoted by a right arrow, is applied to a set of elements and returns the set of their greater elements along the specified dimension. For example, given a set of accounts we can find their owners:

AllOwners = Accounts -> owner -> Customers

The de-projection operation, denoted by a left arrow, is applied to a set of elements and returns the set of their lesser elements along the specified dimension. For example, given a set of customers we can find their accounts:

AllAccounts = Customers <- owner <- Accounts
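The two operations can be approximated in Python over dictionaries keyed by identity. The function names project and deproject, the table layout, and the sample rows are assumptions for illustration; they are not the query language's actual implementation.

```python
customers = {"c1": {"name": "Ann"}, "c2": {"name": "Bob"}, "c3": {"name": "Eve"}}
accounts = {
    "a1": {"owner": "c1", "balance": 5000.0},
    "a2": {"owner": "c1", "balance": 50.0},
    "a3": {"owner": "c2", "balance": 70.0},
}


def project(ids, table, dimension):
    """ids -> dimension: follow the stored references up to the greater elements."""
    return {table[i][dimension] for i in ids}


def deproject(ids, table, dimension):
    """ids <- dimension <- table: find lesser elements whose reference points into ids."""
    return {i for i, row in table.items() if row[dimension] in ids}


all_owners = project(set(accounts), accounts, "owner")          # {"c1", "c2"}
all_accounts = deproject(set(customers), accounts, "owner")     # {"a1", "a2", "a3"}
eve_accounts = deproject({"c3"}, accounts, "owner")             # empty: Eve owns nothing
```

Note that projection returns a set, so a customer who owns several accounts appears only once among the owners.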

A sequence of projections and de-projections is referred to as an access path. Intermediate collections can be constrained, and the constraints can themselves contain projections and de-projections with aggregation operations. For example, the following query returns accounts with a small balance belonging to young customers who also have an account with a large balance:

ResultSet = (Accounts WHERE balance > 1000)
-> owner -> (Customers WHERE age < 25)
<- owner <- (Accounts WHERE balance < 100)

First, we select accounts with a large balance (line 1). Then these accounts are projected to their owners, keeping only young customers (line 2). Finally, these young rich customers (who may also have other accounts) are de-projected down to their accounts with a small balance (line 3).
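The three steps of this access path can be traced in Python on a small made-up data set. The table layout and the stored age field are assumptions of this sketch (the Customer concept shown earlier declares dob rather than age), and set comprehensions stand in for the query operators.

```python
customers = {
    "c1": {"name": "Ann", "age": 22},
    "c2": {"name": "Bob", "age": 40},
    "c3": {"name": "Eve", "age": 19},
}
accounts = {
    "a1": {"owner": "c1", "balance": 5000.0},
    "a2": {"owner": "c1", "balance": 50.0},
    "a3": {"owner": "c2", "balance": 9000.0},
    "a4": {"owner": "c2", "balance": 10.0},
    "a5": {"owner": "c3", "balance": 20.0},
}

# Line 1: accounts with a large balance.
rich_accounts = {a for a, row in accounts.items() if row["balance"] > 1000}

# Line 2: project along "owner", keeping only young customers.
young = {c for c, row in customers.items() if row["age"] < 25}
young_rich = {accounts[a]["owner"] for a in rich_accounts} & young

# Line 3: de-project along "owner" down to accounts with a small balance.
result_set = {
    a for a, row in accounts.items()
    if row["owner"] in young_rich and row["balance"] < 100
}
```

With this data only a2 qualifies: a4 belongs to a rich customer who is not young, and a5 belongs to a young customer who has no account with a large balance.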

Example

The database schema shown in the diagram consists of 6 tables. Each table has two parts: identity and entity. (Note that the identity part is not a primary key; it can be thought of as a domain-specific surrogate.) For example, banks are identified by their BLZ (Bankleitzahl, a bank identifier code used by German and Austrian banks), which acts as a domain-specific reference to the bank entity.

Each table is an element of the inclusion hierarchy, that is, each table has one super-table, which is always positioned on the left. This relation is shown by dark blue leftward arrows. Tables that do not have an explicitly specified super-table are assumed to be included in the root, which is the database. The basic idea behind inclusion is that any row of a table is always identified relative to some row of its super-table. For example, table Addresses is included in table Cities, which means that any address is within some city, and hence a fully qualified identifier of an address consists of two segments: a city id and the address itself. In the same way, table Accounts is included in table Banks, and hence any bank account is identified by (i) the bank where it was created and (ii) the account number. Such complex references uniquely identify rows and are stored as values in other tables. For example, the complex reference <11122233>:<111111111> represents account number 111111111 created in bank 11122233. If accounts could have internal savings accounts, then a new sub-table SavingsAccounts would have to be included in table Accounts. In this case a savings account would be identified by a complex reference consisting of three segments: bank, account, and savings account number.
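The complex reference notation used above can be handled with two small Python helpers. The function names format_reference and parse_reference are hypothetical, the savings account number 7 is made up, and the sketch assumes that segments never contain the characters ":", "<", or ">".

```python
def format_reference(segments):
    """Join address segments into the <seg>:<seg>:... notation."""
    return ":".join(f"<{s}>" for s in segments)


def parse_reference(text):
    """Split a complex reference into its segments, outermost (root side) first."""
    parts = text.split(":")
    for p in parts:
        if not (p.startswith("<") and p.endswith(">") and len(p) > 2):
            raise ValueError(f"malformed segment: {p!r}")
    return tuple(p[1:-1] for p in parts)


# Two segments: bank, then account number.
bank_id, account_no = parse_reference("<11122233>:<111111111>")

# Three segments for a savings account nested inside an account.
savings_ref = format_reference(("11122233", "111111111", "7"))
```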

Each table in the database schema is an element of the partially ordered set, where it has lesser and greater tables. The partial order relation among tables is shown by dark red upward arrows, that is, each upward arrow leads from a lesser table to a greater table. The main principle here is that column types determine greater tables. For example, table Persons has a column that stores the person's address, and hence table Addresses is a greater table for Persons. Note that table Addresses is also greater than table Banks because banks also have an address. Tables Persons and Accounts have one common lesser table, PersonsAccounts. It is used to implement a many-to-many relationship between them by storing account ownership information. It is also the most specific table in the model, which is called the bottom table.
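The many-to-many relationship through the common lesser table can be sketched in Python as a de-projection followed by a projection. The column names person and account, the helper names accounts_of and owners_of, and the sample rows are assumptions made for this sketch; account references are written as (bank, account number) tuples.

```python
# Each row of the lesser table references one person and one account.
persons_accounts = [
    {"person": "p1", "account": ("11122233", "111111111")},
    {"person": "p1", "account": ("11122233", "222222222")},
    {"person": "p2", "account": ("11122233", "111111111")},   # joint account
]


def accounts_of(person_id):
    """De-project the person down to PersonsAccounts, then project up to Accounts."""
    rows = [r for r in persons_accounts if r["person"] == person_id]
    return {r["account"] for r in rows}


def owners_of(account_ref):
    """The same path in the opposite direction: Accounts down, then up to Persons."""
    rows = [r for r in persons_accounts if r["account"] == account_ref]
    return {r["person"] for r in rows}


p1_accounts = accounts_of("p1")
joint_owners = owners_of(("11122233", "111111111"))
```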

See also

  • Database
  • Entity-relationship model
  • Formal concept analysis
  • Inverse dimension
  • Object database
  • OLAP
  • Ontology (information science)
  • Relational model

Further reading

  • Alexandr Savinov (2005). "Hierarchical Multidimensional Modelling in the Concept-Oriented Data Model". In: Proceedings of the CLA 2005 International Workshop on Concept Lattices and their Applications, Olomouc, Czech Republic, September 7-9, 2005. Edited by Radim Belohlavek and Vaclav Snasel.

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 