Global serializability
Encyclopedia
In concurrency control
Concurrency control
In information technology and computer science, especially in the fields of computer programming , operating systems , multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.Computer...

 of database
Database
A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

s
, transaction processing
Transaction processing
In computer science, transaction processing is information processing that is divided into individual, indivisible operations, called transactions. Each transaction must succeed or fail as a complete unit; it cannot remain in an intermediate state...

(transaction management), and other transactional distributed applications
Distributed computing
Distributed computing is a field of computer science that studies distributed systems. A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal...

, Global serializability (or Modular serializability) is a property of a global schedule of transactions
Database transaction
A transaction comprises a unit of work performed within a database management system against a database, and treated in a coherent and reliable way independent of other transactions...

. A global schedule is the unified schedule
Schedule (computer science)
In the fields of databases and transaction processing , a schedule of a system is an abstract model to describe execution of transactions running in the system. Often it is a list of operations ordered by time, performed by a set of transactions that are executed together in the system...

 of all the individual database (and other transactional object) schedules in a multidatabase environment (e.g., federated database). Complying with global serializability means that the global schedule is serializable
Serializable
Serializable may refer to:* Serializable , an attribute of a transactions' schedule . It means that the schedule has the serializability property...

, has the serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

property, while each component database (module) has a serializable schedule as well. In other words, a collection of serializable components provides overall system serializability, which is usually incorrect. A need in correctness across databases in multidatabase systems makes global serializability a major goal for global concurrency control
Global concurrency control
Global concurrency control typically pertains to the concurrency control of a system comprising several components, each with its own concurrency control. The overall concurrency control of the whole system, the Global concurrency control, is determined by the concurrency control of its components,...

(or modular concurrency control). With the proliferation of the Internet
Internet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...

, Cloud computing
Cloud computing
Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a utility over a network ....

, Grid computing
Grid computing
Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files...

, and small, portable, powerful computing devices (e.g., smartphone
Smartphone
A smartphone is a high-end mobile phone built on a mobile computing platform, with more advanced computing ability and connectivity than a contemporary feature phone. The first smartphones were devices that mainly combined the functions of a personal digital assistant and a mobile phone or camera...

s), as well as increase in systems management
Systems management
Systems management refers to enterprise-wide administration of distributed systems including computer systems. Systems management is strongly influenced by network management initiatives in telecommunications....

 sophistication, the need for atomic distributed transactions and thus effective global serializability techniques, to ensure correctness in and among distributed transactional applications, seems to increase.

In a federated database system
Federated database system
A federated database system is a type of meta-database management system , which transparently integrates multiple autonomous database systems into a single federated database. The constituent databases are interconnected via a computer network and may be geographically decentralized...

 or any other more loosely defined multidatabase system, which are typically distributed in a communication network, transactions span multiple (and possibly distributed) databases. Enforcing global serializability in such system, where different databases may use different types of concurrency control
Concurrency control
In information technology and computer science, especially in the fields of computer programming , operating systems , multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.Computer...

, is problematic. Even if every local schedule of a single database is serializable
Serializable
Serializable may refer to:* Serializable , an attribute of a transactions' schedule . It means that the schedule has the serializability property...

, the global schedule of a whole system is not necessarily serializable. The massive communication exchanges of conflict information needed between databases to reach conflict serializability globally would lead to unacceptable performance, primarily due to computer and communication latency
Latency (engineering)
Latency is a measure of time delay experienced in a system, the precise definition of which depends on the system and the time being measured. Latencies may have different meaning in different contexts.-Packet-switched networks:...

. Achieving global serializability effectively over different types of concurrency control has been open
Open problem
In science and mathematics, an open problem or an open question is a known problem that can be accurately stated, and has not yet been solved . Some questions remain unanswered for centuries before solutions are found...

 for several years. Commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

(or Commit ordering; CO), a serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

 technique publicly introduced in 1991 by Yoav Raz from Digital Equipment Corporation
Digital Equipment Corporation
Digital Equipment Corporation was a major American company in the computer industry and a leading vendor of computer systems, software and peripherals from the 1960s to the 1990s...

 (DEC), provides an effective general solution for global (conflict) serializability across any collection of database systems and other transactional objects, with possibly different concurrency control mechanisms. CO does not need the distribution of conflict information, but rather utilizes the already needed (unmodified) atomic commitment protocol messages without any further communication between databases. It also allows optimistic
Optimistic concurrency control
In the field of relational database management systems, optimistic concurrency control is a concurrency control method that assumes that multiple transactions can complete without affecting each other, and that therefore transactions can proceed without locking the data resources that they affect...

 (non-blocking) implementations. CO generalizes Strong strict two phase locking
Two-phase locking
In databases and transaction processing two-phase locking, is a concurrency control method that guarantees serializability.It is also the name of the resulting set of database transaction schedules...

(SS2PL), which in conjunction with the Two-phase commit
Two-phase commit protocol
In transaction processing, databases, and computer networking, the two-phase commit protocol is a type of atomic commitment protocol . It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort the transaction...

(2PC) protocol is the de facto standard
De facto standard
A de facto standard is a custom, convention, product, or system that has achieved a dominant position by public acceptance or market forces...

 for achieving global serializability across (SS2PL based) database systems. As a result CO compliant database systems (with any, different concurrency control types) can transparently join existing SS2PL based solutions for global serializability. The same applies also to all other multiple (transactional) object systems that use atomic transactions and need global serializability for correctness (see examples above; nowadays such need is not smaller than with database systems, the origin of atomic transactions).

The most significant aspects of CO that make it a uniquely effective general solution for global serializability are the following:
  1. Seamless, low overhead integration with any concurrency control mechanism, with neither changing any transaction's operation scheduling or blocking it, nor adding any new operation.
  2. Heterogeneity: Global serializability is achieved across multiple transactional objects (e.g., database management system
    Database management system
    A database management system is a software package with computer programs that control the creation, maintenance, and use of a database. It allows organizations to conveniently develop databases for various applications by database administrators and other specialists. A database is an integrated...

    s) with different (any) concurrency control mechanisms, without interfering with the mechanisms' operations.
  3. Modularity
    Modularity
    Modularity is a general systems concept, typically defined as a continuum describing the degree to which a system’s components may be separated and recombined. It refers to both the tightness of coupling between components, and the degree to which the “rules” of the system architecture enable the...

    : Transactional objects can be added and removed transparently.
  4. Autonomy of transactional objects: No need of conflict or equivalent information distribution (e.g., local precedence relations, locks, timestamps, or tickets; no object needs other object's information).
  5. Scalability
    Scalability
    In electronics scalability is the ability of a system, network, or process, to handle growing amount of work in a graceful manner or its ability to be enlarged to accommodate that growth...

    : With "normal" global transactions, computer network
    Computer network
    A computer network, often simply referred to as a network, is a collection of hardware components and computers interconnected by communication channels that allow sharing of resources and information....

     size and number of transactional objects can increase unboundedly with no impact on performance, and
  6. Automatic global deadlock resolution.


All these aspects, except the first two, are also possessed by the popular SS2PL
Two-phase locking
In databases and transaction processing two-phase locking, is a concurrency control method that guarantees serializability.It is also the name of the resulting set of database transaction schedules...

, which is a (constrained, blocking) special case of CO and inherits many of CO's qualities.

Problem statement

The difficulties described above translate into the following problem:
Find an efficient (high-performance and fault tolerant) method to enforce Global serializability (global conflict serializability) in a heterogeneous distributed environment of multiple autonomous database systems. The database systems may employ different concurrency control
Concurrency control
In information technology and computer science, especially in the fields of computer programming , operating systems , multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.Computer...

 methods. No limitation should be imposed on the operations of either local transactions (confined to a single database system) or global transactions
Distributed transaction
A distributed transaction is an operations bundle, in which two or more network hosts are involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that encompasses all operations against such resources...

 (span two or more database systems).

Quotations

Lack of an appropriate solution for the global serializability problem has driven researchers to look for alternatives to serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

 as a correctness criterion in a multidatabase environment (e.g., see Relaxing global serializability below), and the problem has been characterized as difficult and open
Open problem
In science and mathematics, an open problem or an open question is a known problem that can be accurately stated, and has not yet been solved . Some questions remain unanswered for centuries before solutions are found...

. The following two quotations demonstrate the mindset about it by the end of the year 1991, with similar quotations in numerous other articles:
  • "Without knowledge about local as well as global transactions, it is highly unlikely that efficient global concurrency control can be provided... Additional complications occur when different component DBMSs [Database Management Systems] and the FDBMSs [Federated Database Management Systems] support different concurrency mechanisms... It is unlikely that a theoretically elegant solution that provides conflict serializability without sacrificing performance (i.e., concurrency and/or response time) and availability
    Availability
    In telecommunications and reliability theory, the term availability has the following meanings:* The degree to which a system, subsystem, or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time...

     exists."


Commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

, publicly introduced in May 1991 (see below), provides an efficient elegant
Elegance
Elegance is a synonym for beauty that has come to acquire the additional connotations of unusual effectiveness and simplicity. It is frequently used as a standard of tastefulness particularly in the areas of visual design, decoration, the sciences, and the esthetics of mathematics...

 general solution, from both practical and theoretical
Theory
The English word theory was derived from a technical term in Ancient Greek philosophy. The word theoria, , meant "a looking at, viewing, beholding", and referring to contemplation or speculation, as opposed to action...

 points of view, to the global serializability problem across database systems with possibly different concurrency control mechanisms. It provides conflict serializability with no negative effect on availability, and with no worse performance than the de facto standard
De facto standard
A de facto standard is a custom, convention, product, or system that has achieved a dominant position by public acceptance or market forces...

 for global serializability, CO's special case Strong strict two-phase locking (SS2PL). It requires knowledge about neither local nor global transactions.
  • "Transaction management in a heterogeneous, distributed database system is a difficult issue. The main problem is that each of the local database management systems may be using a different type of concurrency control scheme. Integrating this is a challenging problem, made worse if we wish to preserve the local autonomy of each of the local databases, and allow local and global transactions to execute in parallel. One simple solution is to restrict global transactions to retrieve-only access. However, the issue of reliable transaction management in the general case, where global and local transactions are allowed to both read and write data, is still open
    Open problem
    In science and mathematics, an open problem or an open question is a known problem that can be accurately stated, and has not yet been solved . Some questions remain unanswered for centuries before solutions are found...

    ."


The Commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

 solution comprises effective integration of autonomous database management systems with possibly different concurrency control mechanisms. This while local and global transactions execute in parallel without restricting any read or write operation in either local or global transactions, and without compromising the systems' autonomy.

Even in later years, after the public introduction of the Commitment ordering general solution in 1991, the problem still has been considered by many unsolvable:
  • "We present a transaction model for multidatabase systems with autonomous component systems, coined heterogeneous 3-level transactions. It has become evident that in such a system the requirements of guaranteeing full ACID
    ACID
    In computer science, ACID is a set of properties that guarantee database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction...

     properties and full local autonomy can not be reconciled..."


The quotation above is from a 1997 article proposing a relaxed global serializability solution (see Relaxing global serializability below), and referencing Commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

 (CO) articles. The CO solution supports effectively both full ACID
ACID
In computer science, ACID is a set of properties that guarantee database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction...

 properties and full local autonomy, as well as meeting the other requirements posed above in the Problem statement section, and apparently has been misunderstood.

Similar thinking we see also in the following quotation from a 1998 article:
  • "The concept of serializability has been the traditionally accepted correctness criterion in database systems. However in multidatabase systems (MDBSs), ensuring global serializability is a difficult task. The difficulty arises due to the heterogeneity of the concurrency control protocols used by the participating local database management systems (DBMSs), and the desire to preserve the autonomy of the local DBMSs. In general, solutions to the global serializability problem result in executions with a low degree of concurrency. The alternative, relaxed serializability, may result in data inconsistency."


Also the above quoted article proposes a relaxed global serializability solution, while referencing the CO work. The CO solution for global serializability both bridges between different concurrency control protocols with no substantial concurrency reduction (and typically minor, if at all), and maintains the autonomy of local DBMSs. Evidently also here CO has been misunderstood. This misunderstanding continues to 2010 in a textbook by some of the same authors, where the same relaxed global serializability technique, Two level serializability, is emphasized and described in detail, and CO is not mentioned at all.

On the other hand, the following quotation on CO appears in a 2009 book:
  • "Not all concurrency control algorithms use locks... Three other techniques are timestamp ordering, serialization graph testing, and commit ordering. Timestamp ordering assigns each transaction a timestamp and ensures that conflicting operations execute in timestamp order. Serialization graph testing tracks conflicts and ensures that the serialization graph is acyclic. Commit ordering ensures that conflicting operations are consistent with the relative order in which their transactions commit, which can enable interoperability of systems using different concurrency control mechanisms."

Comments:
  1. Beyond the common locking based algorithm SS2PL, which is a CO variant itself, also additional variants of CO that use locks exist, (see below). However, generic, or "pure" CO does not use locks.
  2. Since CO mechanisms order the commit events according to conflicts that already have occurred, it is better to describe CO as "Commit ordering ensures that the relative order in which transactions commit is consistent with the order of their respective conflicting operations."


The characteristics and properties of the CO solution are discussed below.

Proposed solutions

Several solutions, some partial, have been proposed for the global serializability problem. Among them:
  • Global conflict graph (serializability graph, precedence graph
    Precedence graph
    A precedence graph, also named conflict graph and serializability graph, is used in the context of concurrency control in databases.The precedence graph for a schedule S contains:* A node for each committed transaction in S...

    ) checking
  • Distributed Two phase locking
    Two phase locking
    In databases and transaction processing two-phase locking, is a concurrency control method that guarantees serializability.It is also the name of the resulting set of database transaction schedules...

    (Distributed 2PL)
  • Distributed Timestamp ordering
    Timestamp-based concurrency control
    In computer science, in the field of databases, timestamp-based concurrency control is a non-lock concurrency control method, used in relational databases to safely handle transactions, using timestamps.-Assumptions:...

  • Tickets (local logical timestamps which define local total orders, and are propagated to determine global partial order of transactions)
  • Commitment ordering
    Commitment ordering
    In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...


Technology perspective

The problem of global serializability has been a quite intensively researched subject in the late 1980s and early 1990s. Commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

(CO) has provided an effective general solution to the problem, insight into it, and understanding about possible generalizations of strong strict two phase locking (SS2PL), which practically and almost exclusively has been utilized (in conjunction with the Two-phase commit protocol
Two-phase commit protocol
In transaction processing, databases, and computer networking, the two-phase commit protocol is a type of atomic commitment protocol . It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort the transaction...

(2PC) ) since the 1980s to achieve global serializability across databases. An important side-benefit of CO is the automatic global deadlock resolution that it provides (this is applicable also to distributed SS2PL; though global deadlocks have been an important research subject for SS2PL, automatic resolution has been overlooked, except in the CO articles, until today (2009)). At that time quite many commercial database system types existed, many non-relational, and databases were relatively very small. Multi database systems were considered a key for database scalability by database systems interoperability, and global serializability was urgently needed. Since then the tremendous progress in computing power, storage, and communication networks, resulted in orders of magnitude
Order of magnitude
An order of magnitude is the class of scale or magnitude of any amount, where each class contains values of a fixed ratio to the class preceding it. In its most common usage, the amount being scaled is 10 and the scale is the exponent being applied to this amount...

 increases in both centralized databases' sizes, transaction rates, and remote access to database capabilities, as well as blurring the boundaries between centralized computing and distributed one over fast, low-latency local networks (e.g., Infiniband
InfiniBand
InfiniBand is a switched fabric communications link used in high-performance computing and enterprise data centers. Its features include high throughput, low latency, quality of service and failover, and it is designed to be scalable...

). These, together with progress in database vendors' distributed solutions (primarily the popular SS2PL with 2PC based, a de facto standard
De facto standard
A de facto standard is a custom, convention, product, or system that has achieved a dominant position by public acceptance or market forces...

 that allows interoperability among different vendors' (SS2PL-based) databases; both SS2PL and 2PC technologies have gained substantial expertise and efficiency), workflow
Workflow
A workflow consists of a sequence of connected steps. It is a depiction of a sequence of operations, declared as work of a person, a group of persons, an organization of staff, or one or more simple or complex mechanisms. Workflow may be seen as any abstraction of real work...

 management systems, and database replication technology, in most cases have provided satisfactory and sometimes better information technology
Information technology
Information technology is the acquisition, processing, storage and dissemination of vocal, pictorial, textual and numerical information by a microelectronics-based combination of computing and telecommunications...

 solutions without multi database atomic distributed transaction
Distributed transaction
A distributed transaction is an operations bundle, in which two or more network hosts are involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that encompasses all operations against such resources...

s over databases with different concurrency control (bypassing the problem above). As a result, the sense of urgency that existed with the problem at that period, and in general with high-performance distributed atomic transactions over databases with different concurrency control types, has reduced. However, the need in concurrent distributed atomic transactions as a fundamental element of reliability exists in distributed systems also beyond database systems, and so the need in global serializability as a fundamental correctness criterion for such transactional systems (see also Distributed serializability in Serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

). With the proliferation of the Internet
Internet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...

, Cloud computing
Cloud computing
Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a utility over a network ....

, Grid computing
Grid computing
Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files...

, small, portable, powerful computing devices (e.g., smartphone
Smartphone
A smartphone is a high-end mobile phone built on a mobile computing platform, with more advanced computing ability and connectivity than a contemporary feature phone. The first smartphones were devices that mainly combined the functions of a personal digital assistant and a mobile phone or camera...

s), and sophisticated systems management
Systems management
Systems management refers to enterprise-wide administration of distributed systems including computer systems. Systems management is strongly influenced by network management initiatives in telecommunications....

 the need for effective global serializability techniques to ensure correctness in and among distributed transactional applications seems to increase, and thus also the need in Commitment ordering (including the popular for databases special case SS2PL; SS2PL, though, does not meet the requirements of many other transactional objects).

The commitment ordering solution

Commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

 (or Commit ordering; CO) is the only high-performance, fault tolerant, conflict serializability providing solution that has been proposed as a fully distributed (no central computing component or data-structure are needed), general mechanism that can be combined seamlessly with any local (to a database) concurrency control
Concurrency control
In information technology and computer science, especially in the fields of computer programming , operating systems , multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.Computer...

 mechanism (see technical summary). Since the CO property of a schedule is a necessary condition for global serializability of autonomous databases (in the context of concurrency control), it provides the only general solution for autonomous databases (i.e., if autonomous databases do not comply with CO, then global serializability may be violated). Seemingly by sheer luck, the CO solution possesses many attractive properties:
  1. does not interfere with any transaction's operation, particularly neither block, restrict nor delay any data-access operation (read or write) for either local or global
    Distributed transaction
    A distributed transaction is an operations bundle, in which two or more network hosts are involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that encompasses all operations against such resources...

     transactions (and thus does not cause any extra aborts); thus allows seamless integration with any concurrency control mechanism.
  2. allows optimistic
    Optimistic concurrency control
    In the field of relational database management systems, optimistic concurrency control is a concurrency control method that assumes that multiple transactions can complete without affecting each other, and that therefore transactions can proceed without locking the data resources that they affect...

     implementations (non-blocking, i.e., non data access blocking).
  3. allows heterogeneity: Global serializability is achieved across multiple transactional objects with different (any) concurrency control mechanisms, without interfering with the mechanisms' operations.
  4. allows modularity
    Modularity
    Modularity is a general systems concept, typically defined as a continuum describing the degree to which a system’s components may be separated and recombined. It refers to both the tightness of coupling between components, and the degree to which the “rules” of the system architecture enable the...

    : Transactional objects can be added and removed transparently.
  5. allows full ACID
    ACID
    In computer science, ACID is a set of properties that guarantee database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction...

     transaction support.
  6. maintains each database's autonomy, and does not need any concurrency control information distribution (e.g., local precedence relations, locks, timestamps, or tickets).
  7. does not need any knowledge about the transactions.
  8. requires no communication overhead since it only uses already needed, unmodified atomic commitment protocol messages (any such protocol; using fault tolerant atomic commitment protocols and database systems makes the CO solution fault tolerant).
  9. automatically resolves global deadlock
    Deadlock
    A deadlock is a situation where in two or more competing actions are each waiting for the other to finish, and thus neither ever does. It is often seen in a paradox like the "chicken or the egg"...

    s due to locking
    Lock (computer science)
    In computer science, a lock is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. Locks are one way of enforcing concurrency control policies.-Types:...

    .
  10. scales up
    Scalability
    In electronics scalability is the ability of a system, network, or process, to handle growing amount of work in a graceful manner or its ability to be enlarged to accommodate that growth...

     effectively with computer network
    Computer network
    A computer network, often simply referred to as a network, is a collection of hardware components and computers interconnected by communication channels that allow sharing of resources and information....

     size and number of databases, almost without any negative impact on performance, since each global transaction is typically confined to certain relatively small numbers of databases and network nodes.
  11. requires no additional, artificial transaction access operations (e.g., "take timestamp
    Timestamp-based concurrency control
    In computer science, in the field of databases, timestamp-based concurrency control is a non-lock concurrency control method, used in relational databases to safely handle transactions, using timestamps.-Assumptions:...

    " or "take ticket"), which typically result in additional, artificial conflicts that reduce concurrency.
  12. requires low overhead.


The only overhead incurred by the CO solution is locally detecting conflicts (which is already done by any known serializability mechanism, both pessimistic and optimistic) and locally ordering in each database system both the (local) commits of local transactions and the voting for atomic commitment of global transactions. Such overhead is low. The net effect of CO may be some delays of commit events (but never more delay than SS2PL, and on the average less). This makes CO instrumental for global concurrency control
Global concurrency control
Global concurrency control typically pertains to the concurrency control of a system comprising several components, each with its own concurrency control. The overall concurrency control of the whole system, the Global concurrency control, is determined by the concurrency control of its components,...

 of multidatabase systems (e.g., federated database system
Federated database system
A federated database system is a type of meta-database management system , which transparently integrates multiple autonomous database systems into a single federated database. The constituent databases are interconnected via a computer network and may be geographically decentralized...

s). The underlying Theory of Commitment ordering, part of Serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

 theory, is both sound and elegant (and even "mathematically beautiful"
Mathematical beauty
Many mathematicians derive aesthetic pleasure from their work, and from mathematics in general. They express this pleasure by describing mathematics as beautiful. Sometimes mathematicians describe mathematics as an art form or, at a minimum, as a creative activity...

; referring to structure and dynamics of conflicts, graph cycles, and deadlocks), with interesting implications for transactional distributed applications
Distributed computing
Distributed computing is a field of computer science that studies distributed systems. A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal...

.

All the qualities of CO in the list above, except the first three, are also possessed by SS2PL, which is a special case of CO, but blocking and constraining. This partially explains the popularity of SS2PL as a solution (practically, the only solution, for many years) for achieving global serializability. However, property 9 above, automatic resolution of global deadlocks, has not been noticed for SS2PL in the database research literature until today (2009; except in the CO publications). This, since the phenomenon of voting-deadlocks in such environments and their automatic resolution by the atomic commitment protocol has been overlooked.

Most existing database systems, including all major commercial database systems, are strong strict two phase locking (SS2PL) based and already CO compliant. Thus they can participate in a CO based solution for global serializability in multidatabase environments without any modification (except for the popular multiversioning
Multiversion concurrency control
Multiversion concurrency control , in the database field of computer science, is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory.For instance, a database will...

, where additional CO aspects should be considered). Achieving global serializability across SS2PL based databases using atomic commitment (primarily using two phase commit, 2PC) has been employed for many years (i.e., using the same CO solution for a specific special case; however, no reference is known prior to CO, that notices this special case's automatic global deadlock resulotion by the atomic commitment protocol's augmented-conflict-graph global cycle elimination process). Virtually all existing distributed transaction processing environments and supporting products rely on SS2PL and provide 2PC. As a matter of fact SS2PL together with 2PC have become a de facto standard
De facto standard
A de facto standard is a custom, convention, product, or system that has achieved a dominant position by public acceptance or market forces...

. This solution is a homogeneous concurrency control one, suboptimal (when both Serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

 and Strictness are needed; see Strict commitment ordering; SCO) but still quite effective in most cases, sometimes at the cost of increased computing power needed relatively to the optimum. (However for better performance relaxed serializability is used whenever applications allow). It allows inter-operation among SS2PL-compliant different database system types, i.e., allows heterogeneity in aspects other than concurrency control. SS2PL is a very constraining schedule property, and "takes over" when combined with any other property. For example, when combined with any optimistic property, the result is not optimistic anymore, but rather characteristically SS2PL. On the other hand, CO does not change data-access scheduling patterns at all, and any combined property's characteristics remain unchanged. Since also CO uses atomic commitment (e.g., 2PC) for achieving global serializability, as SS2PL does, any CO compliant database system or transactional object can transparently join existing SS2PL based environments, use 2PC, and maintain global serializability without any environment change. This makes CO a straightforward, natural generalization of SS2PL for any conflict serializability based database system, for all practical purposes.

Commitment ordering has been quite widely known inside the transaction processing
Transaction processing
In computer science, transaction processing is information processing that is divided into individual, indivisible operations, called transactions. Each transaction must succeed or fail as a complete unit; it cannot remain in an intermediate state...

and database
Database
A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

s
communities at Digital Equipment Corporation
Digital Equipment Corporation
Digital Equipment Corporation was a major American company in the computer industry and a leading vendor of computer systems, software and peripherals from the 1960s to the 1990s...

(DEC) since 1990. It has been under company confidentiality due to patent
Patent
A patent is a form of intellectual property. It consists of a set of exclusive rights granted by a sovereign state to an inventor or their assignee for a limited period of time in exchange for the public disclosure of an invention....

ing
processes. CO was disclosed outside of DEC by lectures and technical reports' distribution to database researches in May 1991, immediately after its first patent filing. It has been misunderstood by many database researchers years after its introduction, which is evident by the quotes above from articles in 1997-1998 referencing Commitment ordering articles. On the other hand CO has been utilized extensively as a solution for global serializability in works on Transactional processes,

and more recently in the related Re:GRIDiT,
which is an approach for transaction management in the converging Grid computing
Grid computing
Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files...

 and Cloud computing
Cloud computing
Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a utility over a network ....

.
See more in The History of Commitment Ordering.

Relaxing global serializability

Some techniques have been developed for relaxed global serializability (i.e., they do not guarantee global serializability; see also Relaxing serializability). Among them (with several publications each):
  • Quasi serializability
  • Two-level serializability


While local (to a database system) relaxed serializability methods compromise serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

for performance gain (and are utilized only when the application can tolerate possible resulting inaccuracies, or its integrity is unharmed), it is unclear that various proposed relaxed global serializability methods which compromise global serializability, provide any performance gain over commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

which guarantees global serializability. Typically, the declared intention of such methods has not been performance gain over effective global serializability methods (which apparently have been unknown to the inventors), but rather correctness criteria alternatives due to lack of a known effective global serializability method. Oddly, some of them were introduced years after CO had been introduced, and some even quote CO without realizing that it provides an effective global serializability solution, and thus without providing any performance comparison with CO to justify them as alternatives to global serializability for some applications (e.g., Two-level serializability). Two-level serializability is even presented as a major global concurrency control method in a 2010 edition of a text-book on databases (authored by two of the original authors of Two-level serializability, where one of them, Avi Silberschatz
Abraham Silberschatz
Silberschatz obtained his Ph.D. from the State University of New York at Stony Brook. He is the Sidney J. Weinberg Professor of Computer Science at Yale University. Prior to coming to Yale in 2003, he was the Vice President of the Information Sciences Research Center at Bell Labs...

, is also an author of the original Strong recoverability articles). This book neither mentions CO nor references it, and strangely, apparently does not consider CO a valid Global serializability solution.

Another common reason nowadays for Global serializability relaxation is the requirement of availability
Availability
In telecommunications and reliability theory, the term availability has the following meanings:* The degree to which a system, subsystem, or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time...

 of internet
Internet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...

 products and services
Internet service provider
An Internet service provider is a company that provides access to the Internet. Access ISPs directly connect customers to the Internet using copper wires, wireless or fiber-optic connections. Hosting ISPs lease server space for smaller businesses and host other people servers...

. This requirement is typically answered by large scale data replication
Replication (computer science)
Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility. It could be data replication if the same data is stored on multiple storage devices, or...

. The straightforward solution for synchronizing replicas' updates of a same database object is including all these updates in a single atomic distributed transaction
Distributed transaction
A distributed transaction is an operations bundle, in which two or more network hosts are involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that encompasses all operations against such resources...

. However, with many replicas such a transaction is very large, and may span several computer
Computer
A computer is a programmable machine designed to sequentially and automatically carry out a sequence of arithmetic or logical operations. The particular sequence of operations can be changed readily, allowing the computer to solve more than one kind of problem...

s and networks
Computer network
A computer network, often simply referred to as a network, is a collection of hardware components and computers interconnected by communication channels that allow sharing of resources and information....

 that some of them are likely to be unavailable. Thus such a transaction is likely to end with abort and miss its purpose.
Consequently Optimistic replication
Optimistic replication
Optimistic replication is a strategy for replication in which replicas are allowed to diverge.Traditional pessimistic replication systems try to guarantee from the beginning that all of the replicas are identical to each other, as if there was only a single copy of the data all along...

 (Lazy replication) is often utilized (e.g., in many products and services by Google
Google
Google Inc. is an American multinational public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program...

, Amazon
Amazon.com
Amazon.com, Inc. is a multinational electronic commerce company headquartered in Seattle, Washington, United States. It is the world's largest online retailer. Amazon has separate websites for the following countries: United States, Canada, United Kingdom, Germany, France, Italy, Spain, Japan, and...

, Yahoo, and alike), while Global serializability is relaxed and compromised for Eventual consistency
Eventual consistency
Eventual consistency is one of the consistency models used in the domain of parallel programming, for example in distributed shared memory, distributed transactions, and optimistic replication, it means that given a sufficiently long period of time over which no changes are sent, all updates can be...

. In this case relaxation is done only for applications that are not expected to be harmed by it.

Classes of schedules defined by relaxed global serializability properties either contain the global serializability class, or are incomparable with it. What differentiates techniques for relaxed global conflict serializability (RGCSR) properties from those of relaxed conflict serializability (RCSR) properties that are not RGCSR is typically the different way global cycles (span two or more databases) in the global conflict graph are handled. No distinction between global and local cycles exists for RCSR properties that are not RGCSR. RCSR contains RGCSR. Typically RGCSR techniques eliminate local cycles, i.e., provide local serializability (which can be achieved effectively by regular, known concurrency control
Concurrency control
In information technology and computer science, especially in the fields of computer programming , operating systems , multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.Computer...

 methods), however, obviously they do not eliminate all global cycles (which would achieve global serializability).

See also

  • Global concurrency control
    Global concurrency control
    Global concurrency control typically pertains to the concurrency control of a system comprising several components, each with its own concurrency control. The overall concurrency control of the whole system, the Global concurrency control, is determined by the concurrency control of its components,...

  • Serializability
    Serializability
    In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

  • Commitment ordering
    Commitment ordering
    In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

  • The History of Commitment Ordering
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK