In computer science, transaction processing is information processing that is divided into individual, indivisible operations, called transactions. Each transaction must succeed or fail as a complete unit; it cannot remain in an intermediate state.
Description
Transaction processing is designed to maintain a computer system (typically, but not limited to, a database or some modern filesystems) in a known, consistent state, by ensuring that interdependent operations carried out on the system are either all completed successfully or all canceled successfully.
For example, consider a typical banking transaction that involves moving $700 from a customer's savings account to a customer's checking account. This transaction is a single operation in the eyes of the bank, but it involves at least two separate operations in computer terms: debiting the savings account by $700, and crediting the checking account by $700. If the debit operation succeeds but the credit does not (or vice versa), the books of the bank will not balance at the end of the day. There must therefore be a way to ensure that either both operations succeed or both fail, so that there is never any inconsistency in the bank's database as a whole. Transaction processing is designed to provide this.
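The savings-to-checking transfer can be sketched with Python's standard `sqlite3` module. This is only an illustration under stated assumptions: the `accounts` table, the `transfer` and `balances` helpers, and the starting balances are hypothetical and are not part of any particular banking system. The debit and the credit run inside one transaction, so a failure between them leaves neither applied.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing transaction."""
    try:
        # `with conn` commits if the block finishes and rolls back
        # every statement in it if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Hypothetical business rule: no overdrafts.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical starting state.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("savings", 1000), ("checking", 200)],
)
conn.commit()

ok_first = transfer(conn, "savings", "checking", 700)   # both updates applied
ok_second = transfer(conn, "savings", "checking", 700)  # debit undone, no credit
```

The second call debits the savings account, detects the overdraft, and raises; the rollback then undoes the debit, so the balances stay as they were after the first transfer.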
Transaction processing allows multiple individual operations to be linked together automatically as a single, indivisible transaction. The transaction-processing system ensures that either all operations in a transaction are completed without error, or none of them are. If some of the operations are completed but errors occur when the others are attempted, the transaction-processing system "rolls back" all of the operations of the transaction (including the successful ones), thereby erasing all traces of the transaction and restoring the system to the consistent, known state that it was in before processing of the transaction began. If all operations of a transaction are completed successfully, the transaction is committed by the system, and all changes to the database are made permanent; the transaction cannot be rolled back once this is done.
Transaction processing guards against hardware and software errors that might leave a transaction partially completed, with the system left in an unknown, inconsistent state. If the computer system crashes in the middle of a transaction, the transaction-processing system guarantees that all operations in any uncommitted (i.e., not completely processed) transactions are canceled.
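The effect on uncommitted work can be observed in a small experiment with Python's standard `sqlite3` module. This sketch does not crash anything: closing the connection without committing merely stands in for a failure in the middle of a transaction. The file name and the `accounts` table are hypothetical.

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk database in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "bank.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('savings', 1000)")
conn.commit()  # the starting balance is now permanent

# Start a transaction (sqlite3 opens one implicitly before the UPDATE)
# and abandon it without committing.
conn.execute("UPDATE accounts SET balance = balance - 700 WHERE name = 'savings'")
conn.close()  # stand-in for a failure: the pending change is discarded

# "Restart": reopen the database and read the balance again.
conn = sqlite3.connect(path)
(balance_after_restart,) = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'savings'"
).fetchone()
conn.close()
```

After the reopen, the balance is the committed value of 1000; the uncommitted debit has left no trace.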
Transactions are processed in a strict chronological order. If transaction n+1 intends to touch the same portion of the database as transaction n, transaction n+1 does not begin until transaction n is committed. Before any transaction is committed, all other transactions affecting the same part of the system must also be committed; there can be no "holes" in the sequence of preceding transactions.
Methodology
The basic principles of all transaction-processing systems are the same. However, the terminology may vary from one transaction-processing system to another, and the terms used below are not necessarily universal.
Rollback
Transaction-processing systems ensure database integrity by recording intermediate states of the database as it is modified, then using these records to restore the database to a known state if a transaction cannot be committed. For example, copies of information on the database prior to its modification by a transaction are set aside by the system before the transaction can make any modifications (this is sometimes called a before image). If any part of the transaction fails before it is committed, these copies are used to restore the database to the state it was in before the transaction began (rollback).
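The before-image idea can be illustrated with a toy in-memory key-value store. This is a minimal sketch, not how any particular product implements rollback; the `BeforeImageStore` class and its method names are hypothetical.

```python
class BeforeImageStore:
    """Toy key-value store that rolls back using before images."""

    _MISSING = object()  # marks a key that did not exist before the transaction

    def __init__(self, data=None):
        self.data = dict(data or {})
        self._before = None  # before images of the active transaction, if any

    def begin(self):
        if self._before is not None:
            raise RuntimeError("a transaction is already active")
        self._before = {}

    def write(self, key, value):
        if self._before is None:
            raise RuntimeError("no active transaction")
        # Set the old value aside the first time a key is modified.
        if key not in self._before:
            self._before[key] = self.data.get(key, self._MISSING)
        self.data[key] = value

    def commit(self):
        # The changes stay; the before images are no longer needed.
        self._before = None

    def rollback(self):
        if self._before is None:
            raise RuntimeError("no active transaction")
        # Restore every modified key to its before image.
        for key, old in self._before.items():
            if old is self._MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old
        self._before = None


store = BeforeImageStore({"savings": 1000, "checking": 200})

store.begin()
store.write("savings", 300)
store.write("audit", "pending")  # a key that did not exist before
store.rollback()
after_rollback = dict(store.data)

store.begin()
store.write("savings", 300)
store.write("checking", 900)
store.commit()
after_commit = dict(store.data)
```

Only the first write to each key saves a before image, so repeated writes within one transaction still roll back to the state at the start of the transaction.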
Rollforward
It is also possible to keep a separate journal of all modifications to a database (sometimes called after images); this is not required for rollback of failed transactions, but it is useful for updating the database in the event of a database failure, so some transaction-processing systems provide it. If the database fails entirely, it must be restored from the most recent back-up. The back-up will not reflect transactions committed since the back-up was made. However, once the database is restored, the journal of after images can be applied to the database (rollforward) to bring the database up to date. Any transactions in progress at the time of the failure can then be rolled back. The result is a database in a consistent, known state that includes the results of all transactions committed up to the moment of failure.
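A rollforward can be sketched with a plain list acting as the journal of after images. The record layout and the `run_transaction` and `roll_forward` helpers are hypothetical. As a simplification, this sketch skips the after images of transactions that never committed instead of applying them and rolling them back afterwards; the resulting state is the same.

```python
def run_transaction(db, journal, txn_id, writes):
    """Apply `writes` to `db`, journaling an after image for each one."""
    for key, value in writes.items():
        db[key] = value
        journal.append(("write", txn_id, key, value))  # after image
    journal.append(("commit", txn_id))


def roll_forward(backup, journal):
    """Rebuild the database from a back-up plus the journal."""
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    db = dict(backup)
    for rec in journal:
        # Replay after images in journal order, committed transactions only.
        if rec[0] == "write" and rec[1] in committed:
            _, _, key, value = rec
            db[key] = value
    return db


# Hypothetical scenario: a back-up is taken, then work continues.
backup = {"savings": 1000, "checking": 200}
db = dict(backup)
journal = []

run_transaction(db, journal, 1, {"savings": 300, "checking": 900})

# Transaction 2 was still in progress (no commit record) at the failure.
journal.append(("write", 2, "savings", 0))

recovered = roll_forward(backup, journal)
```

The recovered database contains the committed transfer from transaction 1 and none of the unfinished work from transaction 2.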
Deadlocks
In some cases, two transactions may, in the course of their processing, attempt to access the same portion of a database at the same time, in a way that prevents them from proceeding. For example, transaction A may access portion X of the database, and transaction B may access portion Y of the database. If, at that point, transaction A then tries to access portion Y of the database while transaction B tries to access portion X, a deadlock occurs, and neither transaction can move forward. Transaction-processing systems are designed to detect these deadlocks when they occur. Typically both transactions will be canceled and rolled back, and then they will be started again in a different order, automatically, so that the deadlock does not occur again. Alternatively, just one of the deadlocked transactions will be canceled, rolled back, and automatically restarted after a short delay.
Deadlocks can also occur between three or more transactions. The more transactions involved, the more difficult they are to detect, to the point that transaction-processing systems face a practical limit on the deadlocks they can detect.
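One common way to detect such situations, which the text above does not prescribe, is to record which transaction is waiting for which and look for a cycle in that "waits-for" relation. The sketch below assumes a hypothetical `waits_for` dictionary mapping each transaction to the transactions it is blocked on, and uses a depth-first search to find a cycle of any length.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a waits-for cycle, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for other in waits_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GREY:
                # `other` is already on the current path: a cycle.
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in list(waits_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None


# A waits for B (which holds Y) while B waits for A (which holds X).
two_way = find_deadlock({"A": ["B"], "B": ["A"]})

# A deadlock among three transactions.
three_way = find_deadlock({"A": ["B"], "B": ["C"], "C": ["A"]})

# B is not waiting for anyone, so A will eventually proceed.
no_deadlock = find_deadlock({"A": ["B"], "B": []})
```

A system that finds a cycle could then cancel and roll back one or all of the transactions in it, as described above.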
ACID criteria (Atomicity, Consistency, Isolation, Durability)
Transaction processing has these benefits:
- It allows sharing of computer resources among many users
- It shifts the time of job processing to when the computing resources are less busy
- It avoids idling the computing resources without minute-by-minute human interaction and supervision
- It is used on expensive classes of computers to help amortize the cost by keeping high rates of utilization of those expensive resources
- A transaction is an atomic unit of processing: it is either performed in its entirety or not performed at all.
Implementations
Standard transaction-processing software, notably IBM's Information Management System, was first developed in the 1960s, and was often closely coupled to particular database management systems. Client-server computing implemented similar principles in the 1980s with mixed success. However, in more recent years, the distributed client-server model has become considerably more difficult to maintain. As the number of transactions grew in response to various online services (especially the Web), a single distributed database was not a practical solution. In addition, most online systems consist of a whole suite of programs operating together, as opposed to a strict client-server model where the single server could handle the transaction processing. Today a number of transaction-processing systems are available that work at the inter-program level and which scale to large systems, including mainframes.
An important open industry standard is the X/Open Distributed Transaction Processing (DTP) model (see JTA). However, proprietary transaction-processing environments such as IBM's CICS are still very popular, although CICS has evolved to include open industry standards as well.
A modern transaction-processing implementation combines elements of object-oriented persistence with traditional transaction monitoring. One such implementation is the commercial DTS/S1 product from Obsidian Dynamics.
See also
- ACID
- ACMS
- Audit trail
- CICS
- IBM TXSeries (CICS on distributed platforms)
- Database transaction
- Extreme Transaction Processing (XTP)
- IMS
- Java EE (e.g. WebSphere Application Server)
- Java Transaction API (JTA)
- Two-phase commit
- Transaction Processing Facility
- Tuxedo (software)
Further reading
- Gerhard Weikum, Gottfried Vossen, Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery, Morgan Kaufmann, 2002, ISBN 1-55860-508-8
- Jim Gray, Andreas Reuter, Transaction Processing: Concepts and Techniques, Morgan Kaufmann, 1993, ISBN 1-55860-190-2
- Philip A. Bernstein, Eric Newcomer, Principles of Transaction Processing, Morgan Kaufmann, 1997, ISBN 1-55860-415-4
- Ahmed K. Elmagarmid (Editor), Transaction Models for Advanced Database Applications, Morgan Kaufmann, 1992, ISBN 1-55860-214-3