Batch processing
Encyclopedia
Batch processing is execution of a series of program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

s ("job
Process (computing)
In computing, a process is an instance of a computer program that is being executed. It contains the program code and its current activity. Depending on the operating system , a process may be made up of multiple threads of execution that execute instructions concurrently.A computer program is a...

s") on a computer
Computer
A computer is a programmable machine designed to sequentially and automatically carry out a sequence of arithmetic or logical operations. The particular sequence of operations can be changed readily, allowing the computer to solve more than one kind of problem...

 without manual intervention.

Batch jobs are set up so they can be run to completion without manual intervention, so all input data is preselected through scripts or command-line parameters. This is in contrast to "online" or interactive programs
Interactive computing
In computer science, interactive computing refers to software which accepts input from humans — for example, data or commands. Interactive software includes most popular programs, such as word processors or spreadsheet applications. By comparison, noninteractive programs operate without human...

 which prompt the user for such input.
A program takes a set of data files as input, processes the data, and produces a set of output data files. This operating environment is termed as "batch processing" because the input data are collected into batches
At (Unix)
In Unix-like computer operating systems,the at commandis used to schedule commands to be executed once, at a particular time in the future....

 of files and are processed in batches by the program.

Benefits

Batch processing has these benefits:
  • It allows sharing of computer resources among many users and programs.
  • It shifts the time of job processing to when the computing resources are less busy.
  • It avoids idling the computing resources with minute-by-minute manual intervention and supervision.
  • By keeping high overall rate of utilization, it better amortizes the cost of a computer, especially an expensive one.

History

Batch processing has been associated with mainframe computer
Mainframe computer
Mainframes are powerful computers used primarily by corporate and governmental organizations for critical applications, bulk data processing such as census, industry and consumer statistics, enterprise resource planning, and financial transaction processing.The term originally referred to the...

s since the earliest days of electronic computing in the 1950s. There were a variety of reasons why batch processing dominated early computing. One reason is that the most urgent business problems for reasons of profitability and competitiveness were primarily accounting problems, such as billing. Billing is inherently a batch-oriented business process, and practically every business must bill, reliably and on-time. Also, every computing resource was expensive, so sequential submission of batch jobs matched the resource constraints and technology evolution at the time. Later, interactive sessions
Read-eval-print loop
A read–eval–print loop , also known as an interactive toplevel, is a simple, interactive computer programming environment. The term is most usually used to refer to a Lisp interactive environment, but can be applied to command line shells and similar environments for F#, Smalltalk, Standard ML,...

 with either text-based computer terminal
Computer terminal
A computer terminal is an electronic or electromechanical hardware device that is used for entering data into, and displaying data from, a computer or a computing system...

 interfaces or graphical user interface
Graphical user interface
In computing, a graphical user interface is a type of user interface that allows users to interact with electronic devices with images rather than text commands. GUIs can be used in computers, hand-held devices such as MP3 players, portable media players or gaming devices, household appliances and...

s became more common. However, computers initially were not even capable of having multiple programs
Multiprogramming
Computer multiprogramming is the allocation of a computer system and its resources to more than one concurrent application, job or user ....

 loaded into the main memory.

Batch processing is still pervasive in mainframe computing, but practically all types of computers are now capable of at least some batch processing, even if only for "housekeeping" tasks. That includes UNIX
Unix
Unix is a multitasking, multi-user computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, and Joe Ossanna...

-based computers, Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

, Mac OS X
Mac OS X
Mac OS X is a series of Unix-based operating systems and graphical user interfaces developed, marketed, and sold by Apple Inc. Since 2002, has been included with all new Macintosh computer systems...

, and even smartphones, increasingly. Virus scanning is a form of batch processing, and so are scheduled jobs that periodically delete temporary files that are no longer required. E-mail systems frequently have batch jobs that periodically archive and compress old messages. As computing in general becomes more pervasive in society and in the world, so too will batch processing.

Modern systems

Despite their long history, batch applications are still critical in most organizations in large part because many core business processes are inherently batch-oriented. (Billing is a notable example that nearly every business requires to function.) While online systems can also function when manual intervention is not desired, they are not typically optimized to perform high-volume, repetitive tasks. Therefore, even new systems usually contain one or more batch applications for updating information at the end of the day, generating reports, printing documents, and other non-interactive tasks that must complete reliably within certain business deadlines.

Modern batch applications make use of modern batch frameworks such as Spring Batch
Spring Batch
Spring Batch is an open source framework for batch processing. It is a lightweight, comprehensive solution designed to enable the development of robust batch applications vital for the daily operations of enterprise systems...

, which is written for Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...

, and other frameworks for other programming languages, to provide the fault tolerance and scalability
Scalability
In electronics scalability is the ability of a system, network, or process, to handle growing amount of work in a graceful manner or its ability to be enlarged to accommodate that growth...

 required for high-volume processing. In order to ensure high-speed processing, batch applications are often integrated with grid computing
Grid computing
Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files...

 solutions to partition
Partition of a set
In mathematics, a partition of a set X is a division of X into non-overlapping and non-empty "parts" or "blocks" or "cells" that cover all of X...

 a batch job over a large number of processors, although there are significant programming challenges in doing so. High volume batch processing places particularly heavy demands on system and application architectures as well. Architectures that feature strong input/output
Input/output
In computing, input/output, or I/O, refers to the communication between an information processing system , and the outside world, possibly a human, or another information processing system. Inputs are the signals or data received by the system, and outputs are the signals or data sent from it...

 performance and vertical scalability
Scalability
In electronics scalability is the ability of a system, network, or process, to handle growing amount of work in a graceful manner or its ability to be enlarged to accommodate that growth...

, including modern mainframe computers, tend to provide better batch performance than alternatives.

Scripting languages became popular as they evolved along with batch processing.

Data processing

A typical batch processing schedule includes end of day
End Of Day
End of day, end of business, close of business or COB is the end of the trading day in financial markets, the point when trading ceases. In some markets it is actually defined as the point in time a few minutes prior to the actual cessation of trading, when the regular traders' orders are no longer...

- reporting (EOD). Historically, many systems had a batch window where online subsystems were turned off and the system capacity was used to run jobs common to all data (accounts, users, or customers) on a system. In a bank, for example, EOD jobs include interest calculation, generation of reports and data sets to other systems, printing (statements), and payment processing. Many businesses have moved to concurrent online and batch architectures in order to support globalization
Globalization
Globalization refers to the increasingly global relationships of culture, people and economic activity. Most often, it refers to economics: the global distribution of the production of goods and services, through reduction of barriers to international trade such as tariffs, export fees, and import...

, the Internet
Internet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...

, and other relatively newer business demands. Such architectures place unique stresses on system design, programming techniques, availability engineering, and IT service delivery.

Databases

Batch processing is also used for efficient bulk database updates and automated transaction processing
Transaction processing
In computer science, transaction processing is information processing that is divided into individual, indivisible operations, called transactions. Each transaction must succeed or fail as a complete unit; it cannot remain in an intermediate state...

, as contrasted to interactive online transaction processing
Online transaction processing
Online transaction processing, or OLTP, refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing...

 (OLTP) applications. The extract, transform, load
Extract, transform, load
Extract, transform and load is a process in database usage and especially in data warehousing that involves:* Extracting data from outside sources* Transforming it to fit operational needs...

 (ETL) step in populating data warehouses is inherently a batch process in most implementations.

Images

Batch processing is often used to perform various operations with digital image
Digital image
A digital image is a numeric representation of a two-dimensional image. Depending on whether or not the image resolution is fixed, it may be of vector or raster type...

s. There exist computer programs that let one resize, convert, watermark, or otherwise edit image files.

Converting

Batch processing is also used for converting a number of computer files from one format to another. This is to make files portable and versatile especially for proprietary and legacy files where viewers are not easy to come by.

Notable batch scheduling and execution environments

UNIX utilizes cron and at facilities to allow for scheduling of complex job scripts.
Windows has a job scheduler
Job scheduler
A job scheduler is a software application that is in charge of unattended background executions, commonly known for historical reasons as batch processing....

. Most high-performance computing
High-performance computing
High-performance computing uses supercomputers and computer clusters to solve advanced computation problems. Today, computer systems approaching the teraflops-region are counted as HPC-computers.-Overview:...

 clusters
Cluster (computing)
A computer cluster is a group of linked computers, working together closely thus in many respects forming a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks...

 use batch processing to maximize cluster usage.

IBM's z/OS
Z/OS
z/OS is a 64-bit operating system for mainframe computers, produced by IBM. It derives from and is the successor to OS/390, which in turn followed a string of MVS versions.Starting with earliest:*OS/VS2 Release 2 through Release 3.8...

 has arguably the most highly refined and evolved set of batch processing facilities owing to its origins, long history, and continuing evolution, and today such systems commonly support hundreds or even thousands of concurrent online and batch tasks within a single operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

 image. Mainframe-unique technologies that aid concurrent batch and online processing include Job Control Language
Job Control Language
Job Control Language is a scripting language used on IBM mainframe operating systems to instruct the system on how to run a batch job or start a subsystem....

 (JCL), scripting languages such as REXX
REXX
REXX is an interpreted programming language that was developed at IBM. It is a structured high-level programming language that was designed to be both easy to learn and easy to read...

, Job Entry Subsystem (JES2 and JES3), Workload Manager
Workload Manager
In IBM mainframes, Workload Manager is a base component of MVS/ESA mainframe operating system, and its successors up to and including z/OS. It controls the access to system resources for the work executing on z/OS based on administrator-defined goals. Workload Manager components also exist for...

 (WLM), Automatic Restart Manager (ARM), Resource Recovery Services (RRS), DB2
IBM DB2
The IBM DB2 Enterprise Server Edition is a relational model database server developed by IBM. It primarily runs on Unix , Linux, IBM i , z/OS and Windows servers. DB2 also powers the different IBM InfoSphere Warehouse editions...

 data sharing, Parallel Sysplex
Parallel Sysplex
In computing, a Parallel Sysplex is a cluster of IBM mainframes acting together as a single system image with z/OS. Used for disaster recovery, Parallel Sysplex combines data sharing and parallel computing to allow a cluster of up to 32 systems to share a workload for high performance and high...

, unique performance optimizations such as HiperDispatch
HiperDispatch
HiperDispatch is a workload dispatching feature found in the newest IBM mainframe models running recent releases of z/OS. HiperDispatch was introduced in February, 2008....

, I/O channel architecture, and several others.

See also

  • Batch renaming
    Batch renaming
    Batch renaming is a form of batch processing used to rename multiple computer files and folders in an automated fashion, in order to save time and reduce the amount of work involved. Some sort of software is required to do this. Such software can be more or less advanced, but most have the same...

     - to rename lots of files automatically without human intervention, in order to save time and effort
  • Batch-queuing system - for schedulers that plan the execution of batch jobs
  • Job Processing Cycle
    Job processing cycle
    In mainframe operating systems, when a job is submitted, the operating system takes the job through a series of steps known as the job processing cycle. A large number of jobs is usually executed in a given time period, such as night, which is a scheme historically known as batch processing...

     - for detailed description of batch processing in the mainframe field
  • BatchPipes
    BatchPipes
    In IBM mainframes, BatchPipes is a batch job processing utility designed for the MVS/ESA operating system, and all later incarnations—OS/390 and z/OS.-Core function:...

    - for utility that increases batch performance

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK