All Topics  
Analysis of algorithms

 

   Email Print
   Bookmark   Link






 

Analysis of algorithms



 
 
To analyze an algorithm
Algorithm

In mathematics, computing, linguistics and related subjects, an algorithm is a sequence of finite instructions, often used for calculation and data processing....
 is to determine the amount of resources (such as time and storage) necessary to execute it. Most algorithms are designed to work with inputs of arbitrary length. Usually the efficiency or complexity
Computational complexity theory

Computational complexity theory, as a branch of the theory of computation in computer science, investigates the problems related to the Computational resource required for the execution of algorithms , and the inherent difficulty in providing efficient algorithms for specific computational problems....
 of an algorithm is stated as a function relating the input length
Problem size

In the fields of Analysis of algorithms and computational complexity theory, the running time or space requirements of an algorithm are expressed as a function of the problem size....
 to the number of steps (time complexity) or storage locations (space complexity).

Algorithm analysis is an important part of a broader computational complexity theory
Computational complexity theory

Computational complexity theory, as a branch of the theory of computation in computer science, investigates the problems related to the Computational resource required for the execution of algorithms , and the inherent difficulty in providing efficient algorithms for specific computational problems....
, which provides theoretical estimates for the resources needed by any algorithm which solves a given computational problem.






Discussion
Ask a question about 'Analysis of algorithms'
Start a new discussion about 'Analysis of algorithms'
Answer questions from other users
Full Discussion Forum



Encyclopedia


To analyze an algorithm
Algorithm

In mathematics, computing, linguistics and related subjects, an algorithm is a sequence of finite instructions, often used for calculation and data processing....
 is to determine the amount of resources (such as time and storage) necessary to execute it. Most algorithms are designed to work with inputs of arbitrary length. Usually the efficiency or complexity
Computational complexity theory

Computational complexity theory, as a branch of the theory of computation in computer science, investigates the problems related to the Computational resource required for the execution of algorithms , and the inherent difficulty in providing efficient algorithms for specific computational problems....
 of an algorithm is stated as a function relating the input length
Problem size

In the fields of Analysis of algorithms and computational complexity theory, the running time or space requirements of an algorithm are expressed as a function of the problem size....
 to the number of steps (time complexity) or storage locations (space complexity).

Algorithm analysis is an important part of a broader computational complexity theory
Computational complexity theory

Computational complexity theory, as a branch of the theory of computation in computer science, investigates the problems related to the Computational resource required for the execution of algorithms , and the inherent difficulty in providing efficient algorithms for specific computational problems....
, which provides theoretical estimates for the resources needed by any algorithm which solves a given computational problem. These estimates provide an insight into reasonable directions of search of efficient algorithms.

In theoretical analysis of algorithms it is common to estimate their complexity in asymptotic sense, i.e., to estimate the complexity function for reasonably large length of input. Big O notation
Big O notation

In mathematics, big O notation describes the asymptotic analysis of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions....
, omega
Big O notation

In mathematics, big O notation describes the asymptotic analysis of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions....
 notation and theta
Big O notation

In mathematics, big O notation describes the asymptotic analysis of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions....
 notation are used to this end. For instance, binary search is said to run an amount of steps proportional to a logarithm, or in O(log(n)), colloquially "in logarithmic time". Usually asymptotic estimates are used because different implementation
Implementation

Implementation is the realization of an application, or execution of a plan, idea, model, design, specification, Standardization, algorithm, or policy....
s of the same algorithm may differ in efficiency. However the efficiencies of any two "reasonable" implementations of a given algorithm are related by a constant multiplicative factor called hidden constant.

Exact (not asymptotic) measures of efficiency can sometimes be computed but they usually require certain assumptions concerning the particular implementation of the algorithm, called model of computation
Model of computation

In computability theory and computational complexity theory, a model of computation is the definition of the set of allowable operations used in computation and their respective costs....
. A model of computation may be defined in terms of an abstract computer
Abstract machine

An abstract machine, also called an abstract computer, is a theoretical model of a computer hardware or software system used in Automata theory....
, e.g., Turing machine
Turing machine

Turing machines are basic abstract symbol-manipulating devices which, despite their simplicity, can be adapted to simulate the logic of any computer algorithm....
, and/or by postulating that certain operations are executed in unit time. For example, if the sorted set to which we apply binary search has N elements, and we can guarantee that a single binary lookup can be done in unit time, then at most log2 N + 1 time units are needed to return an answer.

Exact measures of efficiency are useful to the people who actually implement and use algorithms, because they are more precise and thus enable them to know how much time they can expect to spend in execution. To some people (e.g. game programmers), a hidden constant can make all the difference between success and failure.

Time efficiency estimates depend on what we define to be a step. For the analysis to make sense, the time required to perform a step must be guaranteed to be bounded above by a constant. One must be careful here; for instance, some analyses count an addition of two numbers as a step. This assumption may not be warranted in certain contexts. For example, if the numbers involved in a computation may be arbitrarily large, addition no longer can be assumed to require constant time (compare the time you need to add two 2-digit integers and two 1000-digit integers using a pen and paper).

Run-time analysis

Run-time analysis is a theoretical classification that estimates and anticipates the increase in running time
DTIME

In computational complexity theory, DTIME is the computational resource of computation time for a deterministic Turing machine. It represents the amount of time that a "normal" physical computer would take to solve a certain computational problem using a certain algorithm....
 (or run-time) of an algorithm
Algorithm

In mathematics, computing, linguistics and related subjects, an algorithm is a sequence of finite instructions, often used for calculation and data processing....
 as its input size
Input

Input is the term denote either an entrance or changes which are inserted into a system and which activate/modify a process. It is an abstract concept, used in the model ing, system design and system exploitation....
 (usually denoted as n) increases. Run-time efficiency is a topic of great interest in Computer Science
Computer science

Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems....
: A program
Computer program

Computer programs are Instruction for a computer. A computer requires programs to function. Moreover, a computer program does not run unless its instructions are executed by a Central processing unit; however, a program may communicate an Algorithm#Formalization of algorithms to people without running....
 can take seconds, hours or even years to finish executing, depending on which algorithm it implements (see also performance analysis
Performance analysis

In software engineering, performance analysis, more commonly today known as profiling, is the investigation of a program's behavior using information gathered as the program executes ....
, which is the analysis of an algorithm's run-time in practice).

Shortcomings of empirical metrics


Since algorithms are platform-independent (i.e. a given algorithm can be implemented in an arbitrary programming language
Programming language

A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer....
 on an arbitrary computer
Computer

A computer is a machine that manipulates Data according to a list of Code .The first devices that resemble modern computers date to the mid-20th century , although the computer concept and various machines similar to computers existed earlier....
 running an arbitrary operating system
Operating system

An operating system is an interface between hardware and applications; it is responsible for the management and coordination of activities and the sharing of the limited resources of the computer....
), there are significant drawbacks to using an empirical
Empirical

The word empirical denotes information gained by means of observation, experience, or experiment, as opposed to theory. A central concept in science and the scientific method is that all evidence must be empirical, or empirically based, that is, dependent on evidence or Logical consequence that are observable by the senses....
 approach to gauge the comparative performance of a given set of algorithms.

Take as an example a program that looks up a specific entry in a sorted
Collation

Collation is the assembly of written information into a standard order. One common type of collation is called alphabetisation, though collation is not limited to ordering letters of the alphabet....
 list
List (computing)

In computer science, a list is an ordered Multiset of entity/items.In the context of object-oriented programming languages, a list is defined as an instance of an abstract data type , formalizing the concept of an order theoryed Collection class of entity....
 of size n. Suppose this program were implemented on Computer A, a state-of-the-art machine, using a linear search
Linear search

In computer science, linear search is a search algorithm, also known as sequential search, that is suitable for searching a list of data for a particular value....
 algorithm, and on Computer B, a much slower machine, using a binary search algorithm. Benchmark testing
Benchmark (computing)

In computing, a benchmark is the act of running a computer program, a set of programs, or other operations, in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it....
 on the two computers running their respective programs might look something like the following:

n (list size) Computer A run-time
(in nanoseconds)
Computer B run-time
(in nanoseconds)
15 7 ns 100,000 ns
65 32 ns 150,000 ns
250 128 ns 200,000 ns
1,000 500 ns 250,000 ns


Based on these metrics
Metrics

A metric is a standard unit of measure, such as meter or mile for length, or gram or ton for weight, or more generally, part of a system of parameters, or systems of measurement, or a set of ways of quantitatively and periodically measuring, assessing, controlling or selecting a person, process, event, or institution, along with the procedure...
, it would be easy to jump to the conclusion that Computer A is running an algorithm that is far superior in efficiency to what Computer B is running. However, if the size of the input-list is increased to a sufficient number, that conclusion is dramatically demonstrated to be in error:

n (list size) Computer A run-time
(in nanoseconds)
Computer B run-time
(in nanoseconds)
15 7 ns 100,000 ns
65 32 ns 150,000 ns
250 125 ns 200,000 ns
1,000 500 ns 250,000 ns
... ... ...
1,000,000 500,000 ns 500,000 ns
4,000,000 2,000,000 ns 550,000 ns
16,000,000 8,000,000 ns 600,000 ns
... ... ...
63,072 × 1012 31,536 × 1012 ns,
or 1 year
1,375,000 ns,
or 1.375 milliseconds


Computer A, running the linear search program, exhibits a linear
Linear

The word linear comes from the Latin word linearis, which means created by lines.In mathematics, a linear map or function f is a function which satisfies the following two properties......
 growth rate
Growth rate

Growth rate may refer to:*Exponential growth, a growth rate classification*Compound annual growth rate or CAGR, a measure of financial growth...
. The program's run-time is directly proportional to its input size. Doubling the input size doubles the run time, quadrupling the input size quadruples the run-time, and so forth. Whereas Computer B, running the binary search program, exhibits a logarithm
Logarithm

In mathematics, the logarithm of a number to a given base is the Power or exponent to which the base must be raised in order to produce the number....
ic growth rate. Doubling the input size only increases the run time by a constant amount (in this example, 25,000 ns). Even though Computer A is ostensibly a faster machine, Computer B will inevitably surpass Computer A in run-time because it's running an algorithm with a much slower growth rate.

Orders of growth

Informally, an algorithm can be said to exhibit a growth rate on the order of a mathematical function
Function (mathematics)

The mathematical concept of a function expresses dependence between two quantities, one of which is known and the other which is produced. A function associates a single output to each input element drawn from a fixed Set , such as the real numbers , although different inputs may have the same output....
 if beyond a certain input size n, the function f(n) times a positive constant provides an upper bound or limit
Asymptotic analysis

In pure mathematics and applied mathematics, particularly the analysis of algorithms, real analysis, and engineering, asymptotic analysis is a method of describing Limit ing behaviour....
 for the run-time of that algorithm. In other words, for a given input size n greater than some no and a constant c, an algorithm can run no slower than c × f(n). This concept is frequently expressed using Big O notation (see Big O notation
Big O notation

In mathematics, big O notation describes the asymptotic analysis of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions....
 for a more formal discussion
). For example, since the run-time of insertion sort
Insertion sort

Insertion sort is a simple sorting algorithm, a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort....
 grows quadratically
Quadratic growth

In mathematics, a function or sequence is said to exhibit quadratic growth when its values are proportionality to the squaring of the function argument or sequence position, in the limit as the argument or sequence position goes to infinity....
 as its input size increases, insertion sort can be said to be of order O(nē).

Big O notation is a convenient way to express the worst-case scenario
Best, worst and average case

In computer science, best, worst and average cases of a given algorithm express what the Resource usage is at least, at most and on average, respectively....
 for a given algorithm, although it can also be used to express the average-case — for example, the worst-case scenario for quicksort
Quicksort

Quicksort is a well-known sorting algorithm developed by C. A. R. Hoare that, average performance, makes comparisons to sort n items. However, in the Best, worst and average case, it makes comparisons....
 is O(nē), but the average-case run-time is O(n lg n).

Evaluating run-time complexity

The run-time complexity for the worst-case scenario of a given algorithm can sometimes be evaluated by examining the structure of the algorithm and making some simplifying assumptions. Consider the following pseudocode
Pseudocode

Pseudocode is a compact and informal high-level description of a computer programming algorithm that uses the structural conventions of some programming language, but is intended for human reading rather than machine reading....
:

1 get a positive integer from input 2 if n > 10 3print "This might take a while..." 4 for i = 1 to n 5for j = 1 to i 6 print i * j 7 print "Done!"

A given computer will take a discrete amount of time
DTIME

In computational complexity theory, DTIME is the computational resource of computation time for a deterministic Turing machine. It represents the amount of time that a "normal" physical computer would take to solve a certain computational problem using a certain algorithm....
 to execute each of the instructions
Instruction (computer science)

In computer science, an instruction is a single operation of a central processing unit defined by an instruction set architecture. In a broader sense, an "instruction" may be any representation of an element of an executable program, such as a bytecode....
 involved with carrying out this algorithm. The specific amount of time to carry out a given instruction will vary depending on which instruction is being executed and which computer is executing it, but on a conventional computer, this amount will be deterministic
Deterministic system (mathematics)

In mathematics, a deterministic system is a system in which no randomness is involved in the development of future states of the system. Deterministic mathematical model thus produce the same output for a given starting condition....
. Say that the actions carried out in step 1 are considered to consume time T1, step 2 uses time T2, and so forth.

In the algorithm above, steps 1, 2 and 7 will only be run once. For a worst-case evaluation, it should be assumed that step 3 will be run as well. Thus the total amount of time to run steps 1-3 and step 7 is:

The loops in steps 4, 5 and 6 are trickier to evaluate. The outer loop test in step 4 will execute ( n + 1 ) times (note that an extra step is required to terminate the for loop, hence n + 1 and not n executions), which will consume T4( n + 1 ) time. The inner loop, on the other hand, is governed by the value of i, which iterates
Iteration

Iteration means the act of repeating....
 from 1 to n. On the first pass through the outer loop, j iterates from 1 to 1: The inner loop makes one pass, so running the inner loop body (step 6) consumes T6 time, and the inner loop test (step 5) consumes 2T5 time. During the next pass through the outer loop, j iterates from 1 to 2: the inner loop makes two passes, so running the inner loop body (step 6) consumes 2T6 time, and the inner loop test (step 5) consumes 3T5 time.

Altogether, the total time required to run the inner loop body can be expressed as an arithmetic progression
Arithmetic progression

In mathematics, an arithmetic progression or arithmetic sequence is a sequence of numbers such that the difference of any two successive members of the sequence is a constant....
:

which can be factored
Factorization

In mathematics, factorization or factoring is the decomposition of an object into a product of other objects, or factors, which when multiplication together give the original....
 as

The total time required to run the inner loop test can be evaluated similarly:




which can be factored as

Therefore the total running time for this algorithm is:

which reduces
Reduction (mathematics)

In mathematics, reduction refers to the rewriting of an expression into a simpler form. For example, the process of rewriting a Fraction into one with the smallest whole-number denominator possible is called "reducing a fraction"....
 to

As a rule-of-thumb, one can assume that the highest-order term in any given function dominates its rate of growth and thus defines its run-time order. In this example, nē is the highest-order term, so one can conclude that f(n) = O(nē). Formally this can be proven as follows:
Prove that

(for n = 0)

Let k be a constant greater than or equal to [T1..T7]


(for n = 1)

Therefore for


A more elegant
Elegance

Elegance is the attribute of being unusually effective and simple. It is frequently used as a standard of tastefulness, particularly in the areas of visual design and beauty....
 approach to analyzing this algorithm would be to declare that [T1..T7] are all equal to one unit
Units of measurement

The definition, agreement and practical use of units of measurement have played a crucial role in human endeavour from early ages up to this day....
 of time greater than or equal to [T1..T7]. This would mean that the algorithm's running time breaks down as follows:
(for n = 1)


Growth rate analysis of other resources

The methodology of run-time analysis can also be utilized for predicting other growth rates, such as consumption of memory space
DSPACE

In computational complexity theory, DSPACE or SPACE is the computational resource describing the resource of memory space for a deterministic Turing machine....
. As an example, consider the following pseudocode which manages and reallocates memory usage by a program based on the size of a file
Computer file

A computer file is a block of arbitrary information, or resource for storing information, which is available to a computer program and is usually based on some kind of durable computer storage....
 which that program manages:

1 While file is still open 2Let n = file size 3For every 100,000 kilobyte
Kilobyte

Kilobyte is a unit of Computer data storage equal to either 1,024 bytes or 1,000 bytes , depending on context.It is abbreviated in a number of ways: KB, kB, K and Kbyte....
s of increase in file size 4 Double the amount of memory reserved

In this instance, as the file size n increases, memory will be consumed at an exponential growth
Exponential growth

Exponential growth occurs when the growth rate of a mathematical function is proportionality to the function's current value. In the case of a discrete domain of definition with equal intervals it is also called geometric growth or geometric decay ....
 rate, which is order O(2n).

See also

  • Algorithm
    Algorithm

    In mathematics, computing, linguistics and related subjects, an algorithm is a sequence of finite instructions, often used for calculation and data processing....
    s
  • Algorithmic efficiency
    Algorithmic efficiency

    In computer science, efficiency is used to describe properties of an algorithm relating to how much of various types of resources it consumes. The two most frequently encountered are...
    , optimizing an algorithm for speed, memory or both
  • Asymptotic analysis
    Asymptotic analysis

    In pure mathematics and applied mathematics, particularly the analysis of algorithms, real analysis, and engineering, asymptotic analysis is a method of describing Limit ing behaviour....
  • Big O notation
    Big O notation

    In mathematics, big O notation describes the asymptotic analysis of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions....
  • Best, worst and average case
    Best, worst and average case

    In computer science, best, worst and average cases of a given algorithm express what the Resource usage is at least, at most and on average, respectively....
  • Computational complexity theory
    Computational complexity theory

    Computational complexity theory, as a branch of the theory of computation in computer science, investigates the problems related to the Computational resource required for the execution of algorithms , and the inherent difficulty in providing efficient algorithms for specific computational problems....
  • Performance analysis
    Performance analysis

    In software engineering, performance analysis, more commonly today known as profiling, is the investigation of a program's behavior using information gathered as the program executes ....
    , measuring an algorithm's actual run-time performance
  • Polynomial time
    Polynomial time

    In computational complexity theory, polynomial time refers to the computation time of a problem where the run time, , is no greater than a polynomial function of the problem size, n....
  • NP-Complete
    NP-complete

    In computational complexity theory, the complexity class NP-complete is a class of problems having two properties:* Any given solution to the problem can be verified quickly ; the set of problems with this property is called NP ....
  • Donald Knuth
    Donald Knuth

    Donald Ervin Knuth is a renowned computer science and Emeritus of the Art of Computer Programming at Stanford University.Author of the seminal multi-volume work The Art of Computer Programming , Knuth has been called the "father" of the run-time analysis, contributing to the development of, and systematizing formal mathematical techn...