Integer factorization - AbsoluteAstronomy.com

Number theory

Number theory is a branch of pure mathematics devoted primarily to the study of the integers. Number theorists study prime numbers as well...

, integer factorization or prime factorization is the decomposition of a composite number

Composite number

A composite number is a positive integer which has a positive divisor other than one or itself. In other words a composite number is any positive integer greater than one that is not a prime number....

into smaller non-trivial divisor

Divisor

In mathematics, a divisor of an integer n, also called a factor of n, is an integer which divides n without leaving a remainder.-Explanation:...

s, which when multiplied together equal the original integer.

When the numbers are very large, no efficient integer factorization

Factorization

In mathematics, factorization or factoring is the decomposition of an object into a product of other objects, or factors, which when multiplied together give the original...

algorithm

Algorithm

In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

is known; an effort concluded in 2009 by several researchers factored a 232-digit number (RSA-768), utilizing hundreds of machines over a span of 2 years. The presumed difficulty of this problem is at the heart of certain algorithms in cryptography

Cryptography

Cryptography is the practice and study of techniques for secure communication in the presence of third parties...

such as RSA. Many areas of mathematics

Mathematics

Mathematics is the study of quantity, space, structure, and change. Mathematicians seek out patterns and formulate new conjectures. Mathematicians resolve the truth or falsity of conjectures by mathematical proofs, which are arguments sufficient to convince other mathematicians of their validity...

and computer science

Computer science

Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

have been brought to bear on the problem, including elliptic curves, algebraic number theory

Algebraic number theory

Algebraic number theory is a major branch of number theory which studies algebraic structures related to algebraic integers. This is generally accomplished by considering a ring of algebraic integers O in an algebraic number field K/Q, and studying their algebraic properties such as factorization,...

, and quantum computing

Quantum computer

A quantum computer is a device for computation that makes direct use of quantum mechanical phenomena, such as superposition and entanglement, to perform operations on data. Quantum computers are different from traditional computers based on transistors...

.

Not all numbers of a given length are equally hard to factor. The hardest instances of these problems (for currently known techniques) are semiprime

Semiprime

In mathematics, a semiprime is a natural number that is the product of two prime numbers. The first few semiprimes are 4, 6, 9, 10, 14, 15, 21, 22, 25, 26, ... ....

s, the product of two prime number

Prime number

A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself. A natural number greater than 1 that is not a prime number is called a composite number. For example 5 is prime, as only 1 and 5 divide it, whereas 6 is composite, since it has the divisors 2...

s. When they are both large, randomly chosen, and about the same size (but not too close, e.g. to avoid efficient factorization by Fermat's factorization method

Fermat's factorization method

Fermat's factorization method, named after Pierre de Fermat, is based on the representation of an odd integer as the difference of two squares:N = a^2 - b^2.\...

), even the fastest prime factorization algorithms on the fastest computers can take enough time to make the search impractical.

Many cryptographic protocols are based on the difficulty of factoring large composite integers or a related problem, the RSA problem

RSA problem

In cryptography, the RSA problem summarizes the task of performing an RSA private-key operation given only the public key. The RSA algorithm raises a message to an exponent, modulo a composite number N whose factors are not known. As such, the task can be neatly described as finding the eth roots...

. An algorithm that efficiently factors an arbitrary integer would render RSA-based public-key cryptography insecure.

Prime decomposition

By the fundamental theorem of arithmetic

Fundamental theorem of arithmetic

In number theory, the fundamental theorem of arithmetic states that any integer greater than 1 can be written as a unique product of prime numbers...

, every positive integer has a unique prime factorization. (A special case for 1 is not needed using an appropriate notion of the empty product

Empty product

In mathematics, an empty product, or nullary product, is the result of multiplying no factors. It is equal to the multiplicative identity 1, given that it exists for the multiplication operation in question, just as the empty sum—the result of adding no numbers—is zero, or the additive...

.) However, the fundamental theorem of arithmetic gives no insight into how to obtain an integer's prime factorization; it only guarantees its existence.

Given a general algorithm for integer factorization, one can factor any integer down to its constituent prime factor

Prime factor

In number theory, the prime factors of a positive integer are the prime numbers that divide that integer exactly, without leaving a remainder. The process of finding these numbers is called integer factorization, or prime factorization. A prime factor can be visualized by understanding Euclid's...

s by repeated application of this algorithm. However, this is not the case with a special-purpose factorization algorithm, since it may not apply to the smaller factors that occur during decomposition, or may execute very slowly on these values. For example, trial division

Trial division

Trial division is the most laborious but easiest to understand of the integer factorization algorithms. Its ease of implementation makes it a viable integer factorization option for devices with little available memory, such as graphing calculators....

will quickly factor 2 × (2⁵²¹ − 1) × (2⁶⁰⁷ − 1), but will not quickly factor the resulting factors.

Current state of the art

The most difficult integers to factor in practice using existing algorithms are those that are products of two large primes of similar size, and for this reason these are the integers used in cryptographic applications. The largest such semiprime yet factored was RSA-768, a 768-bit number with 232 decimal digits, on December 12, 2009. This factorization was a collaboration of several research institutions, spanning two years and taking the equivalent of almost 2000 years of computing on a single core 2.2 GHz AMD

Advanced Micro Devices

Advanced Micro Devices, Inc. or AMD is an American multinational semiconductor company based in Sunnyvale, California, that develops computer processors and related technologies for commercial and consumer markets...

Opteron

Opteron is AMD's x86 server and workstation processor line, and was the first processor which supported the AMD64 instruction set architecture . It was released on April 22, 2003 with the SledgeHammer core and was intended to compete in the server and workstation markets, particularly in the same...

. Like all recent factorization records, this factorization was completed with a highly-optimized implementation of the general number field sieve

General number field sieve

In number theory, the general number field sieve is the most efficient classical algorithm known for factoring integers larger than 100 digits...

run on hundreds of machines.

Difficulty and complexity

If a large, b-bit

Bit

A bit is the basic unit of information in computing and telecommunications; it is the amount of information stored by a digital device or other physical system that exists in one of two possible distinct states...

number is the product of two primes that are roughly the same size, then no algorithm

Algorithm

has been published that can factor in polynomial time, i.e., that can factor it in time O

Big O notation

In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...

(b^k) for some constant k. There are published algorithms that are faster than O

Big O notation

((1+ε)^b) for all positive ε, i.e., sub-exponential.

The best published asymptotic running time is for the general number field sieve

General number field sieve

In number theory, the general number field sieve is the most efficient classical algorithm known for factoring integers larger than 100 digits...

(GNFS) algorithm, which, for a b-bit number n, is:

For an ordinary computer, GNFS is the best published algorithm for large n (more than about 100 digits). For a quantum computer

Quantum computer

, however, Peter Shor

Peter Shor

Peter Williston Shor is an American professor of applied mathematics at MIT, most famous for his work on quantum computation, in particular for devising Shor's algorithm, a quantum algorithm for factoring exponentially faster than the best currently-known algorithm running on a classical...

discovered an algorithm in 1994 that solves it in polynomial time. This will have significant implications for cryptography if a large quantum computer is ever built. Shor's algorithm

Shor's algorithm

Shor's algorithm, named after mathematician Peter Shor, is a quantum algorithm for integer factorization formulated in 1994...

takes only O(b³) time and O(b) space on b-bit number inputs. In 2001, the first 7-qubit quantum computer became the first to run Shor's algorithm. It factored the number 15.

When discussing what complexity class

Complexity class

In computational complexity theory, a complexity class is a set of problems of related resource-based complexity. A typical complexity class has a definition of the form:...

es the integer factorization problem falls into, it's necessary to distinguish two slightly different versions of the problem:

The function problem
Function problem
In computational complexity theory, a function problem is a computational problem where a single output is expected for every input, but the output is more complex than that of a decision problem, that is, it isn't just YES or NO...

version: given an integer N, find an integer d with 1 < d < N that divides N (or conclude that N is prime). This problem is trivially in FNP
FNP (complexity)
In computational complexity theory, the complexity class FNP is the function problem extension of the decision problem class NP. The name is somewhat of a misnomer, since technically it is a class of binary relations, not functions, as the following formal definition explains:This definition does...

and it's not known whether it lies in FP
FP (complexity)
In computational complexity theory, the complexity class FP is the set of function problems which can be solved by a deterministic Turing machine in polynomial time; it is the function problem version of the decision problem class P...

or not. This is the version solved by most practical implementations.
The decision problem
Decision problem
In computability theory and computational complexity theory, a decision problem is a question in some formal system with a yes-or-no answer, depending on the values of some input parameters. For example, the problem "given two numbers x and y, does x evenly divide y?" is a decision problem...

version: given an integer N and an integer M with 1 ≤ M ≤ N, does N have a factor d with 1 < d < M? This version is useful because most well-studied complexity classes are defined as classes of decision problems, not function problems. This is a natural decision version of the problem, analogous to those frequently used for optimization problem
Optimization problem
In mathematics and computer science, an optimization problem is the problem of finding the best solution from all feasible solutions. Optimization problems can be divided into two categories depending on whether the variables are continuous or discrete. An optimization problem with discrete...

s, because it can be combined with binary search to solve the function problem version in a logarithmic number of queries.

It is not known exactly which complexity classes

Computational complexity theory

Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other...

contain the decision version of the integer factorization problem. It is known to be in both NP

NP (complexity)

In computational complexity theory, NP is one of the most fundamental complexity classes.The abbreviation NP refers to "nondeterministic polynomial time."...

and co-NP. This is because both YES and NO answers can be verified in polynomial time given the prime factors (we can verify their primality using the AKS primality test

AKS primality test

The AKS primality test is a deterministic primality-proving algorithm created and published by three Indian Institute of Technology Kanpur computer scientists, Manindra Agrawal, Neeraj Kayal, and Nitin Saxena, on August 6, 2002, in a paper titled "PRIMES is in P"...

, and that their product is N by multiplication). The fundamental theorem of arithmetic

Fundamental theorem of arithmetic

In number theory, the fundamental theorem of arithmetic states that any integer greater than 1 can be written as a unique product of prime numbers...

guarantees that there is only one possible string that will be accepted (providing the factors are required to be listed in order), which shows that the problem is in both UP
UP (complexity)
In complexity theory, UP is the complexity class of decision problems solvable in polynomial time on a non-deterministic Turing machine with at most one accepting path for each input...

and co-UP. It is known to be in BQP

BQP

In computational complexity theory BQP is the class of decision problems solvable by a quantum computer in polynomial time, with an error probability of at most 1/3 for all instances...

because of Shor's algorithm

Shor's algorithm

Shor's algorithm, named after mathematician Peter Shor, is a quantum algorithm for integer factorization formulated in 1994...

. It is suspected to be outside of all three of the complexity classes P

P (complexity)

In computational complexity theory, P, also known as PTIME or DTIME, is one of the most fundamental complexity classes. It contains all decision problems which can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time.Cobham's thesis holds...

, NP-complete

NP-complete

In computational complexity theory, the complexity class NP-complete is a class of decision problems. A decision problem L is NP-complete if it is in the set of NP problems so that any given solution to the decision problem can be verified in polynomial time, and also in the set of NP-hard...

, and co-NP-complete

Co-NP-complete

In complexity theory, computational problems that are co-NP-complete are those that are the hardest problems in co-NP, in the sense that they are the ones most likely not to be in P...

. It is therefore a candidate for the NP-intermediate complexity class. If it could be proved that it is in either NP-Complete or co-NP-Complete, that would imply NP = co-NP. That would be a very surprising result, and therefore integer factorization is widely suspected to be outside both of those classes. Many people have tried to find classical polynomial-time algorithms for it and failed, and therefore it is widely suspected to be outside P.

In contrast, the decision problem "is N a composite number

Composite number

?" (or equivalently: "is N a prime number

Prime number

?") appears to be much easier than the problem of actually finding the factors of N. Specifically, the former can be solved in polynomial time (in the number n of digits of N) with the AKS primality test

AKS primality test

. In addition, there are a number of probabilistic algorithm

Randomized algorithm

A randomized algorithm is an algorithm which employs a degree of randomness as part of its logic. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random bits...

s that can test primality very quickly in practice if one is willing to accept the vanishingly small possibility of error. The ease of primality test

Primality test

A primality test is an algorithm for determining whether an input number is prime. Amongst other fields of mathematics, it is used for cryptography. Unlike integer factorization, primality tests do not generally give prime factors, only stating whether the input number is prime or not...

ing is a crucial part of the RSA algorithm, as it is necessary to find large prime numbers to start with.

Special-purpose

A special-purpose factoring algorithm's running time depends on the properties of the number to be factored or on one of its unknown factors: size, special form, etc. Exactly what the running time depends on varies between algorithms. For example, trial division is considered special purpose because the running time is roughly proportional to the size of the smallest factor.

Trial division
Trial division
Trial division is the most laborious but easiest to understand of the integer factorization algorithms. Its ease of implementation makes it a viable integer factorization option for devices with little available memory, such as graphing calculators....
Wheel factorization
Wheel factorization
Wheel factorization is a graphical method for manually performing a preliminary to the Sieve of Eratosthenes that separates prime numbers from composites. Start by writing the natural numbers around circles as shown below. Prime numbers in the innermost circle have their multiples in similar...
Pollard's rho algorithm
Pollard's rho algorithm
Pollard's rho algorithm is a special-purpose integer factorization algorithm. It was invented by John Pollard in 1975. It is particularly effective at splitting composite numbers with small factors.-Core ideas:...
Algebraic-group factorisation algorithms, among which are Pollard's p − 1 algorithm, Williams' p + 1 algorithm, and Lenstra elliptic curve factorization
Lenstra elliptic curve factorization
The Lenstra elliptic curve factorization or the elliptic curve factorization method is a fast, sub-exponential running time algorithm for integer factorization which employs elliptic curves. For general purpose factoring, ECM is the third-fastest known factoring method...
Fermat's factorization method
Fermat's factorization method
Fermat's factorization method, named after Pierre de Fermat, is based on the representation of an odd integer as the difference of two squares:N = a^2 - b^2.\...
Euler's factorization method
Euler's factorization method
Euler's factorization method is a technique for factoring a number by writing it as a sum of two squares in two different ways. For example the number 1000009 can be written as 1000^2 + 3^2 or as 972^2 + 235^2 and Euler's method gives the factorization 1000009 = 293 \cdot 3413.The idea that two...
Special number field sieve
Special number field sieve
In number theory, a branch of mathematics, the special number field sieve is a special-purpose integer factorization algorithm. The general number field sieve was derived from it....

General-purpose

A general-purpose factoring algorithm's running time depends solely on the size of the integer to be factored. This is the type of algorithm used to factor RSA numbers. Most general-purpose factoring algorithms are based on the congruence of squares

Congruence of squares

In number theory, a congruence of squares is a congruence commonly used in integer factorization algorithms.-Derivation:Given a positive integer n, Fermat's factorization method relies on finding numbers x, y satisfying the equality...

method.

Dixon's algorithm
Continued fraction factorization
Continued fraction factorization
In number theory, the continued fraction factorization method is an integer factorization algorithm. It is a general-purpose algorithm, meaning that it is suitable for factoring any integer n, not depending on special form or properties. It was described by D. H. Lehmer and R. E. Powers in 1931,...

(CFRAC)
Quadratic sieve
Quadratic sieve
The quadratic sieve algorithm is a modern integer factorization algorithm and, in practice, the second fastest method known . It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve...
General number field sieve
General number field sieve
In number theory, the general number field sieve is the most efficient classical algorithm known for factoring integers larger than 100 digits...
Shanks' square forms factorization
Shanks' square forms factorization
Shanks' square forms factorization is a method for integer factorization devised by Daniel Shanks as an improvement on Fermat's factorization method....

(SQUFOF)

Heuristic running time

In number theory

Number theory

Number theory is a branch of pure mathematics devoted primarily to the study of the integers. Number theorists study prime numbers as well...

, there are many integer factoring

Integer factorization

In number theory, integer factorization or prime factorization is the decomposition of a composite number into smaller non-trivial divisors, which when multiplied together equal the original integer....

algorithms that heuristically have expected running time

Time complexity

In computer science, the time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the size of the input to the problem. The time complexity of an algorithm is commonly expressed using big O notation, which suppresses multiplicative constants and...

in o

Big O notation

and L-notation

L-notation

L-notation is an asymptotic notation analogous to big-O notation, denoted as L_n[\alpha,c] for a bound variable n tending to infinity. Like big-O notation, it is usually used to roughly convey the computational complexity of a particular algorithm....

.
Some examples of those algorithms are the elliptic curve method and the quadratic sieve

Quadratic sieve

The quadratic sieve algorithm is a modern integer factorization algorithm and, in practice, the second fastest method known . It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve...

.
Another such algorithm is the class group relations method proposed by Schnorr, Seysen, and Lenstra that is proved under of the Generalized Riemann Hypothesis (GRH)

Generalized Riemann hypothesis

The Riemann hypothesis is one of the most important conjectures in mathematics. It is a statement about the zeros of the Riemann zeta function. Various geometrical and arithmetical objects can be described by so-called global L-functions, which are formally similar to the Riemann zeta-function...

Rigorous running time

The Schnorr-Seysen-Lenstra probabilistic algorithm has been rigorously proven by Lenstra and Pomerance to have expected running time

by replacing the GRH assumption with the use of multipliers.
The algorithm uses the class group

Ideal class group

In mathematics, the extent to which unique factorization fails in the ring of integers of an algebraic number field can be described by a certain group known as an ideal class group...

of positive binary quadratic forms of discriminant Δ denoted by G_Δ.
G_Δ is the set of triples of integers (a, b, c) in which those integers are relative prime.

Schnorr-Seysen-Lenstra Algorithm

Given is an integer n that will be factored, where n is an odd positive integer greater than a certain constant. In this factoring algorithm the discriminant Δ is chosen as a multiple of n, Δ= -dn, where d is some positive multiplier. The algorithm expects that for one d there exist enough smooth

Smooth number

In number theory, a smooth number is an integer which factors completely into small prime numbers. The term seems to have been coined by Leonard Adleman. Smooth numbers are especially important in cryptography relying on factorization.-Definition:...

forms in G_Δ. Lenstra and Pomerance show that the choice of d can be restricted to a small set to guarantee the smoothness result.

Denote by P_Δ the set of all primes q with Kronecker symbol

Kronecker symbol

In number theory, the Kronecker symbol, written as \left or , is a generalization of the Jacobi symbol to all integers n. It was introduced by Leopold Kronecker.-Definition:...

. By constructing a set of generators

Generating set of a group

In abstract algebra, a generating set of a group is a subset that is not contained in any proper subgroup of the group. Equivalently, a generating set of a group is a subset such that every element of the group can be expressed as the combination of finitely many elements of the subset and their...

of G_Δ and prime forms f_q of G_Δ with q in P_Δ a sequence of relations between the set of generators and f_q are produced.
The size of q can be bounded by

for some constant

.

The relation that will be used is a relation between the product of powers that is equal to the neutral element

Group (mathematics)

In mathematics, a group is an algebraic structure consisting of a set together with an operation that combines any two of its elements to form a third element. To qualify as a group, the set and the operation must satisfy a few conditions called group axioms, namely closure, associativity, identity...

of G_Δ. These relations will be used to construct a so-called ambiguous form of G_Δ, which is an element of G_Δ of order dividing 2. By calculating the corresponding factorization of Δ and by taking a gcd

Greatest common divisor

In mathematics, the greatest common divisor , also known as the greatest common factor , or highest common factor , of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.For example, the GCD of 8 and 12 is 4.This notion can be extended to...

, this ambiguous form provides the complete prime factorization of n. This algorithm has these main steps:

Let n be the number to be factored.

Let Δ be a negative integer with Δ = -dn, where d is a multiplier and Δ is the negative discriminant of some quadratic form.
Take the t first primes , for some .
Let be a random prime form of G_Δ with .
Find a generating set X of G_Δ
Collect a sequence of relations between set X and {f_q : q ∈ P_Δ} satisfying:
Construct an ambiguous form (a, b, c) that is an element f ∈ G_Δ of order dividing 2 to obtain a coprime factorization of the largest odd divisor of Δ in which Δ = -4a.c or a(a - 4c) or (b - 2a).(b + 2a)
If the ambiguous form provides a factorization of n then stop, otherwise find another ambiguous form until the factorization of n is found. In order to prevent useless ambiguous forms from generating, build up the 2-Sylow group S₂(Δ) of G(Δ).

To obtain an algorithm for factoring any positive integer, it is necessary to add a few steps to this algorithm such as trial division

Trial division

, Jacobi sum test

Adleman–Pomerance–Rumely primality test

In computational number theory, the Adleman–Pomerance–Rumely primality test is an algorithm for determining whether a number is prime. Unlike other, more efficient algorithms for this purpose, it avoids the use of random numbers, so it is a deterministic primality test. It is named after its...

Expected running time

The algorithm as stated is a probabilistic algorithm as it makes random choices. Its expected running time is at most

External links

Video explaining uniqueness of prime factorization using a lock analogy.
A collection of links to factoring programs
Richard P. Brent, "Recent Progress and Prospects for Integer Factorisation Algorithms", Computing and Combinatorics", 2000, pp. 3-22. download
Manindra Agrawal
Manindra Agrawal
Manindra Agrawal is a professor at the department of computer science and engineering and the Dean of Resource, Planning and Generation at the Indian Institute of Technology, Kanpur. He is also the recipient of the first Infosys Prize for Mathematics.-Early life:Manindra Agrawal obtained a...

, Neeraj Kayal, Nitin Saxena, "PRIMES is in P." Annals of Mathematics 160(2): 781-793 (2004). August 2005 version PDF
[ftp://ftp.computing.dcu.ie/pub/crypto/factor.exe] is a public-domain integer factorization program for Windows. It claims to handle 80-digit numbers. See also the web site for this program MIRACL
The RSA Challenge Numbers - a factoring challenge, no longer active.
Eric W. Weisstein, “RSA-640 Factored” MathWorld Headline News, November 8, 2005
Qsieve, a suite of programs for integer factorization. It contains several factorization methods like Elliptic Curve Method and MPQS.
Source code by Paolo Ardoino, Three known algorithms and C source code.
Factorization Source Code: by Paul Herman & Ami Fischman, C++ source code for many factorization algorithms including Pollard Rho & Shor's.
a database containing factorizations, complete and incomplete, of over 80 million numbers.

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.