Hopcroft-Karp algorithm - AbsoluteAstronomy.com

Computer science

Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, the Hopcroft–Karp algorithm is an algorithm

Algorithm

In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

that takes as input a bipartite graph

Bipartite graph

In the mathematical field of graph theory, a bipartite graph is a graph whose vertices can be divided into two disjoint sets U and V such that every edge connects a vertex in U to one in V; that is, U and V are independent sets...

and produces as output a maximum cardinality matching – a set of as many edges as possible with the property that no two edges share an endpoint. It runs in O(m√n) time in the worst case, where "O" refers to big O notation

Big O notation

In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...

, m is the number of edges in the graph, and n is the number of vertices of the graph. In the case of dense graph

Dense graph

In mathematics, a dense graph is a graph in which the number of edges is close to the maximal number of edges. The opposite, a graph with only a few edges, is a sparse graph...

s the time bound becomes O(n^5/2), and for random graph

Random graph

In mathematics, a random graph is a graph that is generated by some random process. The theory of random graphs lies at the intersection between graph theory and probability theory, and studies the properties of typical random graphs.-Random graph models:...

s it runs in near-linear time.

The algorithm was found by . As in previous methods for matching such as the Hungarian algorithm

Hungarian algorithm

The Hungarian method is a combinatorial optimization algorithm which solves the assignment problem in polynomial time and which anticipated later primal-dual methods...

and the work of , the Hopcroft–Karp algorithm repeatedly increases the size of a partial matching by finding augmenting paths. However, instead of finding just a single augmenting path per iteration, the algorithm finds a maximal set of shortest augmenting paths. As a result only O(√n) iterations are needed. The same principle has also been used to develop more complicated algorithms for non-bipartite matching with the same asymptotic running time as the Hopcroft–Karp algorithm.

Augmenting paths

A vertex that is not the endpoint of an edge in some partial matching M is called a free vertex. The basic concept that the algorithm relies on is that of an augmenting path, a path that starts at a free vertex, ends at a free vertex, and alternates between unmatched and matched edges within the path. If M is a matching of size n, and P is an augmenting path relative to M, then the symmetric difference

Symmetric difference

In mathematics, the symmetric difference of two sets is the set of elements which are in either of the sets and not in their intersection. The symmetric difference of the sets A and B is commonly denoted by A\,\Delta\,B\,orA \ominus B....

of the two sets of edges, M ⊕ P, would form a matching with size n + 1. Thus, by finding augmenting paths, an algorithm may increase the size of the matching.

Conversely, suppose that a matching M is not optimal, and let P be the symmetric difference M ⊕ M* where M* is an optimal matching. Then P must form a collection of disjoint cycles and augmenting paths; the difference in size between M and M* is the number of paths in P. Thus, if no augmenting path can be found, an algorithm may safely terminate, since in this case M must be optimal.

An augmenting path in a matching problem is closely related to the augmenting paths arising in maximum flow problem

Maximum flow problem

In optimization theory, the maximum flow problem is to find a feasible flow through a single-source, single-sink flow network that is maximum....

s, paths along which one may increase the amount of flow between the terminals of the flow. It is possible to transform the bipartite matching problem into a maximum flow instance, such that the alternating paths of the matching problem become augmenting paths of the flow problem. In fact, a generalization of the technique used in Hopcroft-Karp algorithm to arbitrary flow networks is known as Dinic's algorithm

Dinic's algorithm

Dinic's algorithm is a strongly polynomial algorithm for computing the maximum flow in a flow network, conceived in 1970 by Israeli computer scientist Yefim Dinitz. The algorithm runs in O time and is similar to the Edmonds–Karp algorithm, which runs in O time, in that it uses shortest augmenting...

Algorithm

Let U and V be the two sets in the bipartition of G, and let the matching from U to V at any time be represented as the set M.

The algorithm is run in phases. Each phase consists of the following steps.

A breadth-first search
Breadth-first search
In graph theory, breadth-first search is a graph search algorithm that begins at the root node and explores all the neighboring nodes...

partitions the vertices of the graph into layers. The free vertices in U are used as the starting vertices of this search, and form the first layer of the partition. At the first level of the search, only unmatched edges may be traversed (since the free vertices in U are by definition not adjacent to any matched edges); at subsequent levels of the search, the traversed edges are required to alternate between unmatched and matched. That is, when searching for successors from a vertex in U, only unmatched edges may be traversed, while from a vertex in V only matched edges may be traversed. The search terminates at the first layer k where one or more free vertices in V are reached.
All free vertices in V at layer k are collected into a set F. That is, a vertex v is put into F if and only if it ends a shortest augmenting path.
The algorithm finds a maximal set of vertex disjoint augmenting paths of length k. This set may be computed by depth first search from F to the free vertices in U, using the breadth first layering to guide the search: the depth first search is only allowed to follow edges that lead to an unused vertex in the previous layer, and paths in the depth first search tree must alternate between unmatched and matched edges. Once an augmenting path is found that involves one of the vertices in F, the depth first search is continued from the next starting vertex.
Every one of the paths found in this way is used to enlarge M.

The algorithm terminates when no more augmenting paths are found in the breadth first search part of one of the phases.

Input: Bipartite graph

Output: Matching

repeat

maximal set of vertex-disjoint shortest augmenting paths

until

Analysis

Each phase consists of a single breadth first search and a single depth first search. Thus, a single phase may be implemented in linear time.
Therefore, the first √n phases, in a graph with n vertices and m edges, take time O(m √n).

It can be shown that each phase increases the length of the shortest augmenting path by at least one: the phase finds a maximal set of augmenting paths of the given length, so any remaining augmenting path must be longer. Therefore, once the initial √n phases of the algorithm are complete, the shortest remaining augmenting path has at least √n edges in it. However, the symmetric difference

Symmetric difference

of the eventual optimal matching and of the partial matching M found by the initial phases forms a collection of vertex-disjoint augmenting paths and alternating cycles. If each of the paths in this collection has length at least √n, there can be at most √n paths in the collection, and the size of the optimal matching can differ from the size of M by at most √n edges. Since each phase of the algorithm increases the size of the matching by at least one, there can be at most √n additional phases before the algorithm terminates.

Since the algorithm performs a total of at most 2√n phases, it takes a total time of O(m √n) in the worst case.

In many instances, however, the time taken by the algorithm may be even faster than this worst case analysis indicates. For instance, in the average case for sparse bipartite random graph

Random graph

s, (improving a previous result of ) showed that with high probability all non-optimal matchings have augmenting paths of logarithm

Logarithm

The logarithm of a number is the exponent by which another fixed value, the base, has to be raised to produce that number. For example, the logarithm of 1000 to base 10 is 3, because 1000 is 10 to the power 3: More generally, if x = by, then y is the logarithm of x to base b, and is written...

ic length. As a consequence, for these graphs, the Hopcroft–Karp algorithm takes O(log n) phases and O(m log n) total time.

Comparison with other bipartite matching algorithms

For sparse graphs, the Hopcroft–Karp algorithm continues to have the best known worst-case performance, but for dense graphs a more recent algorithm by achieves a slightly better time bound, O(n^1.5(m/log n)^0.5). Their algorithm is based on using a push-relabel maximum flow algorithm and then, when the matching created by this algorithm becomes close to optimum, switching to the Hopcroft–Karp method.

Several authors have performed experimental comparisons of bipartite matching algorithms. Their results in general tend to show that the Hopcroft–Karp method is not as good in practice as it is in theory: it is outperformed both by simpler breadth-first and depth-first strategies for finding augmenting paths, and by push-relabel techniques.

Non-bipartite graphs

The same idea of finding a maximal set of shortest augmenting paths works also for finding maximum cardinality matchings in non-bipartite graphs, and for the same reasons the algorithms based on this idea take O(√n) phases. However, for non-bipartite graphs, the task of finding the augmenting paths within each phase is more difficult. Building on the work of several slower predecessors, showed how to implement a phase in linear time, resulting in a non-bipartite matching algorithm with the same time bound as the Hopcroft–Karp algorithm for bipartite graphs. The Micali–Vazirani technique is complex, and its authors did not provide full proofs of their results; alternative methods for this problem were later described by other authors.

Pseudocode

1 /*
2 G = G₁

G₂

{NIL}
3 where G₁ and G₂ are partition of graph and NIL is a special null vertex
4 */
5
6 function BFS
7 for v in G₁
8 if Pair[v]

NIL
9 Dist[v] = 0
10 Enqueue(Q,v)
11 else
12 Dist[v] =
13 Dist[NIL] =
14 while Empty(Q)

false
15 v = Dequeue(Q)
16 for each u in Adj[v]
17 if Dist[ Pair[u] ]

18 Dist[ Pair[u] ] = Dist[v] + 1
29 Enqueue(Q,Pair[u])
20 return Dist[NIL] !=

21
22 function DFS (v)
23 if v != NIL
24 for each u in Adj[v]
25 if Dist[ Pair[u] ] Dist[v] + 1
26 if DFS(Pair[u]) true
27 Pair[u] = v
28 Pair[v] = u
39 return true
30 Dist[v] =

31 return false
32 return true
33
34 function Hopcroft-Karp
35 for each v in G
36 Pair[v] = NIL
37 matching = 0
38 while BFS true
49 for each v in G₁
40 if Pair[v]

NIL
41 if DFS(v)

true
42 matching = matching + 1
44 return matching

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.

Augmenting paths

Algorithm

Analysis

Comparison with other bipartite matching algorithms

Non-bipartite graphs

Pseudocode

NIL 9 Dist[v] = 0 10 Enqueue(Q,v) 11 else 12 Dist[v] = 13 Dist[NIL] = 14 while Empty(Q)

NIL 41 if DFS(v)

NIL
9 Dist[v] = 0
10 Enqueue(Q,v)
11 else
12 Dist[v] =
13 Dist[NIL] =
14 while Empty(Q)

NIL
41 if DFS(v)