Directed acyclic graph

# Directed acyclic graph

Overview
In mathematics
Mathematics
Mathematics is the study of quantity, space, structure, and change. Mathematicians seek out patterns and formulate new conjectures. Mathematicians resolve the truth or falsity of conjectures by mathematical proofs, which are arguments sufficient to convince other mathematicians of their validity...

and computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, a directed acyclic graph (DAG ), is a directed graph
Directed graph
A directed graph or digraph is a pair G= of:* a set V, whose elements are called vertices or nodes,...

with no directed cycles. That is, it is formed by a collection of vertices
Vertex (graph theory)
In graph theory, a vertex or node is the fundamental unit out of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges , while a directed graph consists of a set of vertices and a set of arcs...

and directed edges, each edge connecting one vertex to another, such that there is no way to start at some vertex v and follow a sequence of edges that eventually loops back to v again.
Discussion
 Ask a question about 'Directed acyclic graph' Start a new discussion about 'Directed acyclic graph' Answer questions from other users Full Discussion Forum

Encyclopedia
In mathematics
Mathematics
Mathematics is the study of quantity, space, structure, and change. Mathematicians seek out patterns and formulate new conjectures. Mathematicians resolve the truth or falsity of conjectures by mathematical proofs, which are arguments sufficient to convince other mathematicians of their validity...

and computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, a directed acyclic graph (DAG ), is a directed graph
Directed graph
A directed graph or digraph is a pair G= of:* a set V, whose elements are called vertices or nodes,...

with no directed cycles. That is, it is formed by a collection of vertices
Vertex (graph theory)
In graph theory, a vertex or node is the fundamental unit out of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges , while a directed graph consists of a set of vertices and a set of arcs...

and directed edges, each edge connecting one vertex to another, such that there is no way to start at some vertex v and follow a sequence of edges that eventually loops back to v again.

DAGs may be used to model several different kinds of structure in mathematics and computer science. A collection of tasks that must be ordered into a sequence, subject to constraints that certain tasks must be performed earlier than others, may be represented as a DAG with a vertex for each task and an edge for each constraint; algorithms for topological ordering may be used to generate a valid sequence. DAGs may also be used to model processes in which information flows in a consistent direction through a network of processors. The reachability
Reachability
In graph theory, reachability is the notion of being able to get from one vertex in a directed graph to some other vertex. Note that reachability in undirected graphs is trivial — it is sufficient to find the connected components in the graph, which can be done in linear time.- Definition :For a...

relation in a DAG forms a partial order, and any finite partial order may be represented by a DAG using reachability. Additionally, DAGs may be used as a space-efficient representation of a collection of sequences with overlapping subsequences.

The corresponding concept for undirected graphs is a forest, an undirected graph without cycles. Choosing an orientation for a forest produces a special kind of directed acyclic graph called a polytree
Polytree
In graph theory, a polytree is a directed graph with at most one undirected path between any two vertices. In other words, a polytree is a directed acyclic graph for which there are no undirected cycles either...

. However there are many other kinds of directed acyclic graph that are not formed by orienting the edges of an undirected acyclic graph. For this reason it may be more accurate to call directed acyclic graphs acyclic directed graphs or acyclic digraphs.

## Partial orders and topological ordering

Each directed acyclic graph gives rise to a partial order ≤ on its vertices, where uv exactly when there exists a directed path from u to v in the DAG. However, many different DAGs may give rise to this same reachability
Reachability
In graph theory, reachability is the notion of being able to get from one vertex in a directed graph to some other vertex. Note that reachability in undirected graphs is trivial — it is sufficient to find the connected components in the graph, which can be done in linear time.- Definition :For a...

relation: for example, the DAG with two edges ab and bc has the same reachability as the graph with three edges ab, bc, and ac. If G is a DAG, its transitive reduction
Transitive reduction
In mathematics, a transitive reduction of a binary relation R on a set X is a minimal relation R' on X such that the transitive closure of R' is the same as the transitive closure of R. If the transitive closure of R is antisymmetric and finite, then R' is unique...

is the graph with the fewest edges that represents the same reachability as G, and its transitive closure
Transitive closure
In mathematics, the transitive closure of a binary relation R on a set X is the transitive relation R+ on set X such that R+ contains R and R+ is minimal . If the binary relation itself is transitive, then the transitive closure will be that same binary relation; otherwise, the transitive closure...

is the graph with the most edges that represents the same reachability.

The transitive closure of G has an edge uv for every related pair u ≤ v of distinct elements in the reachability relation of G, and may therefore be thought of as a direct translation of the reachability relation ≤ into graph-theoretic terms: every partially ordered set may be translated into a DAG in this way. If a DAG G represents a partial order ≤, then the transitive reduction of G is a subgraph of G with an edge uv for every pair in the covering relation
Covering relation
In mathematics, especially order theory, the covering relation of a partially ordered set is the binary relation which holds between comparable elements that are immediate neighbours...

of ≤; transitive reductions are useful in visualizing the partial orders they represent, because they have fewer edges than other graphs representing the same orders and therefore lead to simpler graph drawing
Graph drawing
Graph drawing is an area of mathematics and computer science combining methods from geometric graph theory and information visualization to derive two-dimensional depictions of graphs arising from applications such as social network analysis, cartography, and bioinformatics...

s. A Hasse diagram
Hasse diagram
In order theory, a branch of mathematics, a Hasse diagram is a type of mathematical diagram used to represent a finite partially ordered set, in the form of a drawing of its transitive reduction...

of a partial order is a drawing of the transitive reduction in which the orientation of each edge is shown by placing the starting vertex of the edge in a lower position than its ending vertex.

Every directed acyclic graph has a topological ordering, an ordering of the vertices such that the starting endpoint of every edge occurs earlier in the ordering than the ending endpoint of the edge. In general, this ordering is not unique; a DAG has a unique topological ordering if and only if it has a directed path containing all the vertices, in which case the ordering is the same as the order in which the vertices appear in the path. The family of topological orderings of a DAG is the same as the family of linear extension
Linear extension
In order theory, a branch of mathematics, a linear extension of a partial order is a linear order that is compatible with the partial order.-Definitions:...

s of the reachability relation for the DAG, so any two graphs representing the same partial order have the same set of topological orders. Topological sorting
Topological sorting
In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that, for every edge uv, u comes before v in the ordering...

is the algorithmic problem of finding topological orderings; it can be solved in linear time. It is also possible to check whether a given directed graph is a DAG in linear time, by attempting to find a topological ordering and then testing whether the resulting ordering is valid.

Some algorithms become simpler when used on DAGs instead of general graphs, based on the principle of topological ordering. For example, it is possible to find shortest paths and longest paths
Longest path problem
In the mathematical discipline of graph theory, the longest path problem is the problem of finding a simple path of maximum length in a given graph. A path is called simple if it does not have any repeated vertices...

from a given starting vertex in DAGs in linear time by processing the vertices in a topological order, and calculating the path length for each vertex to be the minimum or maximum length obtained via any of its incoming edges.
In contrast, for arbitrary graphs the shortest path may require slower algorithms such as Dijkstra's algorithm
Dijkstra's algorithm
Dijkstra's algorithm, conceived by Dutch computer scientist Edsger Dijkstra in 1956 and published in 1959, is a graph search algorithm that solves the single-source shortest path problem for a graph with nonnegative edge path costs, producing a shortest path tree...

or the Bellman-Ford algorithm
Bellman-Ford algorithm
The Bellman–Ford algorithm computes single-source shortest paths in a weighted digraph.For graphs with only non-negative edge weights, the faster Dijkstra's algorithm also solves the problem....

, and longest paths in arbitrary graphs are NP-hard
NP-hard
NP-hard , in computational complexity theory, is a class of problems that are, informally, "at least as hard as the hardest problems in NP". A problem H is NP-hard if and only if there is an NP-complete problem L that is polynomial time Turing-reducible to H...

to find.

DAG representations of partial orderings have many applications in scheduling problems
Job Shop Scheduling
Job shop scheduling is an optimization problem in computer science in which ideal jobs are assigned to resources at particular times. The most basic version is as follows:...

for systems of tasks with ordering constraints. For instance, a DAG may be used to describe the dependencies between cells of a spreadsheet
A spreadsheet is a computer application that simulates a paper accounting worksheet. It displays multiple cells usually in a two-dimensional matrix or grid consisting of rows and columns. Each cell contains alphanumeric text, numeric values or formulas...

: if one cell is computed by a formula involving the value of a second cell, draw a DAG edge from the second cell to the first one. If the input values to the spreadsheet change, all of the remaining values of the spreadsheet may be recomputed with a single evaluation per cell, by topologically ordering the cells and re-evaluating each cell in this order. Similar problems of task ordering arise in makefiles for program compilation, instruction scheduling
Instruction scheduling
In computer science, instruction scheduling is a compiler optimization used to improve instruction-level parallelism, which improves performance on machines with instruction pipelines...

for low-level computer program optimization, and PERT scheduling
Program Evaluation and Review Technique
The Program ' Evaluation and Review Technique, commonly abbreviated PERT, is a statistical tool, used in project management, that is designed to analyze and represent the tasks involved in completing a given project...

for management of large human projects. Dependency graph
Dependency graph
In mathematics, computer science and digital electronics, a dependency graph is a directed graph representing dependencies of several objects towards each other...

s without circular dependencies form directed acyclic graphs.

## Data processing networks

A directed graph may be used to represent a network of processing elements; in this formulation, data enters a processing element through its incoming edges and leaves the element through its outgoing edges. Examples of this include the following:
• In electronic circuit design, a combinational logic
Combinational logic
In digital circuit theory, combinational logic is a type of digital logic which is implemented by boolean circuits, where the output is a pure function of the present input only. This is in contrast to sequential logic, in which the output depends not only on the present input but also on the...

circuit is an acyclic system of logic gate
Logic gate
A logic gate is an idealized or physical device implementing a Boolean function, that is, it performs a logical operation on one or more logic inputs and produces a single logic output. Depending on the context, the term may refer to an ideal logic gate, one that has for instance zero rise time and...

s that computes a function of an input, where the input and output of the function are represented as individual bit
Bit
A bit is the basic unit of information in computing and telecommunications; it is the amount of information stored by a digital device or other physical system that exists in one of two possible distinct states...

s.
• A Bayesian network
Bayesian network
A Bayesian network, Bayes network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph . For example, a Bayesian network could represent the probabilistic...

represents a system of probabilistic events as nodes in a directed acyclic graph. The likelihood of an event may be calculated from the likelihoods of its predecessors in the DAG. In this context, the moral graph
Moral graph
A moral graph is a concept in graph theory, used to find the equivalent undirected form of a directed acyclic graph. It is a key step of the junction tree algorithm, used in belief propagation on graphical models....

of a DAG is the undirected graph created by adding an (undirected) edge between all parents of the same node (sometimes called marrying), and then replacing all directed edges by undirected edges.
• Dataflow
Dataflow
Dataflow is a term used in computing, and may have various shades of meaning. It is closely related to message passing.-Software architecture:...

programming languages describe systems of values that are related to each other by a directed acyclic graph. When one value changes, its successors are recalculated; each value is evaluated as a function of its predecessors in the DAG.
• In compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

s, straight line code (that is, sequences of statements without loops or conditional branches) may be represented by a DAG describing the inputs and outputs of each of the arithmetic operations performed within the code; this representation allows the compiler to perform common subexpression elimination
Common subexpression elimination
In computer science, common subexpression elimination is a compiler optimization that searches for instances of identical expressions , and analyses whether it is worthwhile replacing them with a single variable holding the computed value.- Example :In the following code: a = b * c + g; d = b * c...

efficiently.

## Paths with shared structure

A third type of application of directed acyclic graphs arises in representing a set of sequence
Sequence
In mathematics, a sequence is an ordered list of objects . Like a set, it contains members , and the number of terms is called the length of the sequence. Unlike a set, order matters, and exactly the same elements can appear multiple times at different positions in the sequence...

s as paths
Path (graph theory)
In graph theory, a path in a graph is a sequence of vertices such that from each of its vertices there is an edge to the next vertex in the sequence. A path may be infinite, but a finite path always has a first vertex, called its start vertex, and a last vertex, called its end vertex. Both of them...

in a graph. For example, the directed acyclic word graph
Directed acyclic word graph
In computer science, a directed acyclic word graph is a data structure that represents a set of strings, and allows for a query operation that tests whether a given string belongs to the set in time proportional to its length...

is a data structure
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...

in computer science formed by a directed acyclic graph with a single source and with edges labeled by letters or symbols; the paths from the source to the sinks in this graph represent a set of strings
String (computer science)
In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set or alphabet....

, such as English words.
Any set of sequences can be represented as paths in a tree, by forming a tree node for every prefix of a sequence and making the parent of one of these nodes represent the sequence with one fewer element; the tree formed in this way for a set of strings is called a trie
Trie
In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the...

. A directed acyclic word graph saves space over a trie by allowing paths to diverge and rejoin, so that a set of words with the same possible suffixes can be represented by a single tree node.

The same idea of using a DAG to represent a family of paths occurs in the binary decision diagram
Binary decision diagram
In the field of computer science, a binary decision diagram or branching program, like a negation normal form or a propositional directed acyclic graph , is a data structure that is used to represent a Boolean function. On a more abstract level, BDDs can be considered as a compressed...

, a DAG-based data structure for representing binary functions. In a binary decision diagram, each non-sink vertex is labeled by the name of a binary variable, and each sink and each edge is labeled by a 0 or 1. The function value for any truth assignment to the variables is the value at the sink found by following a path, starting from the single source vertex, that at each non-sink vertex follows the outgoing edge labeled with the value of that vertex's variable. Just as directed acyclic word graphs can be viewed as a compressed form of tries, binary decision diagrams can be viewed as compressed forms of decision tree
Decision tree
A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm. Decision trees are commonly used in operations research, specifically...

s that save space by allowing paths to rejoin when they agree on the results of all remaining decisions.

In many randomized
Randomization
Randomization is the process of making something random; this means:* Generating a random permutation of a sequence .* Selecting a random sample of a population ....

algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

s in computational geometry
Computational geometry
Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems are also considered to be part of computational...

, the algorithm maintains a history DAG representing features of some geometric construction that have been replaced by later finer-scale features; point location
Point location
The point location problem is a fundamental topic of computational geometry. It finds applications in areas that deal with processing geometrical data: computer graphics, geographic information systems , motion planning, and computer aided design ....

queries may be answered, as for the above two data structures, by following paths in this DAG.

## Relation to other kinds of graphs

A polytree
Polytree
In graph theory, a polytree is a directed graph with at most one undirected path between any two vertices. In other words, a polytree is a directed acyclic graph for which there are no undirected cycles either...

is a directed graph formed by orienting the edges of a free tree
Tree (graph theory)
In mathematics, more specifically graph theory, a tree is an undirected graph in which any two vertices are connected by exactly one simple path. In other words, any connected graph without cycles is a tree...

. Every polytree is a DAG. In particular, this is true of the arborescence
Arborescence (graph theory)
In graph theory, an arborescence is a directed graph in which, for a vertex u called the root and any other vertex v, there is exactly one directed path from u to v....

s formed by directing all edges outwards from the root of a tree. A multitree
Multitree
In combinatorics and order-theoretic mathematics, a multitree may describe either of two equivalent structures: a directed acyclic graph in which the set of nodes reachable from any node form a tree, or a partially ordered set that does not have four items a, b, c, and d forming a diamond suborder...

(also called a strongly ambiguous graph or a mangrove) is a directed graph in which there is at most one directed path (in either direction) between any two nodes; equivalently, it is a DAG in which, for every node v, the set of nodes reachable from v forms a tree.

Any undirected graph may be made into a DAG by choosing a total order
Total order
In set theory, a total order, linear order, simple order, or ordering is a binary relation on some set X. The relation is transitive, antisymmetric, and total...

for its vertices and orienting every edge from the earlier endpoint in the order to the later endpoint. However, different total orders may lead to the same acyclic orientation. The number of acyclic orientations is equal to |χ(-1)|, where χ is the chromatic polynomial
Chromatic polynomial
The chromatic polynomial is a polynomial studied in algebraic graph theory, a branch of mathematics. It counts the number of graph colorings as a function of the number of colors and was originally defined by George David Birkhoff to attack the four color problem. It was generalised to the Tutte...

of the given graph.

Any directed graph may be made into a DAG by removing a feedback vertex set
Feedback vertex set
In the mathematical discipline of graph theory, a feedback vertex set of a graph is a set of vertices whose removal leaves a graph without cycles. In other words, each feedback vertex set contains at least one vertex of any cycle in the graph....

or a feedback arc set
Feedback arc set
In graph theory, a directed graph may contain directed cycles, a one-way loop of edges. In some applications, such cycles are undesirable, and we wish to eliminate them and obtain a directed acyclic graph . One way to do this is simply to drop edges from the graph to break the cycles...

. However, the smallest such set is NP-hard
NP-hard
NP-hard , in computational complexity theory, is a class of problems that are, informally, "at least as hard as the hardest problems in NP". A problem H is NP-hard if and only if there is an NP-complete problem L that is polynomial time Turing-reducible to H...

to find. An arbitrary directed graph may also be transformed into a DAG, called its condensation, by contracting each of its strongly connected component
Strongly connected component
A directed graph is called strongly connected if there is a path from each vertex in the graph to every other vertex. In particular, this means paths in each direction; a path from a to b and also a path from b to a....

s into a single supervertex. When the graph is already acyclic, its smallest feedback vertex sets and feedback arc sets are empty
Empty set
In mathematics, and more specifically set theory, the empty set is the unique set having no elements; its size or cardinality is zero. Some axiomatic set theories assure that the empty set exists by including an axiom of empty set; in other theories, its existence can be deduced...

, and its condensation is the graph itself.

## Enumeration

The graph enumeration
Graph enumeration
In combinatorics, an area of mathematics, graph enumeration describes a class of combinatorial enumeration problems in which one must count undirected or directed graphs of certain types, typically as a function of the number of vertices of the graph...

problem of counting directed acyclic graphs was studied by .
The number of DAGs on n labeled nodes, for n = 1, 2, 3, ..., is
1, 3, 25, 543, 29281, 3781503, ... .

These numbers may be computed by the recurrence relation
Recurrence relation
In mathematics, a recurrence relation is an equation that recursively defines a sequence, once one or more initial terms are given: each further term of the sequence is defined as a function of the preceding terms....

Eric W. Weisstein
Eric W. Weisstein
Eric W. Weisstein is an encyclopedist who created and maintains MathWorld and Eric Weisstein's World of Science . He currently works for Wolfram Research, Inc.-Education:...

conjectured, and proved, that the same numbers count the (0,1) matrices in which all eigenvalues are positive real number
Real number
In mathematics, a real number is a value that represents a quantity along a continuum, such as -5 , 4/3 , 8.6 , √2 and π...

s. The proof is bijective
Bijective proof
In combinatorics, bijective proof is a proof technique that finds a bijective function f : A → B between two sets A and B, thus proving that they have the same number of elements, |A| = |B|. One place the technique is useful is where we wish to know the size of A, but can find no direct way of...

: a matrix A is an adjacency matrix
In mathematics and computer science, an adjacency matrix is a means of representing which vertices of a graph are adjacent to which other vertices...

of a DAG if and only if the eigenvalues of the (0,1) matrix A + I are positive, where I denotes the identity matrix
Identity matrix
In linear algebra, the identity matrix or unit matrix of size n is the n×n square matrix with ones on the main diagonal and zeros elsewhere. It is denoted by In, or simply by I if the size is immaterial or can be trivially determined by the context...

.