All Topics  
Directed acyclic graph

 

   Email Print
   Bookmark   Link






 

Directed acyclic graph



 
 
In computer science
Computer science

Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems....
 and mathematics
Mathematics

Mathematics is the study of quantity, structure, space, change, and related topics of pattern and form. Mathematicians seek out patterns whether found in numbers, space, natural science, computers, imaginary abstractions, or elsewhere....
, a directed acyclic graph, also called a DAG, is a with no ; that is, for any vertex
Vertex (graph theory)

In graph theory, a vertex or node is the fundamental unit out of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges , while a directed graph consists of a set of vertices and a set of arcs ....
 v, there is no nonempty directed path that starts and ends on v. Informally speaking, a DAG "flows" in a single direction.

Each directed acyclic graph gives rise to a partial order = on its vertices, where u = v exactly when there exists a directed path from u to v in the DAG.






Discussion
Ask a question about 'Directed acyclic graph'
Start a new discussion about 'Directed acyclic graph'
Answer questions from other users
Full Discussion Forum



Recent Posts









Encyclopedia


Directed Acyclic Graph
In computer science
Computer science

Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems....
 and mathematics
Mathematics

Mathematics is the study of quantity, structure, space, change, and related topics of pattern and form. Mathematicians seek out patterns whether found in numbers, space, natural science, computers, imaginary abstractions, or elsewhere....
, a directed acyclic graph, also called a DAG, is a with no ; that is, for any vertex
Vertex (graph theory)

In graph theory, a vertex or node is the fundamental unit out of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges , while a directed graph consists of a set of vertices and a set of arcs ....
 v, there is no nonempty directed path that starts and ends on v. Informally speaking, a DAG "flows" in a single direction.

Each directed acyclic graph gives rise to a partial order = on its vertices, where u = v exactly when there exists a directed path from u to v in the DAG. However, many different DAGs may give rise to this same reachability
Reachability

In graph theory, reachability is the notion of being able to get from one Vertex in a directed graph to some other vertex. Note that reachability in undirected graphs is trivial--it is sufficient to find the connected components in the graph, which can be done in linear time....
 relation. Among all such DAGs, the one with the fewest edges is the transitive reduction
Transitive reduction

In mathematics, a transitive reduction of a binary relation R on a Set X is a minimal relation on X such that the transitive closure of is the same as the transitive closure of R....
 of each of them and the one with the most is their transitive closure
Transitive closure

In mathematics, the transitive closure of a binary relation R on a Set X is the smallest transitive relation on X that contains R....
. In particular, the transitive closure is the reachability order =.

Terminology


A source is a vertex with no incoming edges, while a sink is a vertex with no outgoing edges. A finite DAG must have at least one source and at least one sink.

The depth of a vertex in a finite DAG is the length of the longest path from a source to that vertex, while its height is the length of the longest path from that vertex to a sink.

The length of a finite DAG is the length (number of edges) of a longest directed path. It is equal to the maximum height of all sources and equal to the maximum depth of all sinks.

Properties


Every directed acyclic graph has a topological sort, an ordering of the vertices such that each vertex comes before all vertices it has edges to. In general, this ordering is not unique. Any two graphs representing the same partial order have the same set of topological sort orders.

DAGs can be considered to be a generalization of tree
Tree (graph theory)

In mathematics, more specifically graph theory, a tree is a graph in which any two Vertex are connected by exactly one path . Alternatively, any connectedness graph with no Cycle is a tree....
s in which certain subtrees can be shared by different parts of the tree. In a tree with many identical subtrees, this can lead to a drastic decrease in space requirements to store the structure. Conversely, a DAG can be expanded to a forest of rooted trees using this simple algorithm:
  • While there is a vertex v with in-degree n > 1,
    • Make n copies of v, each with the same outgoing edges but no incoming edges.
    • Attach one of the incoming edges of v to each vertex.
    • Delete v.
If we explore the graph without modifying it or comparing nodes for equality, this forest will appear identical to the original DAG.

Some algorithms become simpler when used on DAGs instead of general graphs. For example, search algorithms like depth-first search
Depth-first search

Depth-first search is an algorithm for traversing or searching a tree data structure, tree structure, or graph . One starts at the root and explores as far as possible along each branch before backtracking....
 without iterative deepening normally must mark vertices they have already visited and not visit them again. If they fail to do this, they may never terminate because they follow a cycle of edges forever. Such cycles do not exist in DAGs (marking is still a good idea as it reduces the worst-case performance from exponential (due to multiple paths) to linear).

A polytree
Polytree

In graph theory, a polytree is a Graph with at most one undirected path between any two vertices. In other words, a polytree is a directed acyclic graph for which there are no undirected cycles either....
 is a specifically efficient kind of DAG, with many tree-like properties. Its efficiency is exploited, for example, in the belief propagation
Belief propagation

Belief propagation, also known as the sum-product algorithm, is an iterative algorithm for computing Marginal distributions of functions on a graphical model most commonly used in artificial intelligence and information theory....
 algorithm for Bayesian network
Bayesian network

A Bayesian network is a probabilistic graphical model that represents a set of variables and their probabilistic independencies. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms....
s.

The number of non-isomorphic DAGs is obtained by Weisstein's conjecture: the number of labeled DAGs on n vertices is equal to the number of nxn matrices with entries from (i.e. adjascency matrices) and with only positive real eigenvalues, proved by McKay et al. .

Any directed graph may be made into a DAG by removing a feedback arc set
Feedback arc set

In graph theory, a directed graph may contain directed cycles, a one-way loop of edges. In some applications, such cycles are undesirable, and we wish to eliminate them and obtain a directed acyclic graph ....
.

Applications


Directed acyclic graphs have many important applications in computer science
Computer science

Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems....
, including:
  • Parse tree
    Parse tree

    A parse tree or concrete syntax tree is an tree that represents the syntax structure of a string according to some formal grammar. In a parse tree, the interior nodes are labeled by nonterminals of the grammar, while the leaf nodes are labeled by terminal symbol of the grammar....
    s constructed by compilers
  • Bayesian network
    Bayesian network

    A Bayesian network is a probabilistic graphical model that represents a set of variables and their probabilistic independencies. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms....
    s
  • Reference graphs that can be garbage collected
    Garbage collection (computer science)

    In computer science, garbage collection is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage , or memory used by Object that will never be accessed or mutated again by the Application software....
     using simple reference counting
    Reference counting

    In computer science, reference counting is a technique of storing the number of references, pointers, or handles to a resource such as an object or block of memory....
  • Feedforward neural networks, and other feed-forward controller or classifier topologies
  • Reference graphs of purely functional
    Purely functional

    Purely functional is a term in computing used to describe algorithms, data structures or programming languages that exclude destructive modifications ....
     data structure
    Data structure

    A data structure in computer science is a way of storing data in a computer so that it can be used efficiently. It is an organization of mathematical and logical concepts of data....
    s (although some languages allow purely functional cyclic structures)
  • Dependency graphs such as those used in instruction scheduling
    Instruction scheduling

    In computer science, instruction scheduling is a compiler optimization used to improve instruction-level parallelism, which improves performance on machines with instruction pipelines....
     and makefiles
  • Dependency graphs between class
    Class (computer science)

    In object-oriented programming, a class is a programming language construct that is used as a blueprint to create Object s. This blueprint includes Attribute s and Method s that the created objects all share....
    es formed by inheritance
    Inheritance (computer science)

    In object-oriented programming, inheritance is a way to form new class es using classes that have already been defined. The inheritance concept was invented in 1967 for Simula....
     relationships in object-oriented programming
    Object-oriented programming

    Object-oriented programming is a programming paradigm that uses "Object_" and their interactions to design applications and computer programs....
     language
    Programming language

    A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer....
    s
  • Serializability
    Serializability

    In databases and transaction processing, a Schedule is serializable, has the Serializability property, if its outcome is equal to the outcome of its transactions executed serially, i.e., sequentially without overlapping in time....
     Theory of Transaction Processing System
    Transaction Processing System

    A Transaction Processing System or Transaction Processing Monitor is a system that monitors transaction programs . The essence of a transaction program is that it manages data that must be left in a consistent state....
    s
  • Information categorization systems, such as file system directories
    Directory (file systems)

    In computing, a directory, folder, catalog, or drawer is a virtual container within a digital file system, in which groups of files and other directories can be kept and organized....
  • Hierarchical scene graph
    Scene graph

    A scene graph is a general data structure commonly used by vector graphics applications and modern computer games. Examples of such programs include AutoCAD, Adobe Illustrator, Acrobat 3D, OpenSceneGraph and CorelDRAW....
    s to optimise view frustum
    Viewing frustum

    In 3D computer graphics, the viewing frustum or view frustum is the region of space in the modeled world that may appear on the screen; it is the field of view of the notional camera....
     culling operations
  • Forward chained rules systems (including business rules engine
    Business rules engine

    A business rules engine is a software system that executes one or more business rules in a runtime production environment. The rules might come from legal regulation , company policy , or other sources....
    s) such as the Rete algorithm
    Rete algorithm

    The Rete algorithm is an efficient pattern matching algorithm for implementing production rule systems. The Rete algorithm was designed by Dr Charles Forgy of Carnegie Mellon University, first published in a working paper in 1974, and later elaborated in his 1979 Ph.D....
    , used by the rule engine Drools
    Drools

    Drools is a BRMS with a forward chaining inference based rules engine, more correctly known as a production rule system, using an enhanced implementation of the Rete algorithm....
    .
  • Representing spacetime as a causal set in theoretical physics
    Theoretical physics

    Theoretical physics employs mathematical models and abstractions of physics in an attempt to explain experimental data taken of the natural world....
  • In bioinformatics
    Bioinformatics

    Bioinformatics is the application of information technology to the field of molecular biology. The term bioinformatics was coined by Paulien Hogeweg in 1978 for the study of informatic processes in biotic systems....
    , finding areas of synteny
    Synteny

    In classical genetics, synteny describes the physical co-localization of Locus on the same chromosome within an individual or species. The concept is related to genetic linkage: Linkage between two loci is established by the observation of lower-than-expected recombination frequencies between two or more loci....
     between two genome
    Genome

    In classical genetics, the genome of a diploid organism including eukarya refers to a full set of chromosomes or genes in a gamete; thereby, a regular somatic cell contains two full sets of genomes....
    s, or representing evolution as phylogenetic network
    Phylogenetic network

    A phylogenetic network is any Graph used to visualize evolutionary relationships between species or organisms. It is employed when reticulate events such as Hybrid , horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved....
    s
  • Abstract process descriptions such as workflows
    Workflow

    A workflow is a depiction of a sequence of operations, declared as work of a person, work of a simple or complex mechanism, work of a group of persons, work of an organization of staff, or machines....
     and some models of provenance
    Provenance

    Provenance, from the French provenir, "to come from", means the origin, or the wiktionary:Source, of something, or the history of the ownership or location of an object, The term was originally mostly used of works of art, but is now used in similar senses in a wide range of fields, including science and computing....
  • A pattern language
    Pattern language

    A pattern language is a structured method of describing good design practices within a field of expertise. It is characterized by # Noticing and naming the common problems in a field of interest,...
     as a DAG of patterns
  • Dynamic programming
    Dynamic programming

    In mathematics and computer science, dynamic programming is a method of solving problems that exhibit the properties of overlapping subproblems and optimal substructure ....
  • Optical character recognition
    Optical character recognition

    Optical character recognition, usually abbreviated to OCR, is the mechanical or Electronics translation of s of handwritten, typewritten or printed text into machine-editable text....
  • Representing Feature structure
    Feature structure

    In phrase structure grammars, such as generalised phrase structure grammar, head-driven phrase structure grammar and lexical functional grammar, a feature structure is essentially a set of attribute-value pairs....
    s in phrase-structure linguistic formalisms


Directed acyclic word graph

A directed acyclic word graph (DAWG) is a data structure
Data structure

A data structure in computer science is a way of storing data in a computer so that it can be used efficiently. It is an organization of mathematical and logical concepts of data....
 in computer science similar to a trie
Trie

In computer science, a trie, or prefix tree, is an ordered tree data structure data structure that is used to store an associative array where the keys are usually string s....
 but much more space efficient. It is used to represent a set of strings
String (computer science)

In computer programming and some branches of mathematics, a string is an ordered sequence of symbols. These symbols are chosen from a predetermined set or alphabet....
 and supports a constant time search operation. The lookup time is proportional to the length of the search string and is the same as an equivalent trie.

A DAWG is defined as a trie where isomorphic subtrees are identified, thus producing a directed acyclic graph instead of a tree structure
Tree structure

A tree structure is a way of representing the hierarchy nature of a structure in a graphical form.It is named a "tree structure" because the graph looks a bit like a tree, even though the tree is generally shown upside down compared with a real tree; that is to say with the root at the top and the leaves at the bottom....
. Each node in the graph represents a unique substring
Substring

A subsequence, substring, prefix or suffix of a String is a subset of the symbols in a string, where the order of the elements is preserved....
. Each outgoing edge from one node to the next is labeled with a letter and represents appending that letter to the substring represented by the first node to get the substring represented by the second node. There is one node having zero incoming edges; it represents the empty string. Similarly, the nodes representing entire strings that are not a substring of any other string (this can always be guaranteed by appending an otherwise unused character such as $ to the end of every string) have zero outgoing edges.

The primary difference between DAWG and trie is the elimination of suffix redundancy in storing strings. The trie eliminates prefix redundancy since all common prefixes are shared between strings, such as between doctors and doctorate the doctor prefix is shared. In a DAWG common suffixes are also shared, such as between desertion and desertification both the prefix deserti- and suffix -tion are shared. For dictionary sets of common English words, this translates into major memory usage reduction.

Acyclic deterministic finite automata (ADFA) are deterministic finite automata without cycles. In other words, they can only represent finite sets of strings. They can be used as a data structure for word storage with extremely fast search performance. Minimized ADFA can be very compact as well. The size of a minimized ADFA does not directly depend on the number of keys stored. In fact, after a certain point, as more words are stored in a minimized ADFA, its size can begin to decrease. Its size would actually appear to be related to how complex the set of strings is. A trie
Trie

In computer science, a trie, or prefix tree, is an ordered tree data structure data structure that is used to store an associative array where the keys are usually string s....
 is a type of ADFA.

External links

  • by Seth J. Chandler with Matthew Szudzik and Jesse Nochella, The Wolfram Demonstrations Project.
  • By JohnPaul Adamovsky (Aerospace Engineer)


de:Gerichteter azyklischer Graph fr:Graphe acyclique orienté hr:Aciklicki deterministicki konacni automat hu:Irányított körmentes gráf pl:Skierowany graf acykliczny pt:Grafos acíclicos dirigidos ru:DAWG