== Applications ==
=== Scheduling ===
Directed acyclic graph representations of partial orderings have many applications in [[Schedule|scheduling]] for systems of tasks with ordering constraints.<ref>{{harvtxt|Skiena|2009}}, p. 469.</ref>
An important class of problems of this type concerns collections of objects that need to be updated, such as the cells of a [[spreadsheet]] after one of the cells has been changed, or the [[object file]]s of a piece of computer software after its [[source code]] has been changed.
In this context, a [[dependency graph]] is a graph that has a vertex for each object to be updated, and an edge connecting two objects whenever one of them needs to be updated earlier than the other. A cycle in this graph is called a [[circular dependency]], and is generally not allowed, because there would be no way to consistently schedule the tasks involved in the cycle.
Dependency graphs without circular dependencies form DAGs.<ref>{{citation | last1=Al-Mutawa | first1=H. A. | last2=Dietrich | first2=J. | last3=Marsland | first3=S. | last4=McCartin | first4=C. | contribution=On the shape of circular dependencies in Java programs | doi=10.1109/ASWEC.2014.15 | pages=48–57 | publisher=IEEE | title=23rd Australian Software Engineering Conference | year=2014| isbn=978-1-4799-3149-1 | s2cid=17570052 }}.</ref>

For instance, when one cell of a [[spreadsheet]] changes, it is necessary to recalculate the values of other cells that depend directly or indirectly on the changed cell. For this problem, the tasks to be scheduled are the recalculations of the values of individual cells of the spreadsheet. Dependencies arise when an expression in one cell uses a value from another cell. In such a case, the value that is used must be recalculated earlier than the expression that uses it. Topologically ordering the dependency graph, and using this topological order to schedule the cell updates, allows the whole spreadsheet to be updated with only a single evaluation per cell.<ref name="hgt1181">{{citation |title=Handbook of Graph Theory |first1=Jonathan L. |last1=Gross |first2=Jay |last2=Yellen |first3=Ping |last3=Zhang  | author3-link = Ping Zhang (graph theorist)|edition=2nd |publisher=CRC Press |year=2013 |isbn=978-1-4398-8018-0 |page=1181 |url=https://books.google.com/books?id=cntcAgAAQBAJ&pg=PA1181}}.</ref> Similar problems of task ordering arise in [[makefile]]s for program compilation<ref name="hgt1181" /> and [[instruction scheduling]] for low-level computer program optimization.<ref>{{citation |title=The Compiler Design Handbook: Optimizations and Machine Code Generation |first1=Y. N. |last1=Srikant |first2=Priti |last2=Shankar |edition=2nd |publisher=CRC Press|year=2007 |isbn=978-1-4200-4383-9 |pages=19–39 |url=https://books.google.com/books?id=1kqAv-uDEPEC&pg=SA19-PA39}}.</ref>
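
The recalculation order described above can be sketched in Python. This is a minimal illustration rather than the method of any particular spreadsheet program: the function name <code>recalculation_order</code>, the cell names, and the dictionary-based input format are assumptions made for the example, and the ordering is computed with Kahn's algorithm, one of several ways to obtain a topological order.

```python
from collections import deque


def recalculation_order(depends_on):
    """Return the cells in an order where every cell follows the cells it reads.

    depends_on maps each cell name to the list of cells its formula uses
    (a hypothetical input format chosen for this sketch).
    Raises ValueError if the dependencies contain a cycle.
    """
    # Collect every cell, including those that appear only as inputs.
    cells = set(depends_on)
    for inputs in depends_on.values():
        cells.update(inputs)

    # Edge input -> user: the input must be evaluated before the user.
    users = {cell: [] for cell in cells}
    unmet = {cell: 0 for cell in cells}
    for cell, inputs in depends_on.items():
        for inp in set(inputs):
            users[inp].append(cell)
            unmet[cell] += 1

    # Kahn's algorithm: repeatedly emit a cell with no unevaluated inputs.
    ready = deque(sorted(cell for cell in cells if unmet[cell] == 0))
    order = []
    while ready:
        cell = ready.popleft()
        order.append(cell)
        for user in users[cell]:
            unmet[user] -= 1
            if unmet[user] == 0:
                ready.append(user)

    # Any cell left over lies on, or depends on, a circular dependency.
    if len(order) != len(cells):
        raise ValueError("circular dependency detected")
    return order


# Hypothetical sheet: B1 uses A1, C1 uses A1 and B1, D1 uses C1.
sheet = {"A1": [], "B1": ["A1"], "C1": ["A1", "B1"], "D1": ["C1"]}
order = recalculation_order(sheet)
print(order)  # ['A1', 'B1', 'C1', 'D1']
```

Evaluating the cells in the returned order touches each cell once, and a circular dependency is reported as an error instead of producing an inconsistent schedule.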

[[File:Pert chart colored.svg|thumb|PERT chart for a project with five milestones (labeled 10–50) and six tasks (labeled A–F). There are two critical paths, ADF and BC.]]
A somewhat different DAG-based formulation of scheduling constraints is used by the [[program evaluation and review technique]] (PERT), a method for management of large human projects that was one of the first applications of DAGs. In this method, the vertices of a DAG represent [[Milestone (project management)|milestones]] of a project rather than specific tasks to be performed. Instead, a task or activity is represented by an edge of a DAG, connecting two milestones that mark the beginning and completion of the task. Each such edge is labeled with an estimate for the amount of time that it will take a team of workers to perform the task. The [[Longest path problem|longest path]] in this DAG represents the [[Critical path method|critical path]] of the project, the one that controls the total time for the project. Individual milestones can be scheduled according to the lengths of the longest paths ending at their vertices.<ref>{{citation |title=What Every Engineer Should Know About Decision Making Under Uncertainty |first=John X. |last=Wang |publisher=CRC Press |year=2002 |isbn=978-0-8247-4373-4 |page=160 |url=https://books.google.com/books?id=C3yKML0dUVIC&pg=PA160}}.</ref>
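
As a rough sketch of this milestone scheduling, the following Python code computes the length of the longest path ending at each milestone and recovers one critical path. The task list is invented for the example and does not reproduce the chart shown in the figure; the function name <code>schedule_milestones</code> and the tuple format are likewise assumptions, and task durations are assumed to be positive.

```python
from collections import defaultdict, deque


def schedule_milestones(tasks):
    """tasks is a list of (start_milestone, end_milestone, duration) edges.

    Returns (earliest, path): the longest-path length ending at each
    milestone, and one critical path given as a list of milestones.
    Durations are assumed to be positive.
    """
    outgoing = defaultdict(list)
    indegree = defaultdict(int)
    milestones = set()
    for start, end, duration in tasks:
        outgoing[start].append((end, duration))
        indegree[end] += 1
        milestones.update((start, end))

    earliest = {m: 0 for m in milestones}
    previous = {m: None for m in milestones}

    # Visit milestones in topological order so that every incoming task
    # has been accounted for before a milestone's time is used.
    ready = deque(sorted(m for m in milestones if indegree[m] == 0))
    processed = 0
    while ready:
        m = ready.popleft()
        processed += 1
        for nxt, duration in outgoing[m]:
            if earliest[m] + duration > earliest[nxt]:
                earliest[nxt] = earliest[m] + duration
                previous[nxt] = m
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if processed != len(milestones):
        raise ValueError("task graph contains a cycle")

    # Walk back from the latest milestone to recover one critical path.
    end = max(earliest, key=earliest.get)
    path = []
    while end is not None:
        path.append(end)
        end = previous[end]
    return earliest, path[::-1]


# Hypothetical project: five milestones (1-5) and five tasks.
tasks = [(1, 2, 3), (1, 3, 2), (2, 4, 4), (3, 4, 6), (4, 5, 1)]
earliest, path = schedule_milestones(tasks)
print(earliest[5], path)  # 9 [1, 3, 4, 5]
```

When several paths tie for the longest length, as in the figure, this sketch reports only one of them.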

=== Data processing networks ===
A directed acyclic graph may be used to represent a network of processing elements. In this representation, data enters a processing element through its incoming edges and leaves the element through its outgoing edges.

For instance, in electronic circuit design, static [[combinational logic]] blocks can be represented as an acyclic system of [[logic gate]]s that computes a function of an input, where the input and output of the function are represented as individual [[bit]]s. In general, the output of these blocks cannot be used as the input unless it is captured by a register or state element, which maintains the acyclic properties of the block.<ref>{{citation|title=Timing|first=Sachin|last=Sapatnekar|publisher=Springer|year=2004|isbn=978-1-4020-7671-8|page=133|url=https://books.google.com/books?id=fL9k-VkZVr0C&pg=PA133}}.</ref> Electronic circuit schematics, whether on paper or in a database, are a form of directed acyclic graph in which instances or components form directed references to lower-level components. Electronic circuits themselves are not necessarily acyclic or directed.

[[Dataflow programming]] languages describe systems of operations on [[data stream]]s, and the connections between the outputs of some operations and the inputs of others. These languages can be convenient for describing repetitive data processing tasks, in which the same acyclically-connected collection of operations is applied to many data items. They can be executed as a [[parallel algorithm]] in which each operation is performed by a parallel process as soon as another set of inputs becomes available to it.<ref>{{citation|title=Programming Symposium|series=Lecture Notes in Computer Science|volume=19|year=1974|pages=362–376|contribution=First version of a data flow procedure language|first=Jack B.|last=Dennis|doi=10.1007/3-540-06859-7_145|isbn=978-3-540-06859-4}}.</ref>

In [[compiler]]s, straight line code (that is, sequences of statements without loops or conditional branches) may be represented by a DAG describing the inputs and outputs of each of the arithmetic operations performed within the code. This representation allows the compiler to perform [[common subexpression elimination]] efficiently.<ref>{{citation|title=Advanced Backend Optimization|first1=Sid|last1=Touati|first2=Benoit|last2=de Dinechin|publisher=John Wiley & Sons|year=2014|isbn=978-1-118-64894-0|page=123|url=https://books.google.com/books?id=nO2-AwAAQBAJ&pg=PA123}}.</ref> At a higher level of code organization, the [[acyclic dependencies principle]] states that the dependencies between modules or components of a large software system should form a directed acyclic graph.<ref>{{citation|title=Large-Scale Software Architecture: A Practical Guide using UML|first1=Jeff|last1=Garland|first2=Richard|last2=Anthony|publisher=John Wiley & Sons|year=2003|isbn=9780470856383|page=215|url=https://books.google.com/books?id=_2oQLLSqZ88C&pg=PA215}}.</ref>
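
The DAG representation of straight-line code can be illustrated with a small Python sketch. It assumes a hypothetical three-address instruction format <code>(target, operator, left, right)</code> and shares one DAG vertex between operations that apply the same operator to the same operand vertices, which is the sharing that common subexpression elimination relies on. Real compilers must also account for details such as commutativity, memory aliasing and side effects, which are ignored here.

```python
def build_expression_dag(statements):
    """Build a DAG for straight-line code.

    statements is a list of (target, operator, left, right) tuples, a
    hypothetical three-address format used only for this sketch.
    Returns (nodes, current): the list of DAG vertices and a map from
    each variable to the vertex holding its final value.
    """
    nodes = []    # vertex id -> ("input", name) or (operator, left_id, right_id)
    index = {}    # vertex description -> vertex id, used to share duplicates
    current = {}  # variable name -> vertex id of its current value

    def node_for(description):
        # Reuse an existing vertex when an identical one was already built.
        if description not in index:
            index[description] = len(nodes)
            nodes.append(description)
        return index[description]

    def value_of(name):
        # A variable read before any assignment is an input to the block.
        if name not in current:
            current[name] = node_for(("input", name))
        return current[name]

    for target, operator, left, right in statements:
        left_id = value_of(left)
        right_id = value_of(right)
        current[target] = node_for((operator, left_id, right_id))
    return nodes, current


program = [
    ("t1", "+", "a", "b"),
    ("t2", "+", "a", "b"),    # same computation as t1: shares its vertex
    ("t3", "*", "t1", "t2"),
    ("a", "-", "t3", "b"),    # a is reassigned here
    ("t4", "+", "a", "b"),    # not shared with t1, because a has changed
]
nodes, current = build_expression_dag(program)
print(current["t1"] == current["t2"], current["t1"] == current["t4"])  # True False
```

Because operands are looked up as vertices rather than as variable names, the reassignment of <code>a</code> correctly prevents <code>t4</code> from being merged with <code>t1</code>.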

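[[Feedforward neural network]]s are another example: the neurons act as the processing elements, and because the connections run only forward from the inputs toward the outputs, the network contains no cycles.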

=== Causal structures ===
{{main|Bayesian network}}
Graphs in which vertices represent events occurring at a definite time, and in which every edge points from an earlier-time vertex to a later-time vertex, are necessarily directed and acyclic. No cycle can exist because the time associated with a vertex strictly increases along any directed [[Path (graph theory)|path]] in the graph, so a path can never return to a vertex it has already visited. This reflects the natural intuition that causality allows events to affect only the future, never the past, so that there are no [[causal loop]]s. Examples of this type of directed acyclic graph are those encountered in the [[Causal sets|causal set approach to quantum gravity]], although in this case the graphs considered are [[#Transitive closure and transitive reduction|transitively complete]]. In the version history example below, each version of the software is associated with a unique time, typically the time the version was saved, committed or released. In the citation graph examples below, the documents are published at one time and can only refer to older documents.

Sometimes events are not associated with a specific physical time. Provided that pairs of events have a purely causal relationship, that is edges represent [[causality|causal relations]] between the events, we will have a directed acyclic graph.<ref>{{citation|title=Causal Learning|first1=Alison|last1=Gopnik|author-link=Alison Gopnik |first2=Laura|last2=Schulz|author2-link=Laura Schulz |publisher=Oxford University Press|year=2007|isbn=978-0-19-803928-0|page=4|url=https://books.google.com/books?id=35MKXlKoXIUC&pg=PA4}}.</ref> For instance, a [[Bayesian network]] represents a system of probabilistic events as vertices in a directed acyclic graph, in which the likelihood of an event may be calculated from the likelihoods of its predecessors in the DAG.<ref>{{citation|title=Probabilistic Boolean Networks: The Modeling and Control of Gene Regulatory Networks|publisher=Society for Industrial and Applied Mathematics|first1=Ilya|last1=Shmulevich|first2=Edward R.|last2=Dougherty|year=2010|isbn=978-0-89871-692-4|page=58|url=https://books.google.com/books?id=RfshqEgO7KgC&pg=PA58}}.</ref> In this context, the [[moral graph]] of a DAG is the undirected graph created by adding an (undirected) edge between all parents of the same vertex (sometimes called ''marrying''), and then replacing all directed edges by undirected edges.<ref>{{citation |last1= Cowell |first1= Robert G. |author2-link=Philip Dawid|last2=Dawid|first2=A. 
Philip|author3-link=Steffen Lauritzen|last3=Lauritzen|first3=Steffen L.|author4-link=David Spiegelhalter|last4=Spiegelhalter|first4=David J.|title= Probabilistic Networks and Expert Systems |publisher= Springer |year= 1999 |isbn= 978-0-387-98767-5 |chapter= 3.2.1 Moralization|pages= 31–33 }}.</ref> Another type of graph with a similar causal structure is an [[influence diagram]], the vertices of which represent either decisions to be made or unknown information, and the edges of which represent causal influences from one vertex to another.<ref>{{citation|title=The Technology Management Handbook|first=Richard C.|last=Dorf|publisher=CRC Press|year=1998|isbn=978-0-8493-8577-3|page=9{{hyphen}}7<!-- Do not conver this hyphen into a dash! It is a section-page number, not a range of page numbers. -->|url=https://books.google.com/books?id=C2u8I0DFo4IC&pg=SA9-PA7}}.</ref> In [[epidemiology]], for instance, these diagrams are often used to estimate the expected value of different choices for intervention.<ref>{{citation|title=Encyclopedia of Epidemiology, Volume 1|first=Sarah|last=Boslaugh|publisher=SAGE|year=2008|isbn=978-1-4129-2816-8|page=255|url=https://books.google.com/books?id=wObgnN3x14kC&pg=PA255}}.</ref><ref name="pearl:95">{{citation | last = Pearl | first = Judea | doi = 10.1093/biomet/82.4.669 | issue = 4 | journal = Biometrika | pages = 669–709 | title = Causal diagrams for empirical research | volume = 82 | year = 1995| url = https://escholarship.org/uc/item/6gv9n38c }}.</ref>
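
The moralization step described above can be sketched in a few lines of Python. The function name <code>moral_graph</code>, the representation of the DAG as a map from each vertex to its parents, and the vertex names are assumptions made for this illustration.

```python
from itertools import combinations


def moral_graph(parents):
    """Return the undirected edges of the moral graph of a DAG.

    parents maps each vertex to the list of its parents (a hypothetical
    input format). Each edge is returned as a frozenset of two vertices.
    """
    edges = set()
    for child, pars in parents.items():
        # Replace each directed edge parent -> child by an undirected edge.
        for parent in pars:
            edges.add(frozenset((parent, child)))
        # "Marry" every pair of parents that share this child.
        for p, q in combinations(set(pars), 2):
            edges.add(frozenset((p, q)))
    return edges


# Hypothetical DAG: A -> C <- B, and C -> D.
dag = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
moral = moral_graph(dag)
print(frozenset(("A", "B")) in moral)  # True: A and B are both parents of C
```

In this example the only added edge is the one between <code>A</code> and <code>B</code>; all other edges of the moral graph are the original edges with their directions dropped.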

The converse is also true. That is, in any application represented by a directed acyclic graph there is a causal structure: either an explicit order or time in the example, or an order that can be derived from the graph structure. This follows because every directed acyclic graph has a [[#Topological ordering|topological ordering]], that is, at least one way to put the vertices in an order such that all edges point in the same direction along that order.

=== Genealogy and version history ===
[[File:EgyptianPtolemies2.jpg|thumb|upright=1.5|Family tree of the [[Ptolemaic dynasty]], with many marriages between [[Consanguinity|close relatives]] causing [[pedigree collapse]].]]
[[Family tree]]s may be seen as directed acyclic graphs, with a vertex for each family member and an edge for each parent-child relationship.<ref>{{citation|journal=Algorithms for Molecular Biology|date=April 2011|volume=6|issue=10|pages=10|title=Haplotypes versus genotypes on pedigrees|first=Bonnie B.|last=Kirkpatrick|doi=10.1186/1748-7188-6-10|pmc=3102622|pmid=21504603 |doi-access=free }}.</ref> Despite the name, these graphs are not necessarily trees because of the possibility of marriages between relatives (so a child has a common ancestor on both the mother's and father's side) causing [[pedigree collapse]].<ref>{{citation
 | last1 = McGuffin | first1 = M. J.
 | last2 = Balakrishnan | first2 = R.
 | contribution = Interactive visualization of genealogical graphs
 | contribution-url = http://profs.etsmtl.ca/mMcGuffin/research/genealogyVis/genealogyVis.pdf
 | doi = 10.1109/INFVIS.2005.1532124
 | pages = 16–23
 | title = IEEE Symposium on Information Visualization (INFOVIS 2005)
 | year = 2005| isbn = 978-0-7803-9464-3
 | s2cid = 15449409
 }}.</ref> The graphs of [[matrilineal]] descent (mother-daughter relationships) and [[patrilineal]] descent (father-son relationships) are trees within this graph. Because no one can become their own ancestor, family trees are acyclic.<ref>{{citation
 | last1 = Bender | first1 = Michael A.
 | last2 = Pemmasani | first2 = Giridhar
 | last3 = Skiena | first3 = Steven
 | last4 = Sumazin | first4 = Pavel
 | contribution = Finding least common ancestors in directed acyclic graphs
 | contribution-url = http://dl.acm.org/citation.cfm?id=365411.365795
 | isbn = 978-0-89871-490-6
 | location = Philadelphia, PA, USA
 | pages = 845–854
 | publisher = Society for Industrial and Applied Mathematics
 | title = Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '01)
 | year = 2001}}.</ref>

The version history of a [[distributed revision control]] system, such as [[Git]], generally has the structure of a directed acyclic graph, in which there is a vertex for each revision and an edge connecting pairs of revisions that were directly derived from each other. These are not trees in general due to merges.<ref>{{citation|title=Architecture and Methods for Flexible Content Management in Peer-to-Peer Systems|first=Udo|last=Bartlang|publisher=Springer|year=2010|isbn=978-3-8348-9645-2|page=59|bibcode=2010aamf.book.....B|url=https://books.google.com/books?id=vXdEAAAAQBAJ&pg=PA59}}.</ref>

In many [[randomization|randomized]] [[algorithm]]s in [[computational geometry]], the algorithm maintains a ''history DAG'' representing the version history of a geometric structure over the course of a sequence of changes to the structure. For instance in a [[Randomized algorithm#Randomized incremental constructions in geometry|randomized incremental]] algorithm for [[Delaunay triangulation]], the triangulation changes by replacing one triangle by three smaller triangles when each point is added, and by "flip" operations that replace pairs of triangles by a different pair of triangles. The history DAG for this algorithm has a vertex for each triangle constructed as part of the algorithm, and edges from each triangle to the two or three other triangles that replace it. This structure allows [[point location]] queries to be answered efficiently: to find the location of a query point {{mvar|q}} in the Delaunay triangulation, follow a path in the history DAG, at each step moving to the replacement triangle that contains {{mvar|q}}. The final triangle reached in this path must be the Delaunay triangle that contains {{mvar|q}}.<ref>{{citation|title=Combinatorial Geometry and Its Algorithmic Applications: The Alcalá Lectures|volume=152|series=Mathematical surveys and monographs|first1=János|last1=Pach|author1-link=János Pach|first2=Micha|last2=Sharir|date=2008 |author2-link=Micha Sharir|publisher=American Mathematical Society|isbn=978-0-8218-7533-9|pages=93–94|url=https://books.google.com/books?id=-fguzNaYoqcC&pg=PA93}}.</ref>

=== Citation graphs ===
In a [[citation graph]] the vertices are documents with a single publication date. The edges represent the citations from the bibliography of one document to other necessarily earlier documents. The classic example comes from the citations between academic papers as pointed out in the 1965 article "Networks of Scientific Papers"<ref>{{citation | last = Price | first = Derek J. de Solla | date = July 30, 1965 | doi = 10.1126/science.149.3683.510 | issue = 3683 | journal = [[Science (journal)|Science]] | pages = 510–515 | pmid = 14325149 | title = Networks of Scientific Papers | url = http://garfield.library.upenn.edu/papers/pricenetworks1965.pdf | volume = 149| bibcode = 1965Sci...149..510D }}.</ref> by [[Derek J. de Solla Price]] who went on to produce the first model of a citation network, the [[Price's model|Price model]].<ref>{{citation | last = Price | first = Derek J. de Solla | date = 1976 | doi = 10.1002/asi.4630270505 |  journal = [[Journal of the American Society for Information Science]] | pages = 292–306 | volume = 27 |title = A general theory of bibliometric and other cumulative advantage processes | issue = 5 | s2cid = 8536863 }}.</ref> In this case the [[Citation impact|citation count]] of a paper is just the in-degree of the corresponding vertex of the citation network. This is an important measure in [[citation analysis]]. [[Judgment (law)|Court judgements]] provide another example as judges support their conclusions in one case by recalling other earlier decisions made in previous cases. A final example is provided by patents which must refer to earlier [[prior art]], earlier patents which are relevant to the current patent claim. By taking the special properties of directed acyclic graphs into account, one can analyse citation networks with techniques not available when analysing the general graphs considered in many studies using [[Network Science|network analysis]]. 
For instance, [[#Transitive closure and transitive reduction|transitive reduction]] gives new insights into the citation distributions found in different applications, highlighting clear differences in the mechanisms creating citation networks in different contexts.<ref>{{citation | last1 = Clough | first1 = James R. | last2 = Gollings | first2 = Jamie | last3 = Loach | first3 = Tamar V. | last4 = Evans | first4 = Tim S. | doi = 10.1093/comnet/cnu039 | issue = 2 | journal = Journal of Complex Networks | pages = 189–203 | title = Transitive reduction of citation networks | volume = 3| arxiv = 1310.8224 | year = 2015 | s2cid = 10228152 }}.</ref> Another technique is [[main path analysis]], which traces the citation links and suggests the most significant citation chains in a given [[citation graph]].
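
Two of the quantities mentioned above, the citation count as an in-degree and the transitive reduction, can be illustrated on a toy citation graph. The Python sketch below uses invented document names and a simple reachability test; it is meant to show the definitions, not the method of the cited study, and it would be too slow for large citation networks.

```python
def citation_counts(cites):
    """Return the in-degree (citation count) of every document.

    cites maps each document to the list of documents it cites.
    """
    counts = {paper: 0 for paper in cites}
    for refs in cites.values():
        for ref in refs:
            counts[ref] = counts.get(ref, 0) + 1
    return counts


def transitive_reduction(cites):
    """Drop every citation that is implied by a longer chain of citations."""

    def reachable(start, goal):
        # Depth-first search along citation links.
        stack, seen = [start], set()
        while stack:
            paper = stack.pop()
            if paper == goal:
                return True
            if paper not in seen:
                seen.add(paper)
                stack.extend(cites.get(paper, ()))
        return False

    reduced = {}
    for paper, refs in cites.items():
        # Keep a citation only if no other cited document leads to it.
        reduced[paper] = [
            ref for ref in refs
            if not any(reachable(other, ref) for other in refs if other != ref)
        ]
    return reduced


# Hypothetical documents, newest first: D cites C, B and A; C cites B; B cites A.
cites = {"D": ["C", "B", "A"], "C": ["B"], "B": ["A"], "A": []}
counts = citation_counts(cites)
reduced = transitive_reduction(cites)
print(counts["A"], reduced["D"])  # 2 ['C']
```

In the reduced graph only the chain D, C, B, A remains, because the direct citations from D to B and from D to A are already implied by that chain.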

The [[Price's model|Price model]] is too simple to be a realistic model of a [[citation graph|citation network]] but it is simple enough to allow for analytic solutions for some of its properties. Many of these can be found by using results derived from the undirected version of the [[Price's model|Price model]], the [[Barabási–Albert model]]. However, since [[Price's model]] gives a directed acyclic graph, it is a useful model when looking for analytic calculations of properties unique to directed acyclic graphs. For instance,
the length of the longest path, from the n-th node added to the network to the first node in the network, scales as<ref name="ECV">{{citation | last1=Evans | first1=T.S. |last2=Calmon | first2=L. |last3=Vasiliauskaite | first3=V. | title=The Longest Path in the Price Model | journal=Scientific Reports  | volume=10 | date=2020 | issue=1 | doi=10.1038/s41598-020-67421-8 | pages=10503| pmid=32601403 | pmc=7324613 | arxiv=1903.03667 | bibcode=2020NatSR..1010503E }}</ref> <math>\ln(n)</math>.

=== Data compression ===
Directed acyclic graphs may also be used as a [[data compression|compact representation]] of a collection of sequences. In this type of application, one finds a DAG in which the paths form the given sequences. When many of the sequences share the same subsequences, these shared subsequences can be represented by a shared part of the DAG, allowing the representation to use less space than it would take to list out all of the sequences separately. For example, the [[Deterministic acyclic finite state automaton|directed acyclic word graph]] is a [[data structure]] in computer science formed by a directed acyclic graph with a single source and with edges labeled by letters or symbols; the paths from the source to the sinks in this graph represent a set of [[String (computer science)|strings]], such as English words.<ref>{{citation | first1=Maxime | last1=Crochemore | first2=Renaud | last2=Vérin | contribution=Direct construction of compact directed acyclic word graphs | series=Lecture Notes in Computer Science | publisher=Springer | title=Combinatorial Pattern Matching | volume=1264 | year=1997 | pages=116–129 | doi=10.1007/3-540-63220-4_55 | isbn=978-3-540-63220-7 | citeseerx=10.1.1.53.6273 | s2cid=17045308 }}.</ref> Any set of sequences can be represented as paths in a tree, by forming a tree vertex for every prefix of a sequence and making the parent of one of these vertices represent the sequence with one fewer element; the tree formed in this way for a set of strings is called a [[trie]]. A directed acyclic word graph saves space over a trie by allowing paths to diverge and rejoin, so that a set of words with the same possible suffixes can be represented by a single tree vertex.<ref>{{citation|title=Applied Combinatorics on Words|volume=105|series=Encyclopedia of Mathematics and its Applications|first=M.|last=Lothaire|author-link=M. 
Lothaire|publisher=Cambridge University Press|year=2005|isbn=9780521848022|page=18|url=https://books.google.com/books?id=fpLUNkj1T1EC&pg=PA18}}.</ref>
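
The space saving can be seen in a small Python sketch that builds a trie and then merges subtrees that accept the same set of suffixes. The word list and function names are chosen for illustration, and the bottom-up merging shown here is only one simple way to obtain the shared structure; it is not the direct construction described in the cited paper.

```python
def build_trie(words):
    """Build a trie as nested dictionaries with a 'final' flag per vertex."""
    root = {"final": False, "next": {}}
    for word in words:
        node = root
        for letter in word:
            node = node["next"].setdefault(letter, {"final": False, "next": {}})
        node["final"] = True
    return root


def count_trie_nodes(node):
    """Count the vertices of the trie rooted at node."""
    return 1 + sum(count_trie_nodes(child) for child in node["next"].values())


def merge_equal_subtrees(node, registry):
    """Return a vertex id for node, reusing ids for equivalent subtrees.

    Two subtrees are equivalent when they agree on the 'final' flag and
    their outgoing letters lead to equivalent subtrees. registry collects
    one entry per distinct vertex of the resulting DAG.
    """
    signature = (
        node["final"],
        tuple(sorted(
            (letter, merge_equal_subtrees(child, registry))
            for letter, child in node["next"].items()
        )),
    )
    if signature not in registry:
        registry[signature] = len(registry)
    return registry[signature]


words = ["tap", "taps", "top", "tops"]
trie = build_trie(words)
registry = {}
merge_equal_subtrees(trie, registry)
print(count_trie_nodes(trie), len(registry))  # 8 5
```

The four example words need eight trie vertices, but only five vertices remain after merging, because the branches for "ta" and "to" share the same possible suffixes "p" and "ps".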

The same idea of using a DAG to represent a family of paths occurs in the [[binary decision diagram]],<ref>{{citation|first=C. Y.|last=Lee|title=Representation of switching circuits by binary-decision programs|journal=Bell System Technical Journal|volume=38|issue=4|pages=985–999|year=1959|doi=10.1002/j.1538-7305.1959.tb01585.x}}.</ref><ref>{{citation|first=Sheldon B.|last=Akers|doi=10.1109/TC.1978.1675141|title=Binary decision diagrams|journal=IEEE Transactions on Computers|volume=C-27|issue=6|pages=509–516|year=1978|s2cid=21028055}}.</ref> a DAG-based data structure for representing binary functions. In a binary decision diagram, each non-sink vertex is labeled by the name of a binary variable, and each sink and each edge is labeled by a 0 or 1. The function value for any [[truth assignment]] to the variables is the value at the sink found by following a path, starting from the single source vertex, that at each non-sink vertex follows the outgoing edge labeled with the value of that vertex's variable. Just as directed acyclic word graphs can be viewed as a compressed form of {{not a typo|tries}}, binary decision diagrams can be viewed as compressed forms of [[decision tree]]s that save space by allowing paths to rejoin when they agree on the results of all remaining decisions.<ref>{{citation
 | last1 = Friedman | first1 = S. J.
 | last2 = Supowit | first2 = K. J.
 | contribution = Finding the optimal variable ordering for binary decision diagrams
 | doi = 10.1145/37888.37941
 | isbn = 978-0-8186-0781-3
 | location = New York, NY, USA
 | pages = 348–356
 | publisher = ACM
 | title = Proc. 24th ACM/IEEE Design Automation Conference (DAC '87)
 | year = 1987| s2cid = 14796451
 }}.</ref>
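
The evaluation rule for a binary decision diagram can be sketched as follows. The encoding is an assumption made for this example: each non-sink vertex is stored as a tuple of its variable name, the target of its 0-labeled edge and the target of its 1-labeled edge, and the sinks are the integers 0 and 1. The diagram encodes the function (x AND y) OR z, with the vertex for z shared between two paths.

```python
def evaluate_bdd(nodes, root, assignment):
    """Follow one path from the root to a sink and return the sink's value.

    nodes maps a vertex name to (variable, low, high), where low is taken
    when the variable is 0 and high when it is 1. Sinks are 0 and 1.
    """
    current = root
    while current not in (0, 1):
        variable, low, high = nodes[current]
        current = high if assignment[variable] else low
    return current


# Hypothetical diagram for (x AND y) OR z with variable order x, y, z.
# The vertex "nz" is reached from both "nx" and "ny": the paths rejoin.
bdd = {
    "nx": ("x", "nz", "ny"),
    "ny": ("y", "nz", 1),
    "nz": ("z", 0, 1),
}

print(evaluate_bdd(bdd, "nx", {"x": 1, "y": 0, "z": 1}))  # 1
```

A decision tree for the same function and variable order would need two separate copies of the test on z, one on each path that reaches it, whereas the diagram stores that vertex once.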