|  | ========================= | 
|  | Dependence Graphs in LLVM | 
|  | ========================= | 
|  |  | 
|  | .. contents:: | 
|  | :local: | 
|  |  | 
|  | Introduction | 
|  | ============ | 
|  | Dependence graphs are useful tools in compilers for analyzing relationships | 
|  | between various program elements to help guide optimizations. The ideas | 
|  | behind these graphs are described in papers [1]_ and [2]_. | 
|  |  | 
|  | The implementation of these ideas in LLVM may be slightly different than | 
|  | what is mentioned in the papers. These differences are documented in | 
|  | the `implementation details <implementation-details_>`_. | 
|  |  | 
|  | .. _DataDependenceGraph: | 
|  |  | 
|  | Data Dependence Graph | 
|  | ===================== | 
|  | In its simplest form the Data Dependence Graph (or DDG) represents data | 
|  | dependencies between individual instructions. Each node in such a graph | 
|  | represents a single instruction and is referred to as an "atomic" node. | 
|  | It is also possible to combine some atomic nodes that have a simple | 
|  | def-use dependency between them into larger nodes that contain multiple- | 
|  | instructions. | 
|  |  | 
|  | As described in [1]_ the DDG uses graph abstraction to group nodes | 
|  | that are part of a strongly connected component of the graph | 
|  | into special nodes called pi-blocks. pi-blocks represent cycles of data | 
|  | dependency that prevent reordering transformations. Since any strongly | 
|  | connected component of the graph is a maximal subgraph of all the nodes | 
|  | that form a cycle, pi-blocks are at most one level deep. In other words, | 
|  | no pi-blocks are nested inside another pi-block, resulting in a | 
|  | hierarchical representation that is at most one level deep. | 
|  |  | 
|  |  | 
|  | For example, consider the following: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | for (int i = 1; i < n; i++) { | 
|  | b[i] = c[i] + b[i-1]; | 
|  | } | 
|  |  | 
|  | This code contains a statement that has a loop carried dependence on | 
|  | itself creating a cycle in the DDG. The figure below illustrates | 
|  | how the cycle of dependency is carried through multiple def-use relations | 
|  | and a memory access dependency. | 
|  |  | 
|  | .. image:: cycle.png | 
|  |  | 
|  | The DDG corresponding to this example would have a pi-block that contains | 
|  | all the nodes participating in the cycle, as shown below: | 
|  |  | 
|  | .. image:: cycle_pi.png | 
|  |  | 
|  | Program Dependence Graph | 
|  | ======================== | 
|  |  | 
|  | The Program Dependence Graph (or PDG) has a similar structure as the | 
|  | DDG, but it is capable of representing both data dependencies and | 
|  | control-flow dependencies between program elements such as | 
|  | instructions, groups of instructions, basic blocks or groups of | 
|  | basic blocks. | 
|  |  | 
|  | High-Level Design | 
|  | ================= | 
|  |  | 
|  | The DDG and the PDG are both directed graphs and they extend the | 
|  | ``DirectedGraph`` class. Each implementation extends its corresponding | 
|  | node and edge types resulting in the inheritance relationship depicted | 
|  | in the UML diagram below: | 
|  |  | 
|  | .. image:: uml_nodes_and_edges.png | 
|  |  | 
|  | Graph Construction | 
|  | ------------------ | 
|  |  | 
|  | The graph build algorithm considers dependencies between elements of | 
|  | a given set of instructions or basic blocks. Any dependencies coming | 
|  | into or going out of instructions that do not belong to that range | 
|  | are ignored. The steps in the build algorithm for the DDG are very | 
|  | similar to the steps in the build algorithm for the PDG. As such, | 
|  | one of the design goals is to reuse the build algorithm code to | 
|  | allow creation of both DDG and PDG representations while allowing | 
|  | the two implementations to define their own distinct and independent | 
|  | node and edge types. This is achieved by using the well-known builder | 
|  | design pattern to isolate the construction of the dependence graph | 
|  | from its concrete representation. | 
|  |  | 
|  | The following UML diagram depicts the overall structure of the design | 
|  | pattern as it applies to the dependence graph implementation. | 
|  |  | 
|  | .. image:: uml_builder_pattern.png | 
|  |  | 
|  | Notice that the common code for building the two types of graphs are | 
|  | provided in the ``DependenceGraphBuilder`` class, while the ``DDGBuilder`` | 
|  | and ``PDGBuilder`` control some aspects of how the graph is constructed | 
|  | by the way of overriding virtual methods defined in ``DependenceGraphBuilder``. | 
|  |  | 
|  | Note also that the steps and the names used in this diagram are for | 
|  | illustrative purposes and may be different from those in the actual | 
|  | implementation. | 
|  |  | 
|  | Design Trade-offs | 
|  | ----------------- | 
|  |  | 
|  | Advantages: | 
|  | ^^^^^^^^^^^ | 
|  | - Builder allows graph construction code to be reused for DDG and PDG. | 
|  | - Builder allows us to create DDG and PDG as separate graphs. | 
|  | - DDG nodes and edges are completely disjoint from PDG nodes and edges allowing them to change easily and independently. | 
|  |  | 
|  | Disadvantages: | 
|  | ^^^^^^^^^^^^^^ | 
|  | - Builder may be perceived as over-engineering at first. | 
|  | - There are some similarities between DDG nodes and edges compared to PDG nodes and edges, but there is little reuse of the class definitions. | 
|  |  | 
|  | - This is tolerable given that the node and edge types are fairly simple and there is little code reuse opportunity anyway. | 
|  |  | 
|  |  | 
|  | .. _implementation-details: | 
|  |  | 
|  | Implementation Details | 
|  | ====================== | 
|  |  | 
|  | The current implementation of DDG differs slightly from the dependence | 
|  | graph described in [1]_ in the following ways: | 
|  |  | 
|  | 1. The graph nodes in the paper represent three main program components, namely *assignment statements*, *for loop headers* and *while loop headers*. In this implementation, DDG nodes naturally represent LLVM IR instructions. An assignment statement in this implementation typically involves a node representing the ``store`` instruction along with a number of individual nodes computing the right-hand-side of the assignment that connect to the ``store`` node via a def-use edge.  The loop header instructions are not represented as special nodes in this implementation because they have limited uses and can be easily identified, for example, through ``LoopAnalysis``. | 
|  | 2. The paper describes five types of dependency edges between nodes namely *loop dependency*, *flow-*, *anti-*, *output-*, and *input-* dependencies. In this implementation *memory* edges represent the *flow-*, *anti-*, *output-*, and *input-* dependencies. However, *loop dependencies* are not made explicit, because they mainly represent association between a loop structure and the program elements inside the loop and this association is fairly obvious in LLVM IR itself. | 
|  | 3. The paper describes two types of pi-blocks; *recurrences* whose bodies are SCCs and *IN* nodes whose bodies are not part of any SCC. In this implementation, pi-blocks are only created for *recurrences*. *IN* nodes remain as simple DDG nodes in the graph. | 
|  |  | 
|  |  | 
|  | References | 
|  | ---------- | 
|  | .. [1] "D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS." | 
|  | .. [2] "J. FERRANTE (IBM), K. J. OTTENSTEIN (Michigan Technological University) and JOE D. WARREN (Rice University), 1987. The Program Dependence Graph and Its Use in Optimization." |