xref: /llvm-project/flang/docs/ControlFlowGraph.md (revision b7ff03206d668cd5a620a9d4e1b22ea112ed56e3)
1<!--===- docs/ControlFlowGraph.md
2
3   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4   See https://llvm.org/LICENSE.txt for license information.
5   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6
7-->
8
9# Control Flow Graph
10
11```{contents}
12---
13local:
14---
15```
16
17## Concept
18After a Fortran subprogram has been parsed, its names resolved, and all its
19semantic constraints successfully checked, the parse tree of its
20executable part is translated into another abstract representation,
21namely the _control flow graph_ described in this note.
22
23This second representation of the subprogram's executable part is
24suitable for analysis and incremental modification as the subprogram
25is readied for code generation.
26Many high-level Fortran features are implemented by rewriting portions
27of a subprogram's control flow graph in place.
28
29### Control Flow Graph
30A _control flow graph_ is a collection of simple (_i.e.,_ "non-extended")
31basic _blocks_ that comprise straight-line sequences of _actions_ with a
32single entry point and a single exit point, and a collection of
33directed flow _edges_ (or _arcs_) denoting all possible transitions of
34control flow that may take place during execution from the end of
35one basic block to the beginning of another (or itself).
36
37A block that has multiple distinct successors in the flow of control
38must end with an action that selects its successor.
39
40The sequence of actions that constitutes a basic block may
41include references to user and library procedures.
42Subprogram calls with implicit control flow afterwards, namely
43alternate returns and `END=`/`ERR=` labels on input/output,
44will be lowered in translation to a representation that materializes
45that control flow into something similar to a computed `GO TO` or
46C language `switch` statement.
47
48For convenience in optimization and to simplify the implementation of
49data flow confluence functions, we may choose to maintain the
50property that each flow arc is the sole outbound arc emanating from
51its originating block, the sole inbound arc arriving at its destination,
52or both.
53Empty blocks would inserted to "split" arcs when necessary to maintain this
54invariant property.
55
56Fortran subprograms (other than internal subprograms) can have multiple
57entry points by using the obsolescent `ENTRY` statement.
58We will implement such subprograms by constructing a union
59of their dummy argument lists and using it as part of the definition
60of a new subroutine or function that can be called by each of
61the entry points, which are then all converted into wrapper routines that
62pass a selector value as an additional argument to drive a `switch` on entry
63to the new subprogram.
64
65This transformation ensures that every subprogram's control
66flow graph has a well-defined `START` node.
67
68Statement labels can be used in Fortran on any statement, but only
69the labels that decorate legal destinations of `GO TO` statements
70need to be implemented in the control flow graph.
71Specifically, non-executable statements like `DATA`, `NAMELIST`, and
72`FORMAT` statements will be extracted into data initialization
73records before or during the construction of the control flow
74graph, and will survive only as synonyms for `CONTINUE`.
75
76Nests of multiple labeled `DO` loops that terminate on the same
77label will be have that label rewritten so that `GO TO` within
78the loop nest will arrive at the copy that most closely nests
79the context.
80The Fortran standard does not require us to do this, but XLF
81(at least) works this way.
82
83### Expressions and Statements (Operations and Actions)
84Expressions are trees, not DAGs, of intrinsic operations,
85resolved function references, constant literals, and
86data designators.
87
88Expression nodes are represented in the compiler in a type-safe manner.
89There is a distinct class or class template for every category of
90intrinsic type, templatized over its supported kind type parameter values.
91
92Operands are storage-owning indirections to other instances
93of `Expression`, instances of constant values, and to representations
94of data and function references.
95These indirections are not nullable apart from the situation in which
96the operands of an expression are being removed for use elsewhere before
97the expression is destructed.
98
99The ranks and the extents of the shapes of the results of expressions
100are explicit for constant arrays and recoverable by analysis otherwise.
101
102Parenthesized subexpressions are scrupulously preserved in accordance with
103the Fortran standard.
104
105The expression tree is meant to be a representation that is
106as equally well suited for use in the symbol table (e.g., for
107a bound of an explicit shape array) as it is for an action
108in a basic block of the control flow graph (e.g., the right
109hand side of an assignment statement).
110
111Each basic block comprises a linear sequence of _actions_.
112These are represented as a doubly-linked list so that insertion
113and deletion can be done in constant time.
114
115Only the last action in a basic block can represent a change
116to the flow of control.
117
118### Scope Transitions
119Some of the various scopes of the symbol table are visible in the control flow
120graph as `SCOPE ENTRY` and `SCOPE EXIT` actions.
121`SCOPE ENTRY` actions are unique for their corresponding scopes,
122while `SCOPE EXIT` actions need not be so.
123It must be the case that
124any flow of control within the subprogram will enter only scopes that are
125not yet active, and exit only the most recently entered scope that has not
126yet been deactivated; i.e., when modeled by a push-down stack that is
127pushed by each traversal of a `SCOPE ENTRY` action,
128the entries of the stack are always distinct, only the scope at
129the top of the stack is ever popped by `SCOPE EXIT`, and the stack is empty
130when the subprogram terminates.
131Further, any references to resolved symbols must be to symbols whose scopes
132are active.
133
134The `DEALLOCATE` actions and calls to `FINAL` procedures implied by scoped
135lifetimes will be explicit in the sequence of actions in the control flow
136graph.
137
138Parallel regions might be partially represented by scopes, or by explicit
139operations similar to the scope entry and exit operations.
140
141### Data Flow Representation
142The subprogram text will be in static single assignment form by the time the
143subprogram arrives at the bridge to the LLVM IR builder.
144Merge points are actions at the heads of basic blocks whose operands
145are definition points; definition points are actions at the ends of
146basic blocks whose operands are expression trees (which may refer to
147merge points).
148
149### Rewriting Transformations
150
151#### I/O
152#### Dynamic allocation
153#### Array constructors
154
155#### Derived type initialization, deallocation, and finalization
156The machinery behind the complicated semantics of Fortran's derived types
157and `ALLOCATABLE` objects will be implemented in large part by the run time
158support library.
159
160#### Actual argument temporaries
161#### Array assignments, `WHERE`, and `FORALL`
162
163Array operations have shape.
164
165`WHERE` masks have shape.
166Their effects on array operations are by means of explicit `MASK` operands that
167are part of array assignment operations.
168
169#### Intrinsic function and subroutine calls
170