<!--===- docs/Semantics.md

   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   See https://llvm.org/LICENSE.txt for license information.
   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-->

# Semantic Analysis

```{contents}
---
local:
---
```

The semantic analysis pass determines whether a syntactically correct Fortran
program is legal by enforcing the constraints of the language.

The input is a parse tree with a `Program` node at the root
and a "cooked" character stream: a contiguous stream of characters
containing a normalized form of the Fortran source.

If the program is not legal, the result of the semantic pass is a list of
errors associated with the program.

If the program is legal, the semantic pass produces a (possibly modified)
parse tree for the semantically correct program with each name mapped to a symbol
and each expression fully analyzed.

All user errors are detected either prior to or during semantic analysis.
After it completes successfully, the program should compile with no error messages.
There may still be warnings or informational messages.

## Phases of Semantic Analysis

1. [Validate labels](#validate-labels) -
   Check all constraints on labels and branches
2. [Rewrite DO loops](#rewrite-do-loops) -
   Convert all occurrences of `LabelDoStmt` to `DoConstruct`
3. [Name resolution](#name-resolution) -
   Analyze names and declarations, build a tree of Scopes containing Symbols,
   and fill in the `Name::symbol` data member in the parse tree
4. [Rewrite parse tree](#rewrite-parse-tree) -
   Fix incorrect parses based on symbol information
5. [Expression analysis](#expression-analysis) -
   Analyze all expressions in the parse tree and fill in `Expr::typedExpr` and
   `Variable::typedExpr` with analyzed expressions; fix incorrect parses
   based on the result of this analysis
6. [Statement semantics](#statement-semantics) -
   Perform remaining semantic checks on the execution parts of subprograms
7. [Write module files](#write-module-files) -
   If no errors have occurred, write out `.mod` files for modules and submodules

If phase 1 or phase 2 encounters an error in any of the program units,
compilation terminates.
Otherwise, phases 3-6 are all performed even if errors occur.
Module files are written (phase 7) only if there are no errors.

### Validate labels

Perform semantic checks related to labels and branches:
- check that any labels that are referenced are defined and in scope
- check branches into loop bodies
- check that labeled `DO` loops are properly nested
- check labels in data transfer statements

### Rewrite DO loops

This phase normalizes the parse tree by removing all unstructured `DO` loops
and replacing them with `DO` constructs.

### Name resolution

The name resolution phase walks the parse tree and constructs the symbol table.

The symbol table consists of a tree of `Scope` objects rooted at the global scope.
The global scope is owned by the `SemanticsContext` object.
It contains a `Scope` for each program unit in the compilation.

Each `Scope` in the scope tree contains child scopes representing other scopes
lexically nested in it.
Each `Scope` also contains a map of `CharBlock` to `Symbol` representing names
declared in that scope. (All names in the symbol table are represented as
`CharBlock` objects, i.e. as substrings of the cooked character stream.)
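
The scope tree described above can be pictured with a minimal C++ sketch. It is an illustration only, not Flang's actual class definitions: here `CharBlock` is assumed to be a plain `std::string_view`, `Symbol` carries nothing but its name, and the member functions `MakeChild`, `Declare`, and `Find` are hypothetical names. The parent-walking lookup in `Find` is a deliberate simplification and does not model Fortran's full host-association rules.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-ins; these are NOT Flang's real types.
using CharBlock = std::string_view; // a name is a view into the cooked stream

struct Symbol {
  CharBlock name;
};

class Scope {
public:
  explicit Scope(Scope *parent = nullptr) : parent_{parent} {}

  // Create a child scope lexically nested in this one; this scope owns it.
  Scope &MakeChild() {
    children_.push_back(std::make_unique<Scope>(this));
    return *children_.back();
  }

  // Declare a name in this scope; return the existing symbol if present.
  Symbol &Declare(CharBlock name) {
    auto it = symbols_.find(name);
    if (it == symbols_.end()) {
      it = symbols_.emplace(name, std::make_unique<Symbol>(Symbol{name})).first;
    }
    return *it->second;
  }

  // Simplified lookup: this scope first, then enclosing scopes.
  // Returns nullptr if the name is not declared anywhere on that path.
  Symbol *Find(CharBlock name) {
    for (Scope *s = this; s != nullptr; s = s->parent_) {
      auto it = s->symbols_.find(name);
      if (it != s->symbols_.end()) {
        return it->second.get();
      }
    }
    return nullptr;
  }

private:
  Scope *parent_;
  std::vector<std::unique_ptr<Scope>> children_;         // owned child scopes
  std::map<CharBlock, std::unique_ptr<Symbol>> symbols_; // owned symbols
};
```

Because the sketch's symbols are held by the scope through `std::unique_ptr`, callers only ever see a `Symbol &` or `Symbol *`, and the keys are views into a buffer that must outlive the scope tree, which is the role the cooked character stream plays in the description above.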

All `Symbol` objects are owned by the symbol table data structures.
They should be accessed as `Symbol *` or `Symbol &` outside of the symbol
table classes, as they can't be created, copied, or moved.
The `Symbol` class has functions and data common across all symbols, and a
`details` field that contains more information specific to that type of symbol.
Many symbols also have types, represented by `DeclTypeSpec`.
Types are also owned by scopes.

Name resolution happens on the parse tree in this order:
1. Process the specification of a program unit:
   1. Create a new scope for the unit
   2. Create a symbol, containing just the name, for each contained subprogram
   3. Process the opening statement of the unit (`ModuleStmt`, `FunctionStmt`, etc.)
   4. Process the specification part of the unit
2. Apply the same process recursively to nested subprograms
3. Process the execution part of the program unit
4. Process the execution parts of nested subprograms recursively

After the completion of this phase, every `Name` corresponds to a `Symbol`
unless an error occurred.

### Rewrite parse tree

The parser cannot build a completely correct parse tree without symbol information.
This phase corrects mis-parses based on symbols:
- Array element assignments may be parsed as statement functions: `a(i) = ...`
- Namelist group names without `NML=` may be parsed as format expressions
- A file unit number expression may be parsed as a character variable

This phase also produces an internal error if it finds a `Name` that does not
have its `symbol` data member filled in. This error is suppressed if other
errors have occurred because in that case a `Name` corresponding to an erroneous
symbol may not be resolved.

### Expression analysis

Expressions that occur in the specification part (for example, initial values,
array bounds, and type parameters) are analyzed during name resolution.
Any remaining expressions are analyzed in this phase.

For each `Variable` and top-level `Expr` (i.e. one that is not nested below
another `Expr` in the parse tree) the analyzed form of the expression is saved
in the `typedExpr` data member. After this phase has completed, the analyzed
expression can be accessed using `semantics::GetExpr()`.

This phase also corrects mis-parses based on the result of expression analysis:
- An expression like `a(b)` is parsed as a function reference but may need
  to be rewritten to an array element reference (if `a` is an object entity)
  or to a structure constructor (if `a` is a derived type)
- An expression like `a(b:c)` is parsed as an array section but may need to be
  rewritten as a substring if `a` is an object with type `CHARACTER`

### Statement semantics

Multiple independent checkers driven by the `SemanticsVisitor` framework
perform the remaining semantic checks.
By this phase, all names and expressions that can be successfully resolved
have been, but there may be names without symbols or expressions without
an analyzed form if errors occurred earlier.

### Initialization processing

Fortran supports many means of specifying static initializers for variables,
object pointers, and procedure pointers, as well as default initializers for
derived type object components, pointers, and type parameters.

Non-pointer static initializers of variables and named constants are
scanned, analyzed, folded, scalar-expanded, and validated as they are
traversed during declaration processing in name resolution.
So are the default initializers of non-pointer object components in
non-parameterized derived types.
Named constant arrays with implied shapes take their actual shape from
the initialization expression.

Default initializers of non-pointer components and type parameters
in distinct parameterized derived type instantiations are processed
similarly as those instances are created, as their expressions may depend
on the values of type parameters.
Error messages produced during parameterized derived type instantiation
are decorated with contextual attachments that point to the declarations
or other type specifications that caused the instantiation.

Static initializations in `DATA` statements are collected, validated,
and converted into static initialization in the symbol table, as if
the initialized objects had used the newer style of static initialization
in their entity declarations.

All statically initialized pointers, and default component initializers for
pointers, are processed late in name resolution after all specification parts
have been traversed.
This allows for forward references even in the presence of `IMPLICIT NONE`.
Object pointer initializers in parameterized derived type instantiations are
also cloned and folded at this late stage.
Validation of pointer initializers takes place later in declaration
checking (below).

### Declaration checking

Whenever possible, the enforcement of constraints and "shalls" pertaining to
properties of symbols is deferred to a single read-only pass over the symbol table
that takes place after all name resolution and typing are complete.

### Write module files

Separate compilation information is written out on successful compilation
of modules and submodules. This information is used as input to name resolution
in program units that `USE` the modules.

Module files are stripped-down Fortran source for the module.
Parts that aren't needed to compile dependent program units (e.g. action statements)
are omitted.

The module file for module `m` is named `m.mod` and the module file for
submodule `s` of module `m` is named `m-s.mod`.
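
The naming rule above can be written down as a small C++ helper. This is a sketch for illustration only: `ModFileName` is a hypothetical function, not part of Flang, and it assumes the names are passed in exactly as they should appear in the file name.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not a Flang API: builds the module file name
// following the rule stated in the text ("m.mod" or "m-s.mod").
std::string ModFileName(const std::string &module,
                        const std::string &submodule = "") {
  std::string name{module};
  if (!submodule.empty()) {
    name += "-" + submodule; // submodule files are prefixed by the module name
  }
  return name + ".mod";
}
```

With this helper, `ModFileName("m")` yields `m.mod` and `ModFileName("m", "s")` yields `m-s.mod`.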