<!--===- docs/FortranForCProgrammers.md

   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   See https://llvm.org/LICENSE.txt for license information.
   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-->

# Fortran For C Programmers

```{contents}
---
local:
---
```

This note is limited to essential information about Fortran so that
a C or C++ programmer can get started more quickly with the language,
at least as a reader, and avoid some common pitfalls when starting
to write or modify Fortran code.
Please see other sources to learn about Fortran's rich history,
current applications, and modern best practices in new code.

## Know This At Least

* There have been many implementations of Fortran, often from competing
  vendors, and the standard language has been defined by U.S. and
  international standards organizations. The various editions of
  the standard are known as the '66, '77, '90, '95, 2003, 2008, and
  (now) 2018 standards.
* Forward compatibility is important. Fortran has outlasted many
  generations of computer systems hardware and software. Standard
  compliance notwithstanding, Fortran programmers generally expect that
  code that has compiled successfully in the past will continue to
  compile and work indefinitely. The standards sometimes designate
  features as being deprecated, obsolescent, or even deleted, but that
  can be read only as discouraging their use in new code -- they'll
  probably always work in any serious implementation.
* Fortran has two source forms, which are typically distinguished by
  filename suffixes. `foo.f` is old-style "fixed-form" source, and
  `foo.f90` is new-style "free-form" source. All language features
  are available in both source forms. Neither form has reserved words
  in the sense that C does. Spaces are not required between tokens
  in fixed form, and case is not significant in either form.
* Variable declarations are optional by default. Variables whose
  names begin with the letters `I` through `N` are implicitly
  `INTEGER`, and others are implicitly `REAL`. These implicit typing
  rules can be changed in the source.
* Fortran uses parentheses in both array references and function calls.
  All arrays must be declared as such; other names followed by parenthesized
  expressions are assumed to be function calls.
* Fortran has a _lot_ of built-in "intrinsic" functions. They are always
  available without a need to declare or import them. Their names reflect
  the implicit typing rules, so you will encounter names that have been
  modified so that they have the right type (e.g., `AIMAG` has a leading `A`
  so that it's `REAL` rather than `INTEGER`).
* The modern language has means for declaring types, data, and subprogram
  interfaces in compiled "modules", as well as legacy mechanisms for
  sharing data and interconnecting subprograms.

## A Rosetta Stone

Fortran's language standard and other documentation use some terminology
in particular ways that might be unfamiliar.

| Fortran | English |
| ------- | ------- |
| Association | Making a name refer to something else |
| Assumed | Some attribute of an argument or interface that is not known until a call is made |
| Companion processor | A C compiler |
| Component | Class member |
| Deferred | Some attribute of a variable that is not known until an allocation or assignment |
| Derived type | C++ class |
| Dummy argument | C++ reference argument |
| Final procedure | C++ destructor |
| Generic | Overloaded function, resolved by actual arguments |
| Host procedure | The subprogram that contains a nested one |
| Implied DO | There's a loop inside a statement |
| Interface | Prototype |
| Internal I/O | `sscanf` and `snprintf` |
| Intrinsic | Built-in type or function |
| Polymorphic | Dynamically typed |
| Processor | Fortran compiler |
| Rank | Number of dimensions that an array has |
| `SAVE` attribute | Statically allocated |
| Type-bound procedure | Kind of a C++ member function but not really |
| Unformatted | Raw binary |

## Data Types

There are five built-in ("intrinsic") types: `INTEGER`, `REAL`, `COMPLEX`,
`LOGICAL`, and `CHARACTER`.
They are parameterized with "kind" values, which should be treated as
non-portable integer codes, although in practice today these are the
byte sizes of the data.
(For `COMPLEX`, the kind type parameter value is the byte size of one of the
two `REAL` components, or half of the total size.)
The legacy `DOUBLE PRECISION` intrinsic type is an alias for a kind of `REAL`
that should be more precise, and bigger, than the default `REAL`.

`COMPLEX` is a simple structure that comprises two `REAL` components.

`CHARACTER` data also have length, which may or may not be known at compilation
time.
`CHARACTER` variables are fixed-length strings and they get padded out
with space characters when not completely assigned.
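
As a minimal sketch of the fixed-length behavior just described, the
following free-form program assigns values that are shorter and longer
than the declared lengths.
The program and variable names are made up for illustration.

```fortran
program character_padding
  implicit none
  character(len=8) :: name    ! always holds exactly 8 characters
  character(len=3) :: short   ! always holds exactly 3 characters

  name = 'abc'       ! shorter value: padded on the right with spaces
  short = 'abcdef'   ! longer value: truncated on the right to 'abc'

  print '(3a)', '[', name, ']'         ! brackets make the padding visible
  print '(3a)', '[', trim(name), ']'   ! TRIM removes trailing spaces
  print '(3a)', '[', short, ']'
  print '(a,i0)', 'len      = ', len(name)        ! declared length, 8
  print '(a,i0)', 'len_trim = ', len_trim(name)   ! length without padding, 3
end program character_padding
```

The intrinsic functions `LEN`, `LEN_TRIM`, and `TRIM` are the usual tools
for telling the declared length apart from the significant content.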

User-defined ("derived") data types can be synthesized from the intrinsic
types and from previously-defined user types, much like a C `struct`.
Derived types can be parameterized with integer values that either have
to be constant at compilation time ("kind" parameters) or deferred to
execution ("len" parameters).

Derived types can inherit ("extend") from at most one other derived type.
They can have user-defined destructors (`FINAL` procedures).
They can specify default initial values for their components.
With some work, one can also specify a general constructor function,
since Fortran allows a generic interface to have the same name as that
of a derived type.

Last, there are "typeless" binary constants that can be used in a few
situations, like static data initialization or immediate conversion,
where type is not necessary.

## Arrays

Arrays are not types in Fortran.
Being an array is a property of an object or function, not of a type.
Unlike C, one cannot have an array of arrays or an array of pointers,
although one can have an array of a derived type that has arrays or
pointers as components.
Arrays are multidimensional, and the number of dimensions is called
the _rank_ of the array.
In storage, arrays are stored such that the last subscript has the
largest stride in memory, e.g. `A(1,1)` is followed by `A(2,1)`, not `A(1,2)`.
And yes, the default lower bound on each dimension is 1, not 0.

Expressions can manipulate arrays as multidimensional values, and
the compiler will create the necessary loops.

## Allocatables

Modern Fortran programs use `ALLOCATABLE` data extensively.
Such variables and derived type components are allocated dynamically.
They are automatically deallocated when they go out of scope, much
like C++'s `std::vector<>` class template instances are.
The array bounds, derived type `LEN` parameters, and even the
type of an allocatable can all be deferred to run time.
(If you really want to learn all about modern Fortran, I suggest
that you study everything that can be done with `ALLOCATABLE` data,
and follow up all the references that are made in the documentation
from the description of `ALLOCATABLE` to other topics; it's a feature
that interacts with much of the rest of the language.)

## I/O

Fortran's input/output features are built into the syntax of the language,
rather than being defined by library interfaces as in C and C++.
There are means for raw binary I/O and for "formatted" transfers to
character representations.
There are means for random-access I/O using fixed-size records as well as for
sequential I/O.
One can scan data from or format data into `CHARACTER` variables via
"internal" formatted I/O.
I/O from and to files uses a scheme of integer "unit" numbers that is
similar to the open file descriptors of UNIX; i.e., one opens a file
and assigns it a unit number, then uses that unit number in subsequent
`READ` and `WRITE` statements.

Formatted I/O relies on format specifications to map values to fields of
characters, similar to the format strings used with C's `printf` family
of standard library functions.
These format specifications can appear in `FORMAT` statements and
be referenced by their labels, in character literals directly in I/O
statements, or in character variables.

One can also use compiler-generated formatting in "list-directed" I/O,
in which the compiler derives reasonable default formats based on
data types.

## Subprograms

Fortran has both `FUNCTION` and `SUBROUTINE` subprograms.
They share the same name space, but functions cannot be called as
subroutines or vice versa.
Subroutines are called with the `CALL` statement, while functions are
invoked with function references in expressions.

There is one level of subprogram nesting.
A function, subroutine, or main program can have functions and subroutines
nested within it, but these "internal" procedures cannot themselves have
their own internal procedures.
As is the case with C++ lambda expressions, internal procedures can
reference names from their host subprograms.

## Modules

Modern Fortran has good support for separate compilation and namespace
management.
The *module* is the basic unit of compilation, although independent
subprograms still exist, of course, as well as the main program.
Modules define types, constants, interfaces, and nested
subprograms.

Objects from a module are made available for use in other compilation
units via the `USE` statement, which has options for limiting the objects
that are made available as well as for renaming them.
All references to objects in modules are done with direct names or
aliases that have been added to the local scope, as Fortran has no means
of qualifying references with module names.

## Arguments

Functions and subroutines have "dummy" arguments that are dynamically
associated with actual arguments during calls.
Essentially, all argument passing in Fortran is by reference, not value.
One may restrict access to argument data by declaring that dummy
arguments have `INTENT(IN)`, but that corresponds to the use of
a `const` reference in C++ and does not imply that the data are
copied; use `VALUE` for that.

When it is not possible to pass a reference to an object, or a sparse
regular array section of an object, as an actual argument, Fortran
compilers must allocate temporary space to hold the actual argument
across the call.
This is always guaranteed to happen when an actual argument is enclosed
in parentheses.
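
The difference between the default argument association and `VALUE`
described earlier in this section can be seen in a small sketch.
The two internal subroutines below are hypothetical names chosen for
illustration; internal procedures are used so that the explicit
interface that `VALUE` requires is available automatically.

```fortran
program argument_passing
  implicit none
  integer :: n

  n = 1
  call bump_by_reference(n)
  print '(a,i0)', 'after bump_by_reference: ', n   ! n is now 2
  call bump_by_value(n)
  print '(a,i0)', 'after bump_by_value:     ', n   ! n is still 2

contains

  subroutine bump_by_reference(k)
    integer, intent(inout) :: k   ! associated with the caller's variable
    k = k + 1
  end subroutine bump_by_reference

  subroutine bump_by_value(k)
    integer, value :: k           ! a private copy, as in C
    k = k + 1                     ! changes only the copy
  end subroutine bump_by_value

end program argument_passing
```

A dummy argument with `INTENT(IN)` could not be assigned to at all in
either subroutine; the compiler would reject the assignment.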

The compiler is free to assume that any aliasing between dummy arguments
and other data is safe.
In other words, if some object can be written to under one name, it's
never going to be read or written using some other name in that same
scope.
```
  SUBROUTINE FOO(X,Y,Z)
  X = 3.14159
  Y = 2.1828
  Z = 2 * X ! CAN BE FOLDED AT COMPILE TIME
  END
```
This is the opposite of the assumptions under which a C or C++ compiler must
labor when trying to optimize code with pointers.

## Overloading

Fortran supports a form of overloading via its interface feature.
By default, an interface is a means for specifying prototypes for a
set of subroutines and functions.
But when an interface is named, that name becomes a *generic* name
for its specific subprograms, and calls via the generic name are
mapped at compile time to one of the specific subprograms based
on the types, kinds, and ranks of the actual arguments.
A similar feature can be used for generic type-bound procedures.

This feature can be used to overload the built-in operators and some
I/O statements, too.

## Polymorphism

Fortran code can be written to accept data of some derived type or
any extension thereof using `CLASS`, deferring the actual type to
execution, rather than the usual `TYPE` syntax.
This is somewhat similar to the use of `virtual` functions in C++.

Fortran's `SELECT TYPE` construct is used to distinguish between
possible specific types dynamically, when necessary. It's a
little like C++17's `std::visit()` on a discriminated union.

## Pointers

Pointers are objects in Fortran, not data types.
Pointers can point to data, arrays, and subprograms.
A pointer can only point to data that has the `TARGET` attribute.
Outside of the pointer assignment statement (`P=>X`) and some intrinsic
functions and cases with pointer dummy arguments, pointers are implicitly
dereferenced, and the use of their name is a reference to the data to which
they point instead.

Unlike C, a pointer cannot point to a pointer *per se*, nor can it be
used to implement a level of indirection to the management structure of
an allocatable.
If you assign to a Fortran pointer to make it point at another pointer,
you are making the pointer point to the data (if any) to which the other
pointer points.
Similarly, if you assign to a Fortran pointer to make it point to an allocatable,
you are making the pointer point to the current content of the allocatable,
not to the metadata that manages the allocatable.

Unlike allocatables, pointers do not deallocate their data when they go
out of scope.

A legacy feature, "Cray pointers", implements dynamic base addressing of
one variable using an address stored in another.

## Preprocessing

There is no standard preprocessing feature, but every real Fortran implementation
has some support for passing Fortran source code through a variant of
the standard C source preprocessor.
Since Fortran is very different from C at the lexical level (e.g., line
continuations, Hollerith literals, no reserved words, fixed form), using
a stock modern C preprocessor on Fortran source can be difficult.
Preprocessing behavior varies across implementations and one should not depend on
much portability.
Preprocessing is typically requested by the use of a capitalized filename
suffix (e.g., "foo.F90") or a compiler command line option.
(Since the F18 compiler always runs its built-in preprocessing stage,
no special option or filename suffix is required.)

## "Object Oriented" Programming

Fortran doesn't have member functions (or subroutines) in the sense
that C++ does, in which a function has immediate access to the members
of a specific instance of a derived type.
But Fortran does have an analog to C++'s `this` via *type-bound
procedures*.
This is a means of binding a particular subprogram name to a derived
type, possibly with aliasing, in such a way that the subprogram can
be called as if it were a component of the type (e.g., `X%F(Y)`)
and receive the object to the left of the `%` as an additional actual argument,
exactly as if the call had been written `F(X,Y)`.
The object is passed as the first argument by default, but that can be
changed; indeed, the same specific subprogram can be used for multiple
type-bound procedures by choosing different dummy arguments to serve as
the passed object.
The equivalent of a `static` member function is also available by saying
that no argument is to be associated with the object via `NOPASS`.

There's a lot more that can be said about type-bound procedures (e.g., how they
support overloading) but this should be enough to get you started with
the most common usage.

## Pitfalls

Variable initializers, e.g. `INTEGER :: J=123`, are _static_ initializers!
They imply that the variable is stored in static storage, not on the stack,
and the initialized value lasts only until the variable is assigned.
One must use an assignment statement to implement a dynamic initializer
that will apply to every fresh instance of the variable.
Be especially careful when using initializers in the newish `BLOCK` construct,
which perpetuates the interpretation as static data.
(Derived type component initializers, however, do work as expected.)
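
The following sketch contrasts the two behaviors.
The function names `surprising` and `fresh` are invented for this
example; the first uses an initializer in its declaration, the second
uses an assignment statement.

```fortran
program static_initializer
  implicit none
  integer :: i

  do i = 1, 3
    ! surprising() yields 1, 2, 3 on successive calls; fresh() yields 1 each time
    print '(a,i0,a,i0)', 'surprising() = ', surprising(), ', fresh() = ', fresh()
  end do

contains

  integer function surprising()
    integer :: calls = 0   ! static initializer: implies SAVE, applied only once
    calls = calls + 1
    surprising = calls
  end function surprising

  integer function fresh()
    integer :: calls
    calls = 0              ! assignment: executed on every call
    calls = calls + 1
    fresh = calls
  end function fresh

end program static_initializer
```

A C programmer would expect both functions to return 1 every time, since
the declaration in `surprising` looks like an ordinary automatic variable
with an initializer.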

If you see an assignment to an array that's never been declared as such,
it's probably a definition of a *statement function*, which is like
a parameterized macro definition, e.g. `A(X)=SQRT(X)**3`.
In the original Fortran language, this was the only means for user
function definitions.
Today, of course, one should use an external or internal function instead.

Fortran expressions don't bind exactly like C's do.
Watch out for exponentiation with `**`, which of course C lacks; it
binds more tightly than negation does (e.g., `-2**2` is -4),
and it binds to the right, unlike every other Fortran operator and most
C operators; e.g., `2**2**3` is 256, not 64.
Logical values must be compared with special logical equivalence
relations (`.EQV.` and `.NEQV.`) rather than the usual equality
operators.

A Fortran compiler is allowed to short-circuit expression evaluation,
but not required to do so.
If one needs to protect a use of an `OPTIONAL` argument or possibly
disassociated pointer, use an `IF` statement, not a logical `.AND.`
operation.
In fact, Fortran can remove function calls from expressions if their
values are not required to determine the value of the expression's
result; e.g., if there is a `PRINT` statement in function `F`, it
may or may not be executed by the assignment statement `X=0*F()`.
(Well, it probably will be, in practice, but compilers always reserve
the right to optimize better.)

Unless they have an explicit suffix (`1.0_8`, `2.0_8`) or a `D`
exponent (`3.0D0`), real literal constants in Fortran have the
default `REAL` type -- *not* `double` as is the case in C and C++.
If you're not careful, you can lose precision at compilation time
from your constant values and never know it.
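
The literal constant pitfall can be sketched as follows.
The kind parameter name `dp` and the variable names are assumptions made
for this example, and the result shown assumes the typical case in which
default `REAL` is narrower than `DOUBLE PRECISION`.

```fortran
program literal_precision
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! kind of DOUBLE PRECISION
  real(kind=dp) :: from_default, from_dp

  from_default = 0.1      ! default REAL literal: rounded first, then widened
  from_dp      = 0.1_dp   ! literal carries the wider kind from the start

  print *, 'equal?     ', from_default == from_dp   ! F on typical implementations
  print *, 'difference:', abs(from_default - from_dp)
end program literal_precision
```

Deriving the kind with `KIND(1.0D0)`, rather than writing a bare `8`,
keeps the example consistent with the advice to treat kind values as
non-portable codes.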