CodeGen.cpp - OpenGrok history log for /llvm-project/flang/lib/Optimizer/CodeGen/CodeGen.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init
# 0b80491c	28-Jan-2025	Slava Zakharin <szakharin@nvidia.com>	[flang] Support non-index shape/shift/slice for CG box operations. (#124625) That is another problem uncovered during hlfir.reshape inlining, where the shape bits could be any integer type. This patch adds explicit conversions to `index` type where needed.
# afa4681c	28-Jan-2025	Abid Qadeer <haqadeer@amd.com>	[flang][debug] Add support for common blocks. (#112398) This PR adds debug support for common blocks in flang. As variables that are part of a common block don't have a special marker to recognize them, we use the following check to find them. %0 = fir.address_of(@a) %1 = fir.convert %0 %2 = fir.coordinate_of %1, %c0 %3 = fir.convert %2 %4 = fircg.ext_declare %3 If the memref of a fircg.ext_declare points to a fir.coordinate_of and that in turn points to a fir.address_of (ignoring immediate fir.convert) then we assume that it is a common block variable. The fir.address_of gives us the global symbol which is the storage for the common block and fir.coordinate_of provides the offset in this storage. The debug hierarchy looks as follows: subroutine f3 integer :: x, y common /a/ x, y end subroutine @a_ = global { ... } { ... }, !dbg !26, !dbg !28 !23 = !DISubprogram(name: "f3"...) !24 = !DICommonBlock(scope: !23, name: "a", ...) !25 = !DIGlobalVariable(name: "x", scope: !24 ...) !26 = !DIGlobalVariableExpression(var: !25, expr: !DIExpression()) !27 = !DIGlobalVariable(name: "y", scope: !24 ...) !28 = !DIGlobalVariableExpression(var: !27, expr: !DIExpression(DW_OP_plus_uconst, 4)) This required the following changes: 1. Instead of using DIGlobalVariableAttr in the FusedLoc of GlobalOp, we use DIGlobalVariableExpressionAttr. This allows us to generate the DIExpression where we have the information. 2. Previously, only one DIGlobalVariableExpressionAttr could be linked to one global op. I recently removed this restriction in mlir. To make use of it, we add an ArrayAttr to the FusedLoc of a GlobalOp. This allows us to pass multiple DIGlobalVariableExpressionAttr. 3. I was depending on the name of the global for the name of the common block. The name gets a '_' appended. I could not find a utility function in flang to remove it so I have to brute force it.
# 9f83c4ed	22-Jan-2025	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Allocate descriptor in managed memory on rebox block argument (#123971) Another case where the descriptor must be allocated with the CUF runtime and not a simple alloca instruction.
# c26e1a22	22-Jan-2025	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Allocate descriptor in managed memory when memref is a block argument (#123829)
Revision tags: llvmorg-19.1.7
# 599c7399	06-Jan-2025	Matthias Springer <me@m-sp.org>	[mlir][GPU] Add NVVM-specific `cf.assert` lowering (#120431) This commit adds an NVIDIA-specific lowering of `cf.assert` to `__assertfail`. Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and `getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can be reused.
# c870632e	25-Dec-2024	Matthias Springer <me@m-sp.org>	[flang] Fix some memory leaks (#121050) This commit fixes some but not all memory leaks in Flang. There are still 91 tests that fail with ASAN. - Use `mlir::OwningOpRef` instead of `std::unique_ptr`. The latter does not free allocations of nested blocks. - Pass `ModuleOp` as value instead of reference. - Add few missing deallocations in test cases and other places.
# d36836de	23-Dec-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Create descriptor in managed memory when emboxing fir.box_addr value (#120980)
# 392651a7	22-Dec-2024	Kazu Hirata <kazu@google.com>	[flang] Migrate away from PointerUnion::{is,get} (NFC) (#120880) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.
# e650ac16	20-Dec-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda][NFC] Fix typo in CUFAllocDescriptor (#120797) Missing `r` in the function name.
# 81831ef3	20-Dec-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Correctly allocate descriptor in managed memory when reboxing (#120795) Reboxing might create a new in-memory descriptor. If the original was allocated in managed memory, allocate the new one in managed memory as well.
# 3e13acfb	20-Dec-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Make default.nonTbpDefinedIoTable compiler generated (#120686) `default.nonTbpDefinedIoTable` is a special global defined for IO that doesn't follow the mangling scheme and is therefore not handled correctly in the `CompilerGeneratedNames` pass. Update how it is generated with doGenerated so it can be handled without special handling. Also do not generate comdat in gpu module as the current code is not handling nested modules correctly.
# eb6c4197	20-Dec-2024	Matthias Springer <me@m-sp.org>	[mlir][CF] Split `cf-to-llvm` from `func-to-llvm` (#120580) Do not run `cf-to-llvm` as part of `func-to-llvm`. This commit fixes https://github.com/llvm/llvm-project/issues/70982. This commit changes how `func.func` ops are lowered to LLVM. Previously, the signature of the entire region (i.e., entry block and all other blocks in the `func.func` op) was converted as part of the `func.func` lowering pattern. Now, only the entry block is converted. The remaining block signatures are converted together with `cf.br` and `cf.cond_br` as part of `cf-to-llvm`. All unstructured control flow is now converted as part of a single pass (`cf-to-llvm`). `func-to-llvm` no longer deals with unstructured control flow. Also add more test cases for control flow dialect ops. Note: This PR is in preparation of #120431, which adds an additional GPU-specific lowering for `cf.assert`. This was a problem because `cf.assert` used to be converted as part of `func-to-llvm`. Note for LLVM integration: If you see failures, add `-convert-cf-to-llvm` to your pass pipeline.
# e93d2266	20-Dec-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Update CompilerGeneratedNames pass to work on gpu module (#120660) - Update `CompilerGeneratedNames` so it can perform renaming in gpu.module - Update Codegen so it looks in the correct module for the type descriptor.
# 4530273d	19-Dec-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Allocate descriptor in managed memory when emboxing device memory (#120485) When emboxing memory that comes from CUFMemAlloc, we need to allocate the descriptor in managed memory as it might be passed to a kernel.
# fc97d2e6	18-Dec-2024	Peter Klausler <pklausler@nvidia.com>	[flang] Add UNSIGNED (#113504) Implement the UNSIGNED extension type and operations under control of a language feature flag (-funsigned). This is nearly identical to the UNSIGNED feature that has been available in Sun Fortran for years, and now implemented in GNU Fortran for gfortran 15, and proposed for ISO standardization in J3/24-116.txt. See the new documentation for details; but in short, this is C's unsigned type, with guaranteed modular arithmetic for +, -, and *, and the related transformational intrinsic functions SUM & al.
Revision tags: llvmorg-19.1.6
# 5e1f87e8	17-Dec-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Correctly allocate memory for descriptor load (#120164) CodeGen will allocate memory for a new descriptor on descriptor loads. CUDA Fortran local descriptors are allocated in managed memory by the runtime. The newly allocated storage for a CUDA descriptor must also be allocated through the runtime.
# c91ba043	06-Dec-2024	Michael Kruse <llvm-project@meinersbur.de>	[Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188) Split some headers into headers for public and private declarations in preparation for #110217. Moving the runtime-private headers in runtime-private include directory will occur in #110298. * Do not use `sizeof(Descriptor)` in the compiler. The size of the descriptor is target-dependent while `sizeof(Descriptor)` is the size of the Descriptor for the host platform which might be too small when cross-compiling to a different platform. Another problem is that the emitted assembly ((cross-)compiling to the same target) is not identical between Flang's running on different systems. Moving the declaration of `class Descriptor` out of the included header will also reduce the amount of #included sources. * Do not use `sizeof(ArrayConstructorVector)` and `alignof(ArrayConstructorVector)` in the compiler. Same reason as with `Descriptor`. * Compute the descriptor's extra flags without instantiating a Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime source, but not the compiler source. * Move `InquiryKeywordHashDecode` into runtime-private header. The function is defined in the runtime sources and trying to call it in the compiler would lead to a link-error. * Move allocator-kind magic numbers into common header. They are the only declarations out of `allocator-registry.h` in the compiler as well. This does not make Flang cross-compile ready yet, the main goal is to avoid transitive header dependencies from Flang to clang-rt. There are more assumptions that host platform is the same as the target platform.
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4
# e9fc2faf	15-Nov-2024	Tom Eccles <tom.eccles@arm.com>	[flang][CodeGen] fix bug hoisting allocas using a shared constant arg (#116251) When hoisting the allocas with a constant integer size, the constant integer was moved to where the alloca is hoisted to unconditionally. By CodeGen there have been various iterations of mlir canonicalization and dead code elimination. This can cause lots of unrelated bits of code to share the same constant values. If for some reason the alloca couldn't be hoisted all of the way to the entry block of the function, moving the constant might result in it no-longer dominating some of the remaining uses. In theory, there should be dominance analysis to ensure the location of the constant does dominate all uses of it. But those constants are effectively free anyway (they aren't even separate instructions in LLVM IR), so it is less expensive just to leave the old one where it was and insert a new one we know for sure is immediately before the alloca.
# e5092c30	14-Nov-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang][cuda] Support malloc and free conversion in gpu module (#116112)
# 466b58ba	01-Nov-2024	Valentin Clement (バレンタインクレメン) <clementval@gmail.com>	[flang] Avoid generating duplicate symbol in comdat (#114472) In case where a fir.global might be duplicated in an inner module (gpu.module), the conversion pattern will be applied on the module and the gpu module version of the global and try to generate multiple comdat with the same symbol name. This is what we have in the implementation of CUDA Fortran. Just check for the presence of the `ComdatSelectorOp` before creating a new one.
# 0c9a0235	30-Oct-2024	Asher Mancinelli <ashermancinelli@gmail.com>	[flang][fir] always use memcpy for fir.box (#113949) @jeanPerier explained the importance of converting box loads and stores into `memcpy`s instead of aggregate loads and stores, and I'll do my best to explain it here. * [(godbolt link) Example comparing opt transformations on memcpys vs aggregate load/stores](https://godbolt.org/z/be7xM83cG) * LLVM can more effectively reason about memcpys compared to aggregate load/stores. * This came up when others were discussing array descriptors for assumed-rank arrays passed to `bind(c)` subroutines, with the implication that the array descriptors are known to have lower bounds of 1 and that they are not pointer/allocatable types. * [(godbolt link) Clang also uses memcpys so we should probably follow them, assuming the clang developers are generating what they know Opt will handle more effectively.](https://godbolt.org/z/YT4x7387W) * This currently may not help much without the `nocapture` attribute being propagated to function calls, but [it looks like someone may do this soon (discourse link)](https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401/23) or I can do this in a follow-up patch. Note on test `flang/test/Fir/embox-char.fir`: it looks like the original test was auto-generated. I wasn't too sure which parts were especially important to test, so I regenerated the test. If we want the updated version to look more like the old version, I'll make those changes.
Revision tags: llvmorg-19.1.3
# e6a4346b	18-Oct-2024	Scott Manley <rscottmanley@gmail.com>	[flang] add getElementType() to fir::SquenceType and fir::VectorType (#112770) getElementType() was missing from Sequence and Vector types. Did a replace of the obvious places getEleTy() was used for these two types and updated to use this name instead. Co-authored-by: Scott Manley <scmanley@nvidia.com>
# 2f0b4f43	17-Oct-2024	jeanPerier <jperier@nvidia.com>	[flang][extension] support concatenation with absent optional (#112678) Fix #112593 by adding support in lowering to concatenation with an absent optional _assumed length_ dummy argument because: 1. Most compilers seem to support it (most likely by accident). 2. This actually makes the compiler codegen simpler. Codegen was going out of its way to poke the LLVM optimizer bear by producing an undef argument for the length. I insist on the fact that no compiler supports this with _explicit length_ optional arguments and the executable will segfault, and I would discourage users from using that "feature" because runtime checks for bad optional dereference will kick in when used (for instance, "nagfor -C=present" will produce an executable that aborts with an error message. Flang does not have such a runtime check option so far). Hence, I am not updating the Extensions.md document because this is not something I think we should advertise.
Revision tags: llvmorg-19.1.2
# cd12ffb6	13-Oct-2024	Abid Qadeer <haqadeer@amd.com>	[mlir][debug] Allow multiple DIGlobalVariableExpression on globals. (#111981) Currently, we allow only one DIGlobalVariableExpressionAttr per global. It is especially evident in import where we pick the first from the list and ignore the rest. In contrast, LLVM allows multiple DIGlobalVariableExpression to be attached to the global. They are needed for correct working of things like DICommonBlock. This PR removes this restriction in mlir. Changes are mostly mechanical. One thing on which I went a bit back and forth was the representation inside GlobalOp. I would be happy to change if there are better ways to do this. --------- Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
# 390943f2	09-Oct-2024	Leandro Lupori <leandro.lupori@linaro.org>	[flang] Implement conversion of compatible derived types (#111165) With some restrictions, BIND(C) derived types can be converted to compatible BIND(C) derived types. Semantics already support this, but ConvertOp was missing the conversion of such types. Fixes https://github.com/llvm/llvm-project/issues/107783