Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
42be165d |
| 16-Nov-2024 |
Valentin Clement <clementval@gmail.com> |
Reland '[flang][cuda] Specialize entry point for scalar to desc data transfer'
|
#
70b9440c |
| 16-Nov-2024 |
Valentin Clement (バレンタイン クレメン) <clementval@gmail.com> |
Revert "[flang][cuda] Specialize entry point for scalar to desc data transfer" (#116458)
Reverts llvm/llvm-project#116457
|
#
43cb424a |
| 16-Nov-2024 |
Valentin Clement (バレンタイン クレメン) <clementval@gmail.com> |
[flang][cuda] Specialize entry point for scalar to desc data transfer (#116457)
The runtime Assign function is not meant to initialize an array from a
scalar. For that we need to use DoAssignFromSource. Update the data
transfer from scalar to descriptor to use a new entry point that uses
this function underneath.
|
#
07e053fb |
| 05-Nov-2024 |
Peter Klausler <pklausler@nvidia.com> |
[flang][runtime] Fix finalization case in assignment (#113611)
There were two bugs in derived type array assignment processing that
caused finalization to fail to occur for a test case. The first bug was
an off-by-one error in address overlap testing that caused a false
positive result for the test, whose left-hand side's allocatable's
descriptor was immediately adjacent in memory to the right-hand side's
array's data.
The second bug was that in such overlap cases (even when legitimate)
finalization would fail due to the LHS's descriptor having been copied
to a temporary for deferred deallocation and then nullified.
This patch corrects the overlap analysis for this test, and also
properly finalizes the LHS when overlap does exist. Some nearby dead
code was removed to avoid future confusion.
Fixes https://github.com/llvm/llvm-project/issues/113375.
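The off-by-one class of error described in this message can be illustrated with a small C++ sketch. It is not the runtime's actual overlap analysis: `RangesOverlap` is a hypothetical helper that treats each object as a half-open byte range, so that a range ending exactly where another begins is reported as disjoint.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not taken from the flang runtime): each object is
// modeled as the half-open byte range [base, base + bytes).
bool RangesOverlap(const void *a, std::size_t aBytes,
                   const void *b, std::size_t bBytes) {
  const auto aStart = reinterpret_cast<std::uintptr_t>(a);
  const auto bStart = reinterpret_cast<std::uintptr_t>(b);
  // Strict comparisons: a range that ends exactly where the other begins is
  // adjacent, not overlapping. Using <= here would give the false positive
  // for immediately adjacent objects.
  return aStart < bStart + bBytes && bStart < aStart + aBytes;
}
```

With this formulation, a descriptor stored immediately before or after an array's data is not considered to alias it.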
|
#
7792dbe2 |
| 01-Nov-2024 |
Valentin Clement (バレンタイン クレメン) <clementval@gmail.com> |
Reland '[flang][runtime] Allow different memmove function in assign' (#114587)
Reland #114301
|
#
c5a254cd |
| 01-Nov-2024 |
Valentin Clement (バレンタイン クレメン) <clementval@gmail.com> |
Revert "[flang][runtime][NFC] Allow different memmove function in assign" (#114581)
Reverts llvm/llvm-project#114301
|
#
b278fe32 |
| 01-Nov-2024 |
Valentin Clement (バレンタイン クレメン) <clementval@gmail.com> |
[flang][runtime][NFC] Allow different memmove function in assign (#114301)
- Add a parameter to the `Assign` function to be able to use a different
`memmove` function. This is preparatory work to be able to use the
`Assign` function between host and device data.
- Expose the `Assign` function so it can be used from different files.
- The new `memmoveFct` is not used in `BlankPadCharacterAssignment` yet
since it is not clear if there is a need. It will be updated in case it
is needed.
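A minimal sketch of the idea of passing a `memmove`-compatible function as a parameter, assuming nothing about the real `Assign` signature. The alias `MemmoveFct`, the routine `CopyElements`, and the helpers `DefaultMemmove` and `TracingMemmove` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Same shape as std::memmove; the alias name is an assumption.
using MemmoveFct = void *(*)(void *, const void *, std::size_t);

// Default behavior: plain host-to-host copy.
void *DefaultMemmove(void *to, const void *from, std::size_t bytes) {
  return std::memmove(to, from, bytes);
}

// Hypothetical stand-in for an assignment routine: copies `count` elements
// of `elementBytes` bytes each through the supplied function.
void CopyElements(void *to, const void *from, std::size_t count,
                  std::size_t elementBytes,
                  MemmoveFct memmoveFct = DefaultMemmove) {
  memmoveFct(to, from, count * elementBytes);
}

// Example replacement: counts calls, then forwards to std::memmove. A
// host/device build could forward to a device-aware copy instead.
static int tracedCalls = 0;
void *TracingMemmove(void *to, const void *from, std::size_t bytes) {
  ++tracedCalls;
  return std::memmove(to, from, bytes);
}
```

Callers that do not care about the copy mechanism omit the last argument; a caller moving data between memory spaces would pass its own function.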
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
050f785e |
| 11-Sep-2024 |
Philip Reames <preames@rivosinc.com> |
Revert "[flang][runtime] Fix odd "invalid descriptor" runtime crash (#107785)"
This reverts commit 15106c26662a573df31e8dfdd9350c313b8bfd84. Commit does not pass check-flang on x86 host.
|
#
15106c26 |
| 10-Sep-2024 |
Peter Klausler <pklausler@nvidia.com> |
[flang][runtime] Fix odd "invalid descriptor" runtime crash (#107785)
A defined assignment generic interface for a given LHS/RHS type & rank
combination may have a specific procedure with LHS dummy argument that
is neither allocatable nor pointer, or specific procedure(s) whose LHS
dummy arguments are allocatable or pointer. It is possible to have two
specific procedures if one's LHS dummy argument is allocatable and the
other's is pointer.
However, the runtime doesn't work with LHS dummy arguments that are
allocatable, and will crash with a mysterious "invalid descriptor" error
message.
Extend the list of special bindings to include
ScalarAllocatableAssignment and ScalarPointerAssignment, use them when
appropriate in the runtime type information tables, and handle them in
Assign() in the runtime support library.
|
#
797f0119 |
| 05-Sep-2024 |
Leandro Lupori <leandro.lupori@linaro.org> |
[flang][OpenMP] Make lastprivate work with reallocated variables (#106559)
Fixes https://github.com/llvm/llvm-project/issues/100951
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
9f44d5d9 |
| 18-Jun-2024 |
jeanPerier <jperier@nvidia.com> |
[flang] Simplify copy-in copy-out runtime API (#95822)
The runtime API for copy-in copy-out currently has an entry only
for the copy-out. This entry has a "skipInit" boolean that is never set
to false by lowering and it does not deal with the deallocation of the
temporary.
The generated code was a mix of inline code and runtime calls. This is not a big deal,
but it adds unneeded complexity to the compiler and the generated code.
With assumed-rank, it is also more cumbersome to establish a
temporary descriptor.
Instead, this patch:
- Adds a CopyInAssignment API that deals with establishing the temporary
descriptor and does the copy.
- Removes unused arg to CopyOutAssign, and pushes
destruction/deallocation responsibility inside it.
Note that these runtime APIs are still not responsible for deciding whether
copy-in and copy-out are needed. This is kept as a separate runtime call to
IsContiguous, which is easier to inline/replace by inline code with the
hope of removing the copy-in/out calls after user function inlining.
@vzakhari has already shown that always inlining all the copy part
increases Fortran compilation time due to loop optimization attempts for
loops that are known to have little optimization profitability (the
variable being copied from and to is not contiguous).
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
5f6e0f35 |
| 15-Feb-2024 |
jeanPerier <jperier@nvidia.com> |
[flang][runtime] Destroy nested allocatable components (#81117)
The runtime was previously only deallocating the direct allocatable
components, which caused leaks when there are allocatable components
nested in the direct components.
Update Destroy to recursively destroy components.
Also call Destroy from Assign to deallocate nested allocatable
components before doing the assignment as required by F2018 9.7.3.2
point 7.
This lack of deallocation was visible if the nested components had user
defined assignment "observing" the allocation state.
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1 |
|
#
887783e0 |
| 25-Jan-2024 |
Peter Klausler <35819229+klausler@users.noreply.github.com> |
[flang][runtime] Invert component/element loops in assignment (#78341)
The general implementation of intrinsic assignment of derived types in
the runtime support library has a doubly-nested loop: an outer loop that
traverses the components and inner loops that traverse the array
elements. It's done this way to amortize the per-component overhead.
However, this turns out to be wrong when the program cares about the
order in which defined assignment subroutines are called; the Fortran
standard allows less latitude here than we need to invert the ordering
in this way when any component is itself an array. So invert the two
loops: traverse the array elements, and for each element, traverse its
components.
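The difference between the two loop nestings can be sketched by recording the order in which (element, component) pairs are visited. This is an illustration only, not the runtime's code; `Visit`, `ComponentMajor`, and `ElementMajor` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Visit = std::pair<std::size_t, std::size_t>; // (element, component)

// Ordering before the change as described: components in the outer loop,
// array elements in the inner loop.
std::vector<Visit> ComponentMajor(std::size_t elements,
                                  std::size_t components) {
  std::vector<Visit> order;
  for (std::size_t c = 0; c < components; ++c)
    for (std::size_t e = 0; e < elements; ++e)
      order.emplace_back(e, c);
  return order;
}

// Ordering after the change: for each array element, traverse its
// components, so per-element work happens together.
std::vector<Visit> ElementMajor(std::size_t elements,
                                std::size_t components) {
  std::vector<Visit> order;
  for (std::size_t e = 0; e < elements; ++e)
    for (std::size_t c = 0; c < components; ++c)
      order.emplace_back(e, c);
  return order;
}
```

If each visit stands for a call to a defined assignment subroutine, the two functions produce the same set of calls in different orders, which is observable by a program that tracks call order.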
|
Revision tags: llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3 |
|
#
f92309a3 |
| 09-Oct-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][runtime] Workaround cuda-11.8 compilation issue. (#68459)
cuda-11.8 nvcc cannot handle brace initialization of the lambda
object. 12.1 works okay, but I would like to have an option
to use 11.8.
|
#
8b953fdd |
| 04-Oct-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][runtime] Added Assign runtime to CUDA build closure. (#68171)
|
Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2 |
|
#
b21c24c3 |
| 29-Jul-2023 |
Peter Klausler <pklausler@nvidia.com> |
[flang][runtime] Recognize and handle FINAL subroutines with contiguous dummy arrays when data are not so
When a FINAL subroutine is being invoked for a discontiguous array, which can happen for INTENT(OUT) dummy arguments and for some left-hand side variables in intrinsic assignment statements, it may be the case that the subroutine being called was defined with a dummy argument that requires contiguous data.
Extend the derived type descriptions used by the runtime to signify when a special procedure binding requires contiguity; set the flags accordingly; check them in the runtime support library, and, when necessary, use a temporary shallow copy of the finalized array data in the call to the final subroutine.
Differential Revision: https://reviews.llvm.org/D156760
|
Revision tags: llvmorg-17.0.0-rc1 |
|
#
c78b528f |
| 27-Jul-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][runtime] Handle conflicts for derived types with dynamic components.
When creating a temporary for conflicting LHS and RHS we have to deep copy the dynamic (allocatable, automatic) components from RHS to the temp. Otherwise, the conflict may still be present between LHS and temp.
gfortran/regression/alloc_comp_assign_1.f90 is an example where the current runtime code produces a wrong result: https://github.com/llvm/llvm-test-suite/blob/7b5b5dcbf9bdde729a14722eb67f9c3ab01647c7/Fortran/gfortran/regression/alloc_comp_assign_1.f90#L50
Reviewed By: klausler, tblah
Differential Revision: https://reviews.llvm.org/D156364
|
Revision tags: llvmorg-18-init |
|
#
3a4e9f7a |
| 26-Jun-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][hlfir] Do not dereference unallocated entities in structure constructor.
Component-by-component assignment must be able to handle unallocated allocatable values in a structure constructor. F2018 7.5.10 p. 7 states that the component must have unallocated status as a result of such construction. The structure constructor temporary is initialized such that all the allocatable components are unallocated, so we just need to make sure not to do the component assignment if the RHS is unallocated.
Depends on D152482 (the same LIT test is affected)
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D152493
|
Revision tags: llvmorg-16.0.6 |
|
#
6e4984a9 |
| 06-Jun-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][hlfir] Enable assignments with allocatable components.
The TODO was left there to verify that Assign() runtime handles overlaps of allocatable components. It did not, and this change-set fixes it. Note that the same Assign() issue can be reproduced without HLFIR. In the following example the LHS would be reallocated before value of RHS (essentially, the same memory) is read:
```
program main
  type t1
    integer, allocatable :: a(:)
  end type t1
  type(t1) :: x, y
  allocate(x%a(10))
  do i =1,10
    x%a(i) = 2*i
  end do
  x = x
  print *, x%a
  deallocate(x%a)
end program main
```
The test's output would be incorrect (though, this depends on the memory reuse by malloc):
0 0 0 0 10 12 14 16 18 20
It is very hard to add a Flang unittest exploiting derived types.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D152306
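The hazard described in this message, reallocating the LHS before the RHS has been read when both are the same storage, can be modeled in a few lines of C++. This is a sketch under simplifying assumptions, not the runtime's implementation: `T1` and `AssignWithTemporary` are hypothetical, and the snapshot is taken unconditionally, whereas a real implementation would presumably do so only when overlap is detected.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a derived type with one allocatable component.
struct T1 {
  int *a = nullptr;
  std::size_t n = 0;
};

// Read the RHS into a temporary before touching the LHS, so the result is
// correct even when both sides are the same object (as in `x = x`).
void AssignWithTemporary(T1 &lhs, const T1 &rhs) {
  std::vector<int> snapshot(rhs.a, rhs.a + rhs.n); // read the RHS first
  delete[] lhs.a;                                  // then deallocate the LHS
  lhs.n = snapshot.size();
  lhs.a = lhs.n ? new int[lhs.n] : nullptr;        // reallocate to RHS shape
  for (std::size_t i = 0; i < lhs.n; ++i)
    lhs.a[i] = snapshot[i];
}
```

Without the snapshot, deallocating `lhs.a` first would leave `rhs.a` dangling in the self-assignment case, which corresponds to the incorrect output described in the message.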
|
Revision tags: llvmorg-16.0.5 |
|
#
da60b9e7 |
| 23-May-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang] Fixed managing copy-in/copy-out temps.
There are several observations regarding the copy-in/copy-out:
* Actual argument associated with INTENT(OUT) dummy argument that requires finalization (7.5.6.3 p. 7) may be read by the finalization function, so a copy-in is required.
* A temporary created for the copy-in/copy-out must be destroyed without finalization after the call (or after the corresponding copy-out), otherwise, memory leaks may occur.
* The copy-out assignment must not perform finalization for the LHS.
* The copy-out assignment from the temporary to the actual argument may or may not need to initialize the LHS.
This change-set introduces new runtime methods: CopyOutAssign and DestroyWithoutFinalization. They are called by the compiler generated code to match the behavior described above.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D151135
|
Revision tags: llvmorg-16.0.4 |
|
#
bf536456 |
| 15-May-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][runtime] Fixed memory leak in Assign().
The temporary descriptor must be either Pointer or Allocatable, otherwise its memory will not be freed.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D150534
|
#
7c7ffa7b |
| 15-May-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][runtime] Fixed dimension offset computation for MayAlias.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D150533
|
#
1aff61ec |
| 02-May-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][runtime] Initialize LHS temporary in AssignTemporary.
If LHS is of derived type that needs initialization, then it must be initialized before doing the assignment. Otherwise, the assignment might behave incorrectly based on uninitialized components that are descriptors themselves.
Differential Revision: https://reviews.llvm.org/D149681
|
Revision tags: llvmorg-16.0.3 |
|
#
1ac31a0b |
| 25-Apr-2023 |
Jean Perier <jperier@nvidia.com> |
[flang][runtime] Fix padding in CHARACTER(4) assignments.
One piece of pointer arithmetic was adding the number of bytes instead of the number of characters. This caused failures in CHARACTER(KIND>1) that required padding. This was caught using HLFIR that currently uses the runtime for array assignment where the current lowering does everything inline.
Reviewed By: vzakhari, klausler
Differential Revision: https://reviews.llvm.org/D149062
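The bytes-versus-characters distinction in this message can be shown with a small C++ sketch of blank-padded assignment for a 4-byte character kind. It is an illustration only: `BlankPadAssign` is a hypothetical function, not the runtime's `BlankPadCharacterAssignment`, and `char32_t` stands in for CHARACTER(KIND=4).

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch; all lengths are counts of characters, not bytes.
void BlankPadAssign(char32_t *to, std::size_t toChars,
                    const char32_t *from, std::size_t fromChars) {
  const std::size_t copied = std::min(toChars, fromChars);
  std::copy(from, from + copied, to);
  // `to + copied` advances by `copied` characters because `to` is a typed
  // pointer. Adding a byte count (copied * sizeof(char32_t)) to a typed
  // pointer would land four times too far, which is the kind of error
  // that only shows up for character kinds wider than one byte.
  std::fill(to + copied, to + toChars, U' ');
}
```

For single-byte characters the two offsets coincide, which is consistent with the bug only appearing for CHARACTER(KIND>1).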
|
Revision tags: llvmorg-16.0.2 |
|
#
3acdd596 |
| 07-Apr-2023 |
Valentin Clement <clementval@gmail.com> |
[flang] Handle correctly user defined assignment for allocatable component
In the Fortran 2018 standard, section 10.2.1.3 (13), it is stated that each noncoarray allocatable component must follow this sequence of operations:
1) If the component of the variable is allocated, it is deallocated.
2) If the component of the value of expr is allocated, the corresponding component of the variable is allocated with the same dynamic type and type parameters as the component of the value of expr. If it is an array, it is allocated with the same bounds. The value of the component of the value of expr is then assigned to the corresponding component of the variable using defined assignment if the declared type of the component has a type-bound defined assignment consistent with the component, and intrinsic assignment for the dynamic type of that component otherwise.
This patch updates the code to make use of the user defined assignment for allocatable components and makes sure the component is allocated correctly.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D147797
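The two-step sequence quoted from the standard can be modeled in C++17 with an optional vector, where an empty optional stands for an unallocated component. This is a hedged sketch, not the runtime's logic: `AllocatableInts` and `AssignAllocatableComponent` are hypothetical, bounds are reduced to a size, and the element copy marks the place where a type-bound defined assignment would be invoked if one applied.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model: std::nullopt means "unallocated".
using AllocatableInts = std::optional<std::vector<int>>;

void AssignAllocatableComponent(AllocatableInts &variable,
                                const AllocatableInts &expr) {
  const AllocatableInts value = expr; // copy first, in case of aliasing
  variable.reset();                   // 1) deallocate if allocated
  if (value) {                        // 2) only if the RHS is allocated:
    variable.emplace(value->size());  //    allocate with the same size
    for (std::size_t i = 0; i < value->size(); ++i)
      (*variable)[i] = (*value)[i];   //    defined assignment would go here
  }
  // If the RHS is unallocated, the LHS component stays unallocated.
}
```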
|