Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3 |
|
#
a59f7124 |
| 23-Oct-2024 |
jeanPerier <jperier@nvidia.com> |
[flang][hlfir] do not consider local temps as conflicting in assignment (#113330)
Last patch required to avoid creating a temporary for the LHS when
dealing with `x([a,b]) = y`.
The code dealing with "ordered assignments" (where, forall, user-defined
and vector subscripted assignments) saves the evaluated RHS/LHS and
masks if they have write effects, because these write effects should not
be evaluated again when they affect entities that may be written to in
other contexts after the evaluation and before the re-evaluation.
But when the write targets storage allocated in the region for the
expression being evaluated, re-evaluating the write is not a problem:
it has no effect outside of the expression evaluation that owns the
allocation.
In the case of `x([a,b]) = y`, the temporary is created for the vector
subscript. Raising the HLFIR abstraction for simple array constructors
may be a good idea, but local temps are created in other contexts, so
this fix is more generic.
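A minimal sketch of the rule described above, written as a toy Python model rather than the actual pass code. The `Write` record, its fields, and `must_save_before_reevaluation` are hypothetical names introduced only for illustration; the real analysis works on MLIR memory effects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    """Hypothetical record of one write effect found in an evaluated region."""
    target: str                 # name of the storage being written
    allocated_in_region: bool   # True if the storage is a temp local to the region


def must_save_before_reevaluation(writes):
    """A region needs saving only if it writes to storage visible outside it."""
    return any(not w.allocated_in_region for w in writes)


# Toy model of `x([a,b]) = y`: the only write fills the local temp for [a, b].
vector_subscript_writes = [Write("tmp_ab", allocated_in_region=True)]

# A region that also writes to an entity declared elsewhere.
escaping_writes = [Write("tmp_ab", True), Write("mask", False)]

print(must_save_before_reevaluation(vector_subscript_writes))  # False
print(must_save_before_reevaluation(escaping_writes))          # True
```

Under this model, the vector subscript temporary alone no longer forces a saved copy of the LHS.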
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
2c1ae801 |
| 19-Jun-2024 |
donald chen <chenxunyu1993@gmail.com> |
[mlir][side effect] refactor(*): Include more precise side effects (#94213)
This patch adds more precise side effects to the current ops with memory
effects, allowing us to determine which OpOperand/OpResult/BlockArgument
the operation reads or writes, rather than just recording the reading and
writing of values. This allows for convenient use of precise side effects
to achieve analysis and optimization.
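The difference between recording only the affected value and recording which operand carries the effect can be illustrated with a toy Python model. This is not the MLIR API: `Effect`, `operand_index`, and `operands_written` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Effect:
    """Hypothetical memory effect record for one operation."""
    kind: str                            # "read" or "write"
    value: str                           # the value affected (coarse information)
    operand_index: Optional[int] = None  # which operand carries it (precise information)


def operands_written(effects):
    """Return the operand positions the operation writes through, if known."""
    return sorted(
        e.operand_index
        for e in effects
        if e.kind == "write" and e.operand_index is not None
    )


# Toy "copy src to dst" operation: reads through operand 0, writes through operand 1.
copy_effects = [Effect("read", "%src", 0), Effect("write", "%dst", 1)]

print(operands_written(copy_effects))  # [1]
```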
Related discussions:
https://discourse.llvm.org/t/rfc-add-operandindex-to-sideeffect-instance/79243
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2 |
|
#
7095a86f |
| 04-Aug-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][hlfir] Fixed where/elsewhere mask saving in case of conflicts.
The assignments inside where/elsewhere may affect variables participating in the mask expression, but execution of the assignments must not affect the established control mask(s) (F'18 10.2.3.2 p. 13).
The following example must print all 42's:
```
program test
  integer c(3)
  logical :: mask(3) = .true.
  where (mask)
    c = f()
  end where
  print *, c
contains
  integer function f()
    mask = .false.
    f = 42
  end function f
end program test
```
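The effect of saving the mask can be mimicked with a small Python simulation. This is only a toy model of the semantics, not of the generated code: `run_where` and its `snapshot_mask` flag are hypothetical, and the right-hand side is re-evaluated per element purely to expose the hazard.

```python
def run_where(snapshot_mask):
    """Simulate `where (mask) c = f()` with or without a saved control mask."""
    mask = [True, True, True]
    c = [0, 0, 0]

    def f():
        # Side effect mirroring the Fortran example: clear the mask.
        for i in range(len(mask)):
            mask[i] = False
        return 42

    # The established control mask is either a saved copy or the live variable.
    control = list(mask) if snapshot_mask else mask
    for i in range(len(c)):
        if control[i]:
            c[i] = f()
    return c


saved = run_where(snapshot_mask=True)
live = run_where(snapshot_mask=False)
print(saved)  # [42, 42, 42]
print(live)   # [42, 0, 0]
```

Only the variant that saves the mask before the assignments matches the required output.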
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D156959
|
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
e52a6d77 |
| 05-Jul-2023 |
Jean Perier <jperier@nvidia.com> |
[flang][hlfir] avoid useless LHS temporaries inside WHERE
The need to save LHS addresses on a stack before doing an assignment is very limited: it is only really needed for forall and vector subscripted LHS where the LHS cannot be computed as a descriptor.
The previous WHERE codegen created address stacks for LHS element addresses when the LHS evaluation conflicted with the assignment (that is, when it may depend on the LHS value). This is not needed, since the computed array designator for the LHS is already "saved" before the assignment from an SSA point of view.
This patch prevents the LHS temporary stack from being created outside of forall and vector subscripted assignments.
Differential Revision: https://reviews.llvm.org/D154418
|
#
6c14e849 |
| 27-Jun-2023 |
Jean Perier <jperier@nvidia.com> |
[flang][hlfir] Add codegen for vector subscripted LHS
This patch adds support for vector subscripted assignment left-hand side. It does not yet add support for the cases where the LHS must be saved because its evaluation could be impacted by the assignment.
The implementation adds an hlfir::ElementalOpInterface to share the elemental inlining utility and some other tools between hlfir::ElementalOp and hlfir::ElementalAddrOp.
It adds generateYieldedLHS() to allow retrieving the LHS value in lowering, whether or not it is vector subscripted. If it is vector subscripted, this utility creates a loop nest iterating over the elements and returns the address of an element.
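The loop nest over elements can be pictured with a toy Python model of a one-dimensional vector subscripted assignment. The helper `assign_vector_subscripted` is hypothetical, uses 0-based indices unlike Fortran, and says nothing about how generateYieldedLHS() is actually implemented.

```python
def assign_vector_subscripted(x, subscripts, y):
    """Model `x(subscripts) = y`: visit each LHS element address in turn."""
    for k, idx in enumerate(subscripts):
        # `idx` plays the role of the element address yielded for iteration k.
        x[idx] = y[k]
    return x


x = [0, 0, 0, 0, 0]
assign_vector_subscripted(x, subscripts=[3, 1], y=[10, 20])
print(x)  # [0, 20, 0, 10, 0]
```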
Differential Revision: https://reviews.llvm.org/D153759
|
#
92311347 |
| 26-Jun-2023 |
Jean Perier <jperier@nvidia.com> |
[flang][hlfir] user defined assignment codegen
Add codegen support for hlfir.region_assign with user defined assignment.
It is currently a bit pessimistic: outside of forall, it does not use the PURE aspect, if any, of the assignment routine to rule out that the routine can write to something other than the LHS that could overlap with the RHS. However, the current lowering already adds parentheses around the RHS, so this should not cause performance regressions.
Differential Revision: https://reviews.llvm.org/D153516
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5 |
|
#
222a8a1b |
| 22-May-2023 |
Jean Perier <jperier@nvidia.com> |
[flang][hlfir] Enable WHERE scheduling in LowerHLFIROrderedAssignments
Nothing special is needed other than adding the logging code for where masks and plugging in the pattern. This patch mainly adds tests.
Note that some of the justifications for creating temps reveal missing side effect interfaces on operations (like hlfir.transpose) or on some transparent LLVM intrinsic calls (llvm.stacksave/restore). I think we should, as much as possible, try to improve this in the ops or the generated code rather than special-casing it here.
Differential Revision: https://reviews.llvm.org/D150581
|
#
4f30a63c |
| 17-May-2023 |
Jean Perier <jperier@nvidia.com> |
[flang][hlfir] Implement the scheduling part of hlfir.forall codegen
The lowering of hlfir.forall to loops (and later hlfir.where) requires doing a data dependency analysis to avoid creating temporary storage for every control/mask/rhs/lhs expression.
The added code implements a data dependency analysis for the hlfir ordered assignment trees (it is not specific to Forall since these nodes include Where, user defined assignments, and assignments to vector subscripted entities, but the added code is only plugged in and tested with hlfir.forall in this patch).
This data dependency analysis returns a "schedule", which is a list of runs containing actions. Each run will result in a single loop nest evaluating all its actions "at the same time" inside the loop body. An action either evaluates an assignment, or saves some expression evaluation (the value yielded inside the ordered assignment hlfir operations) in temporary storage before doing the assignment that requires this expression value but may "conflict" with it.
A "conflict" is a read in an expression E to a variable that is, or may be (analysis is conservative), written by an assignment that depends on E.
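The notions of conflict and schedule can be sketched as a toy Python model. It is an illustration under simplifying assumptions, not the pass itself: variables are plain names, aliasing is name equality by default, and `conflicts`, `build_schedule`, and the action labels are hypothetical.

```python
def conflicts(expr_reads, assign_writes, may_alias=lambda a, b: a == b):
    """True if the expression reads something the assignment may write."""
    return any(may_alias(r, w) for r in expr_reads for w in assign_writes)


def build_schedule(rhs_reads, lhs_writes):
    """Return a list of runs; each run is a list of actions sharing one loop nest."""
    if conflicts(rhs_reads, lhs_writes):
        # Conservative answer: save the RHS for all iterations, then assign.
        return [["save_rhs"], ["assign"]]
    return [["assign"]]


no_conflict = build_schedule(rhs_reads={"y"}, lhs_writes={"x"})
with_conflict = build_schedule(rhs_reads={"x"}, lhs_writes={"x"})
print(no_conflict)    # [['assign']]
print(with_conflict)  # [['save_rhs'], ['assign']]
```

A coarser `may_alias` callback makes the model more conservative, in the same spirit as an alias analysis that answers "may alias" when it cannot prove otherwise.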
The analysis is based on MLIR SideEffectInterface and fir AliasAnalysis which makes it generic.
For now, the codegen that will apply the schedule and rewrite the hlfir.forall into a set of loops is not implemented, but the scheduling is tested on its own (from Fortran, because it allows testing many cases in a very readable fashion).
The current scheduling has limitations: for instance, "forall(i=1, 10) x(i)=2*x(i)" does not require saving the RHS values for all "i" before doing the assignments, since the RHS does not depend on values computed during previous iterations. Any user call will also trigger a conservative assumption that there is a conflict. Finally, a lot of operations are missing memory effect interfaces (especially in HLFIR). This patch adds a few so that it can be tested, but more will be added in later patches.
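The first limitation can be checked with a toy Python comparison of the two evaluation strategies. Both helpers are hypothetical and use 0-based indices; the shifted right-hand side is an extra example, not taken from the patch, added to show a case where saving is required.

```python
def forall_with_saved_rhs(x, rhs):
    """Evaluate the RHS for every index first, then perform all assignments."""
    saved = [rhs(x, i) for i in range(len(x))]
    out = list(x)
    for i, value in enumerate(saved):
        out[i] = value
    return out


def forall_in_place(x, rhs):
    """Evaluate and assign in a single loop, without a temporary."""
    out = list(x)
    for i in range(len(out)):
        out[i] = rhs(out, i)
    return out


def same_element(x, i):
    return 2 * x[i]                      # like x(i) = 2*x(i)


def shifted(x, i):
    return x[i - 1] if i > 0 else x[0]   # reads an element written earlier


data = [1, 2, 3]
print(forall_with_saved_rhs(data, same_element))  # [2, 4, 6]
print(forall_in_place(data, same_element))        # [2, 4, 6]
print(forall_with_saved_rhs(data, shifted))       # [1, 1, 2]
print(forall_in_place(data, shifted))             # [1, 1, 1]
```

When the RHS only reads the element being assigned, both strategies agree, so the temporary is unnecessary. The shifted case shows why a conservative analysis keeps it.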
Differential Revision: https://reviews.llvm.org/D150455
|