Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
4b17a8b1 |
| 03-Jan-2025 |
Valentin Clement (バレンタイン クレメン) <clementval@gmail.com> |
[flang][cuda] Add operation to sync global descriptor (#121520)
Introduce cuf.sync_descriptor to be used to sync device global
descriptor after pointer association.
Also move CUFCommon so it can be used in FIRBuilder lib as well.
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
a76609dd |
| 21-Nov-2024 |
Valentin Clement (バレンタイン クレメン) <clementval@gmail.com> |
[flang][cuda] Avoid intrinsics simplification in device context (#117026)
|
Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
bd9fdce6 |
| 29-Apr-2024 |
Christian Sigg <chsigg@users.noreply.github.com> |
[flang] Use `isa/dyn_cast/cast/...` free functions. (#90432)
The corresponding member functions are deprecated.
|
#
fac349a1 |
| 28-Apr-2024 |
Christian Sigg <chsigg@users.noreply.github.com> |
Reapply "[mlir] Mark `isa/dyn_cast/cast/...` member functions depreca… (#90406)
…ted. (#89998)" (#90250)
This partially reverts commit 7aedd7dc754c74a49fe84ed2640e269c25414087.
This change removes calls to the deprecated member functions. It does
not mark the functions deprecated yet and does not disable the
deprecation warning in TypeSwitch. This seems to cause problems with
MSVC.
|
#
7aedd7dc |
| 26-Apr-2024 |
dyung <douglas.yung@sony.com> |
Revert "[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (#89998)" (#90250)
This reverts commit 950b7ce0b88318f9099e9a7c9817d224ebdc6337.
This change is causing build failures on a bot
https://lab.llvm.org/buildbot/#/builders/216/builds/38157
|
#
950b7ce0 |
| 26-Apr-2024 |
Christian Sigg <chsigg@users.noreply.github.com> |
[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (#89998)
See https://mlir.llvm.org/deprecation and
https://discourse.llvm.org/t/preferred-casting-style-going-forward.
|
#
81442f8d |
| 25-Apr-2024 |
Tom Eccles <tom.eccles@arm.com> |
[flang][NFC] Use tablegen to create SimplifyIntrinsics constructor (#89963)
This pass runs on ModuleOp, internally walking all func::CallOps so it
shouldn't need anything special to work on other top level operations.
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
a4798bb0 |
| 02-Apr-2024 |
jeanPerier <jperier@nvidia.com> |
[flang][NFC] use mlir::SymbolTable in lowering (#86673)
Whenever lowering is checking if a function or global already exists in
the mlir::Module, it was doing module->lookup.
On big programs (~5000 globals and functions), this causes significant
slowdowns because these lookups are linear. Use mlir::SymbolTable to
speed up these lookups. The SymbolTable has to be created from the
ModuleOp and maintained in sync. It is therefore placed in the
converter, and FirOpBuilders can take a pointer to it to speed up the
lookups.
This patch does not bring mlir::SymbolTable to FIR/HLFIR passes, but
some passes creating a lot of runtime calls could benefit from it too.
More analysis will be needed.
As an example of the speed-ups, this patch speeds up compilation of
Whizard compare_amplitude_UFO.F90 from 5 mins to 2 mins on my machine
(there is still room for speed-ups).
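For illustration only (this sketch is not taken from the commit): the difference between a linear module lookup and a name-indexed table that is kept in sync can be modelled in a few lines of Python. The `Module` and `SymbolTable` classes below are hypothetical stand-ins, not the MLIR classes, and the symbol names are made up.

```python
class Module:
    """Hypothetical stand-in for a module that stores named operations in order."""

    def __init__(self):
        self.ops = []  # list of (name, op) pairs

    def lookup(self, name):
        # Linear scan: the cost grows with the number of symbols in the module.
        for op_name, op in self.ops:
            if op_name == name:
                return op
        return None


class SymbolTable:
    """Hypothetical name index built from a module and kept in sync with it."""

    def __init__(self, module):
        self.module = module
        self.by_name = {name: op for name, op in module.ops}

    def lookup(self, name):
        # Hash lookup: expected constant time regardless of module size.
        return self.by_name.get(name)

    def insert(self, name, op):
        # Every insertion must update both the module and the index,
        # otherwise the table goes stale.
        self.module.ops.append((name, op))
        self.by_name[name] = op
```

The design cost is the one the message names: all insertions have to go through the table (or update it), which is why a single owner holding the table is convenient.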
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
2a95fe48 |
| 02-Mar-2024 |
David Green <david.green@arm.com> |
[Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result (#81619)
This makes an adjustment to the existing fir minloc/maxloc generation
code to handle functions with a dim=1 that produce a scalar result. This
should allow us to get the same benefits as the existing generated
minmax reductions.
This is a recommit of #76194 with an extra alteration to the end of
genRuntimeMinMaxlocBody to make sure we convert the output array to the
correct type (a `box<heap<i32>>`, not `box<heap<array<1xi32>>>`) to
prevent writing the wrong type of box into it. This still allocates the
data as an `array<1xi32>`, converting it into an i32 assuming that is
safe. An alternative would be to allocate the data as an i32 and change
more of the accesses to it throughout genRuntimeMinMaxlocBody.
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4 |
|
#
72428962 |
| 21-Feb-2024 |
David Green <david.green@arm.com> |
[Flang] Attempt to fix Nan handling in Minloc/Maxloc intrinsic simplification (#82313)
In certain cases, "extreme" values like NaN, Inf and 0xffffffff could lead
to generating different code via the inline-generated intrinsics vs the
versions in the runtimes (and other compilers like gfortran). There are
some examples I was using for testing in
https://godbolt.org/z/x4EfqEss5.
This changes the generation for the intrinsics to be more like the
runtimes, using a condition that is similar to:
    isFirst || (prev != prev && elem == elem) || elem < prev
The middle part is only used for floating point operations, and checks
if the values are NaN. This should then hopefully make the logic closer
to: return the first element with the lowest value, with NaNs ignored
unless there are only NaNs. The initial limit value for floats is also
changed from the largest float to Inf, to make sure it is handled
correctly.
The integer reductions are also changed to use a similar scheme to make
sure they work with masked values. This means that the preamble after
the loop can be removed.
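For illustration only (this sketch is not taken from the commit): the update condition quoted above can be tried out in plain Python, where `x != x` is true only for NaN. The function name `minloc`, the optional `mask` argument, and the choice to return 0 when no element is selected are assumptions of this sketch.

```python
import math


def minloc(values, mask=None):
    """Return the 1-based position of the first smallest selected value.

    NaNs are ignored unless every selected element is NaN.
    Returns 0 when no element is selected (an assumption of this sketch).
    """
    is_first = True
    prev = math.inf  # initial limit is +Inf, not the largest finite float
    loc = 0
    for i, elem in enumerate(values, start=1):
        if mask is not None and not mask[i - 1]:
            continue
        # isFirst || (prev != prev && elem == elem) || elem < prev
        if is_first or (prev != prev and elem == elem) or elem < prev:
            is_first = False
            prev = elem
            loc = i
    return loc
```

In this sketch the `is_first` term is what makes an array containing only Inf (or only NaN) report position 1, since `elem < prev` is never true in those cases.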
|
Revision tags: llvmorg-18.1.0-rc3 |
|
#
815a8465 |
| 13-Feb-2024 |
David Green <david.green@arm.com> |
[Flang] Move genMinMaxlocReductionLoop to Transforms/Utils.cpp (#81380)
This is one option for attempting to move genMinMaxlocReductionLoop to a
better location. It moves it into Transforms and makes HLFIRTransforms
depend upon FIRTransforms.
It passes a build locally, both with and without -DBUILD_SHARED_LIBS,
and does OK on the Windows CI.
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1 |
|
#
202917f8 |
| 25-Jan-2024 |
David Green <david.green@arm.com> |
[Flang] Move genMinMaxlocReductionLoop to a common location.
The shared library build doesn't like references to genMinMaxlocReductionLoop, in Optimizer/Transforms, from HLFIR/Optimizer/Transforms. For the moment I've moved the code to the header file where it can be shared, like other methods in Utils.h.
|
#
223d3dab |
| 25-Jan-2024 |
David Green <david.green@arm.com> |
[Flang] Minloc elemental intrinsic lowering (#74828)
Currently the lowering of a minloc intrinsic with a mask will look something
like:
    %e = hlfir.elemental %shape ({
      ...
    })
    %m = hlfir.minloc %array mask %e
    hlfir.assign %m to %result
    hlfir.destroy %m
The elemental will be expanded into a temporary+loop, the minloc into a
FortranAMinloc call (which hopefully gets simplified to a specialized call that
can be inlined at the call site), and the assign might get expanded to a
FortranAAssign. It would be better to generate the entire construct as a single
loop if we can: one that performs the minloc calculation with the mask
elemental computed inline.
This patch attempts to do that, adding a hlfir version of the expansion code
from SimplifyIntrinsics that turns a minloc+elemental into a single combined
loop nest. It attempts to reuse the methods in genMinlocReductionLoop for
constructing the loop with a modified loop body. The declaration for the
function is currently in Optimizer/Support/Utils.h, but there might be a better
place for it.
It is added as part of the OptimizedBufferizationPass, like the
similar count/any/all that have been added recently.
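For illustration only (this sketch is not taken from the commit): the two shapes described above, a mask materialised into a temporary followed by a separate reduction versus one loop with the mask computed inline, can be compared in Python. Both function names and the `predicate` parameter are hypothetical; the reduction is a simplified integer minloc that returns 0 when nothing is selected.

```python
def minloc_with_temporary(array, predicate):
    # Step 1: the elemental is expanded into a temporary mask array.
    mask = [predicate(x) for x in array]
    # Step 2: a separate reduction reads both the array and the temporary.
    loc, best = 0, None
    for i, (x, selected) in enumerate(zip(array, mask), start=1):
        if selected and (best is None or x < best):
            loc, best = i, x
    return loc


def minloc_fused(array, predicate):
    # Single loop: the mask element is computed inline, so no temporary exists.
    loc, best = 0, None
    for i, x in enumerate(array, start=1):
        if predicate(x) and (best is None or x < best):
            loc, best = i, x
    return loc
```

Both functions return the same position; the fused form only avoids the intermediate array and the second traversal.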
|
Revision tags: llvmorg-19-init |
|
#
4f59a388 |
| 04-Jan-2024 |
Pete Steinfeld <47540744+psteinfeld@users.noreply.github.com> |
Revert #76194 (#76987)
[Flang] Revert "Allow Intrinsic simpification with min/maxloc dim
and…scalar result (#76194)"
This reverts commit 9b7cf5bfb08b6e506216ef354dfd61adb15acbff.
See merge request #76194.
This change was causing several failures in our internal tests. I'm
reverting now and will work on creating a test that David Green can use
to reproduce the problem.
|
#
9b7cf5bf |
| 02-Jan-2024 |
David Green <david.green@arm.com> |
[Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result (#76194)
This makes an adjustment to the existing fir minloc/maxloc generation
code to handle functions with a dim=1 that produce a scalar result. This
should allow us to get the same benefits as the existing generated
minmax reductions.
This is a recommit of #75820 with the typename added to the generated
function.
|
#
0cf3af0c |
| 21-Dec-2023 |
Pete Steinfeld <47540744+psteinfeld@users.noreply.github.com> |
Revert "[Flang] Allow Intrinsic simpification with min/maxloc dim and… (#76184)
… scalar result. (#75820)"
This reverts commit 701f64790520790f75b1f948a752472d421ddaa3.
The commit breaks some uses of the 'maxloc' intrinsic.
See PR #75820
|
#
701f6479 |
| 20-Dec-2023 |
David Green <david.green@arm.com> |
[Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result. (#75820)
This makes an adjustment to the existing fir minloc/maxloc generation
code to handle functions with a dim=1 that produce a scalar result. This
should allow us to get the same benefits as the existing generated
minmax reductions.
|
#
9bb47f7f |
| 18-Dec-2023 |
David Green <david.green@arm.com> |
[Flang] Add Maxloc to fir simplify intrinsics pass (#75463)
This takes the code from D144103 and extends it to maxloc, to allow the
simplifyMinMaxlocReduction method to work with both min and max
intrinsics by switching condition and limit/initial value.
|
#
11efccea |
| 14-Dec-2023 |
Kazu Hirata <kazu@google.com> |
[flang] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
89b98c13 |
| 22-Aug-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang] Fixed simplification for FP maxval.
On x86, a simplified F128 maxval ends up calling fmaxl that does not work properly for F128 arguments. It is probably an LLVM issue, but we also should not use arith.maxf if NaN or -0.0 operands are possible. The change is to use cmpf and select. Unfortunately, these arith ops do not support FastMathFlags currently, so I will have to fix this sooner or later (depending on how this affects performance).
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D158200
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
f52c64b1 |
| 06-Jul-2023 |
David Truby <david@truby.dev> |
[flang] Add fastmath flags to localBuilder in IntrinsicCall
Currently the local builder used in IntrinsicCall doesn't have the fastmath flags passed to it. This results in the fastmath attribute not being added to certain runtime calls. This patch simply forwards the fastmath flags from the parent builder.
Differential Revision: https://reviews.llvm.org/D154611
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4 |
|
#
7a607e25 |
| 03-May-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[flang] Removed unnecessary llvm/CodeGen/SelectionDAGNodes.h include.
Required after D148767 for flang+debug+slibs build.
Reviewed By: chapuni, clementval
Differential Revision: https://reviews.llvm.org/D149764
|
Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4 |
|
#
b07ef9e7 |
| 09-Mar-2023 |
Renaud-K <rkauffmann@nvidia.com> |
Break circular dependency between FIR dialect and utilities
|
#
242bb0b6 |
| 28-Feb-2023 |
Sacha Ballantyne <Sacha.Ballantyne@arm.com> |
[flang] Fix a bug with simplified minloc that treated logicals with even values > 1 as 0
Previously the mask would be loaded as the appropriate integer type and cast to I1 to pass to fir.if; however, this truncates the integer and so would cast 6 to 0. By loading values as logicals and casting to I1 this problem is avoided.
Reviewed By: Leporacanthicus
Differential Revision: https://reviews.llvm.org/D144974
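For illustration only (this sketch is not taken from the commit): the arithmetic behind the bug can be reproduced in Python. Truncating an integer to one bit keeps only its lowest bit, so every even value becomes 0. Both function names are hypothetical, and treating any nonzero value as true is an assumption of this sketch about how a logical is converted.

```python
def truncate_to_i1(value):
    # Integer truncation to a 1-bit type keeps only the lowest bit.
    return value & 1


def logical_to_i1(value):
    # Assumed logical conversion: any nonzero representation counts as true.
    return 1 if value != 0 else 0
```

With these definitions, `truncate_to_i1(6)` yields 0 while `logical_to_i1(6)` yields 1, which matches the "cast 6 to 0" case described above.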
|
#
79dccded |
| 28-Feb-2023 |
Sacha Ballantyne <Sacha.Ballantyne@arm.com> |
[flang] Change COUNT intrinsic to support different kind logicals
Previously COUNT would cast the mask input to logical<4> before passing it to the runtime function. This has been changed to allow different kinds of logical.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D144867
|