#
376713ff |
| 14-Nov-2024 |
Peter Klausler <pklausler@nvidia.com> |
[flang] Accept CLASS(*) array in EOSHIFT (#116114)
The intrinsic processing code wasn't allowing the ARRAY= argument to the
EOSHIFT intrinsic function to be CLASS(*). That case seems to conform to
the standard, although only one compiler could actually handle it, so
allow for it.
Fixes https://github.com/llvm/llvm-project/issues/115923.
|
#
fc51c7f0 |
| 19-Sep-2024 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][runtime] Disable LDBL_MANT_DIG == 113 for the offload builds. (#109339)
When compiling on aarch64, some `LDBL_MANT_DIG == 113` entries
end up trying to use `complex<long double>`, for which `libcudacxx`
lacks certain specializations. This change-set
includes a clean-up for `LDBL_MANT_DIG == 113` usage, which is replaced
with `HAS_LDBL128` that is set in `float128.h`.
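A minimal sketch of the kind of guard described here, not the actual contents of `float128.h`. The names `SKETCH_HAS_LDBL128`, `SKETCH_OFFLOAD_BUILD`, and `kUseLongDouble128` are hypothetical and exist only for illustration.

```cpp
#include <cfloat>

// SKETCH_OFFLOAD_BUILD is an assumed name for "this is an offload build";
// SKETCH_HAS_LDBL128 is a hypothetical stand-in for the real HAS_LDBL128.
#if LDBL_MANT_DIG == 113 && !defined(SKETCH_OFFLOAD_BUILD)
#define SKETCH_HAS_LDBL128 1
#else
#define SKETCH_HAS_LDBL128 0
#endif

// Code that needs a 128-bit long double tests the single macro instead of
// repeating the LDBL_MANT_DIG comparison at every use site.
constexpr bool kUseLongDouble128 = SKETCH_HAS_LDBL128 != 0;
```

Centralizing the test in one macro means an offload build can switch the feature off in a single place.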
|
#
104f3c18 |
| 19-Sep-2024 |
Slava Zakharin <szakharin@nvidia.com> |
Reland "[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build. (#109078)" (#109207)
`std::complex` operators do not work for the CUDA device compilation
of F18 runtime. This change makes use of `cuda::std::complex` from
`libcudacxx`.
`cuda::std::complex` does not have specializations for `long double`,
so the change is accompanied by a clean-up of `long double` usage.
The additional change on top of #109078 is to use `cuda::std::complex`
only for the device compilation, otherwise the host compilation
fails because `libcudacxx` may not support `long double` specialization
at all (depending on the compiler).
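One way to select the complex type per compilation pass is sketched below. The macro `SKETCH_DEVICE_COMPILATION`, the namespace `sketch`, and the helper `MulByI` are hypothetical names and are not taken from the runtime sources.

```cpp
#include <complex>
#include <type_traits>

// SKETCH_DEVICE_COMPILATION is a hypothetical macro standing in for
// "this translation unit is being compiled for the CUDA device".
#if defined(SKETCH_DEVICE_COMPILATION)
#include <cuda/std/complex>
namespace sketch {
template <typename T> using complex = cuda::std::complex<T>;
}
#else
namespace sketch {
template <typename T> using complex = std::complex<T>;
}
#endif

// Shared code names sketch::complex<T> and gets the right type in each pass.
inline sketch::complex<double> MulByI(sketch::complex<double> z) {
  return sketch::complex<double>{-z.imag(), z.real()};
}
```

With the alias confined to the device pass, the host pass keeps using `std::complex`, including its `long double` specialization.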
|
#
36192fdf |
| 18-Sep-2024 |
Slava Zakharin <szakharin@nvidia.com> |
Revert "[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build." (#109173)
Reverts llvm/llvm-project#109078
|
#
be187a68 |
| 18-Sep-2024 |
Slava Zakharin <szakharin@nvidia.com> |
[flang][runtime] Use cuda::std::complex in F18 runtime CUDA build. (#109078)
`std::complex` operators do not work for the CUDA device compilation
of F18 runtime. This change makes use of `cuda::std::complex` from
`libcudacxx`.
`cuda::std::complex` does not have specializations for `long double`,
so the change is accompanied by a clean-up of `long double` usage.
|
#
b009f468 |
| 26-Jul-2024 |
serge-sans-paille <serge.guelton@telecom-bretagne.eu> |
[Flang][Runtime] Explicitly convert shift value to SubscriptValue (#99822)
Shift values are within the range of SubscriptValue, but the API forces
them to 64 bits. This assumption doesn't hold for 32-bit machines, so add
an explicit cast.
|
#
8fc045e2 |
| 26-Dec-2023 |
Peter Klausler <35819229+klausler@users.noreply.github.com> |
[flang][runtime] Accept 128-bit integer SHIFT values in CSHIFT/EOSHIFT (#75246)
It would surprise me if this case ever arose outside a couple of tests
in llvm-test-suite/Fortran/gfortran/regression (namely
cshift_large_1.f90 and eoshift_large_1.f90), but now at least those
tests will pass.
|
#
04b18530 |
| 29-Nov-2023 |
Pete Steinfeld <47540744+psteinfeld@users.noreply.github.com> |
[flang] Cleanup of NYI messages (#73740)
This update makes the user-visible messages relating to features that
are not yet implemented more consistent. I also cleaned up some of
the code.
For NYI messages that refer to intrinsics, I made sure the message
begins with "not yet implemented: intrinsic:" to make them easier to
recognize.
I created some utility functions for NYI reporting that I put into
.../include/Optimizer/Support/Utils.h. These mainly convert MLIR types
to their Fortran equivalents.
I converted the NYI code to use the newly created utility functions.
|
#
478e0b58 |
| 17-Jul-2023 |
Peter Steinfeld <psteinfeld@nvidia.com> |
[flang] Quadmath 128 bit floating point intrinsics
This update allows constant folding for many 128 bit floating point intrinsics through the library quadmath, which is only available on some platforms.
Differential Revision: https://reviews.llvm.org/D156435
|
#
3212051c |
| 22-May-2023 |
Slava Zakharin <szakharin@nvidia.com> |
[RFC][flang] Experimental device build of Flang runtime.
These are initial changes to experiment with building the Fortran runtime as a CUDA or OpenMP target offload library.
The initial patch defines a set of macros that have to be used consistently in Flang runtime source code so that it can be built for different offload devices using different programming models (CUDA, HIP, OpenMP target offload). Currently supported modes are:
* CUDA: Flang runtime may be built as a fatlib for the host and a set of CUDA architectures specified during the build. The packaging of the device code is done by the CUDA toolchain and may differ from toolchain to toolchain.
* OpenMP offload:
  - host_device mode: Flang runtime may be built as a fatlib for the host and a set of OpenMP offload architectures. The packaging of the device code is done by the OpenMP offload compiler and may differ from compiler to compiler.
OpenMP offload 'nohost' mode is a TODO to match the build setup of libomptarget/DeviceRTL. Flang runtime will be built as LLVM Bitcode library using Clang/LLVM toolchain. The host part of the library will be "empty", so there will be two distributable objects: the host Flang runtime and dummy host library with device Flang runtime pieces packaged using clang-offload-packager and clang.
In all supported modes, enabling parts of Flang runtime for the device compilation can be done iteratively to make the patches observable. Note that at any point in time the resulting library may have unresolved references to not yet enabled parts of Flang runtime.
Example cmake/make commands for building with Clang for NVPTX target:
  cmake \
    -DFLANG_EXPERIMENTAL_CUDA_RUNTIME=ON \
    -DCMAKE_CUDA_ARCHITECTURES=80 \
    -DCMAKE_C_COMPILER=/clang_nvptx/bin/clang \
    -DCMAKE_CXX_COMPILER=/clang_nvptx/bin/clang++ \
    -DCMAKE_CUDA_COMPILER=/clang_nvptx/bin/clang \
    /llvm-project/flang/runtime/
  make -j FortranRuntime
Example cmake/make commands for building with Clang OpenMP offload:
  cmake \
    -DFLANG_EXPERIMENTAL_OMP_OFFLOAD_BUILD="host_device" \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DFLANG_OMP_DEVICE_ARCHITECTURES="sm_80" \
    ../flang/runtime/
  make -j FortranRuntime
Differential Revision: https://reviews.llvm.org/D151173
|
#
bef2bb34 |
| 19-Dec-2022 |
Tarun Prabhu <tarun@lanl.gov> |
[flang] Lowering and runtime support for F08 transformational intrinsics: BESSEL_JN and BESSEL_YN
The runtime implementation uses the recurrence relations
`J(n-1, x) = (2.0 / x) * n * J(n, x) - J(n+1, x)`
`Y(n+1, x) = (2.0 / x) * n * Y(n, x) - Y(n-1, x)`
(see https://dlmf.nist.gov/10.74.iv and https://dlmf.nist.gov/10.6.E1).
Although the standard requires that `N1` and `N2` in `BESSEL_JN(N1, N2, x)` and `BESSEL_YN(N1, N2, x)` be non-negative, this is not checked in the runtime functions. This is in keeping with some other compilers which also return some results when `N1` and/or `N2` are negative.
The special case for `x == 0` is handled in different runtime functions for each of `BESSEL_JN` and `BESSEL_YN`. The lowering code checks for this case and inserts the checks and the appropriate runtime calls in FIR.
The existing tests for the two intrinsics were modified to keep the style consistent with the additional lowering tests that were added.
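The two recurrence relations quoted in this message can be sketched as follows. This is an illustration only: the function names are hypothetical, and passing the two seed values in from the caller is an assumption rather than a description of the runtime's interface.

```cpp
#include <cassert>
#include <vector>

// Upward recurrence: Y(n+1, x) = (2.0 / x) * n * Y(n, x) - Y(n-1, x).
// yN1 and yN1Plus1 are caller-supplied seeds Y(n1, x) and Y(n1+1, x).
// Returns Y(n1, x) through Y(n2, x). Assumes x != 0 and n2 >= n1.
inline std::vector<double> BesselYnSketch(
    int n1, int n2, double x, double yN1, double yN1Plus1) {
  assert(x != 0.0 && n2 >= n1);
  std::vector<double> result(n2 - n1 + 1);
  result[0] = yN1;
  if (n2 > n1) {
    result[1] = yN1Plus1;
  }
  for (int n = n1 + 1; n < n2; ++n) {
    // result[n - n1] holds Y(n, x); result[n - n1 - 1] holds Y(n-1, x).
    result[n - n1 + 1] = (2.0 / x) * n * result[n - n1] - result[n - n1 - 1];
  }
  return result;
}

// Downward recurrence: J(n-1, x) = (2.0 / x) * n * J(n, x) - J(n+1, x).
// jN2 and jN2Minus1 are caller-supplied seeds J(n2, x) and J(n2-1, x).
inline std::vector<double> BesselJnSketch(
    int n1, int n2, double x, double jN2, double jN2Minus1) {
  assert(x != 0.0 && n2 >= n1);
  std::vector<double> result(n2 - n1 + 1);
  result[n2 - n1] = jN2;
  if (n2 > n1) {
    result[n2 - n1 - 1] = jN2Minus1;
  }
  for (int n = n2 - 1; n > n1; --n) {
    // result[n - n1] holds J(n, x); result[n - n1 + 1] holds J(n+1, x).
    result[n - n1 - 1] = (2.0 / x) * n * result[n - n1] - result[n - n1 + 1];
  }
  return result;
}
```

The sketch steps `Y` upward from `n1` and `J` downward from `n2`, following the direction in which each relation is written above. Both divide by `x`, which is why `x == 0` needs separate handling.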
|
#
bd577afe |
| 07-Jun-2022 |
Peter Klausler <pklausler@nvidia.com> |
[flang][runtime] Fix runtime CSHIFT of rank>1 array case of negative shift count
The calculation of the source index was incorrect when a CSHIFT shift count value is negative, for the implementation of CSHIFT for arrays with rank >= 2. (The vector CSHIFT is fine.)
Differential Revision: https://reviews.llvm.org/D127424
|
#
d4609ae4 |
| 09-May-2022 |
Peter Steinfeld <psteinfeld@nvidia.com> |
[flang] Change "bad kind" messages in the runtime to "not yet implemented"
Similar to change D125046.
If a programmer is able to compile and link a program that contains types that are not yet supported by the runtime, it must be because they're not yet implemented.
This change will make it easier to find unimplemented code in tests.
Differential Revision: https://reviews.llvm.org/D125267
|
#
251d062e |
| 15-Mar-2022 |
Peter Klausler <pklausler@nvidia.com> |
[flang] Convert RUNTIME_CHECK to better error for user errors in transformational.cpp
In flang/runtime/transformational.cpp, there are many RUNTIME_CHECK assertions for errors that should have been caught in semantics, but there are also others that signify program errors that in principle cannot be detected until execution. Convert this second group into readable fatal error messages. Also clean up some missing braces and incorrect printf formats found along the way.
Differential Revision: https://reviews.llvm.org/D122037
|
#
e3550f19 |
| 11-Mar-2022 |
Peter Steinfeld <psteinfeld@nvidia.com> |
[flang] Improve runtime crash messages
Where possible, I added additional information to the messages to help programmers figure out what went wrong. I also removed all uses of the word "bad" from the messages since (to me) that implies a moral judgement rather than a programming error. I replaced it with either "invalid" or "unsupported" where appropriate.
Differential Revision: https://reviews.llvm.org/D121493
|
#
76436336 |
| 23-Feb-2022 |
Peter Klausler <pklausler@nvidia.com> |
[flang] Runtime validation of SPREAD(DIM=dim) argument
Crash when DIM= is not a valid dimension in the result.
Differential Revision: https://reviews.llvm.org/D121145
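A minimal sketch of the kind of check this implies, assuming the standard rule that the result of SPREAD has one more dimension than SOURCE. The function name is hypothetical.

```cpp
#include <cassert>

// For SPREAD(SOURCE, DIM, NCOPIES), the result has rank(SOURCE) + 1
// dimensions, so a valid DIM satisfies 1 <= DIM <= rank(SOURCE) + 1.
inline bool IsValidSpreadDim(int sourceRank, long long dim) {
  assert(sourceRank >= 0);
  return dim >= 1 && dim <= static_cast<long long>(sourceRank) + 1;
}
```

A runtime would report a fatal error when such a predicate is false. The sketch returns a flag so the condition can be read on its own.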
|
#
bdf57365 |
| 11-Feb-2022 |
Peter Steinfeld <psteinfeld@nvidia.com> |
[flang] Change internal errors in RESHAPE runtime routine to user errors
There are several checks in the runtime routine for the RESHAPE intrinsic. Some checks verify things that should have been checked at compile time while others represent user errors.
This update changes the checks for user errors into calls to "Crash" which include information about the failing check. This identifies them as user errors rather than compiler errors.
I also verified that the checks that remain as internal errors are also checked by the front end. I added a test to the front end's RESHAPE test to complete the checks.
Differential Revision: https://reviews.llvm.org/D119596
|
#
77ff6f7d |
| 26-Nov-2021 |
Peter Klausler <pklausler@nvidia.com> |
[flang] Define & implement a lowering support API IsContiguous() in runtime
Create a new flang/runtime/support.cpp module to hold miscellaneous runtime APIs to support lowering, and define an API IsContiguous() to wrap the member function predicate Descriptor::IsContiguous(). And do a little clean-up of other API headers that don't need to expose Runtime/descriptor.h.
Differential Revision: https://reviews.llvm.org/D114752
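For readers unfamiliar with what a contiguity predicate over a descriptor checks, here is a simplified sketch. `DimSketch` and `IsContiguousSketch` are hypothetical and do not reproduce the runtime's `Descriptor` class. The sketch assumes column-major storage with strides expressed in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for one dimension of an array descriptor.
struct DimSketch {
  std::ptrdiff_t extent;     // number of elements along this dimension
  std::ptrdiff_t byteStride; // bytes between consecutive elements
};

// An array is contiguous when each dimension's byte stride equals the number
// of bytes spanned by all faster-varying dimensions (column-major order).
inline bool IsContiguousSketch(
    const std::vector<DimSketch> &dims, std::ptrdiff_t elementBytes) {
  assert(elementBytes > 0);
  for (const DimSketch &d : dims) {
    if (d.extent == 0) {
      return true; // an empty array is treated as contiguous here
    }
  }
  std::ptrdiff_t expected = elementBytes;
  for (const DimSketch &d : dims) {
    // The stride of a dimension with a single element is never used.
    if (d.extent > 1 && d.byteStride != expected) {
      return false;
    }
    expected *= d.extent;
  }
  return true;
}
```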
|
#
45a8caf1 |
| 23-Nov-2021 |
Peter Klausler <pklausler@nvidia.com> |
[flang] Fix reversed comparison in RESHAPE() runtime
RESHAPE() fails inappropriately at runtime if the source array is larger than the result -- which is perfectly valid -- because an obviously reversed comparison of their numbers of elements activates the runtime asserts meant for the opposite case (source smaller than result).
Differential Revision: https://reviews.llvm.org/D114474
|
#
85ec4493 |
| 10-Nov-2021 |
Peter Klausler <pklausler@nvidia.com> |
[flang] Fix ORDER= argument to RESHAPE
The ORDER= argument to the transformational intrinsic function RESHAPE was being misinterpreted in an inverted way that could be detected only with 3-d or higher rank arrays. Fix in both folding and the runtime, and extend tests.
Differential Revision: https://reviews.llvm.org/D113699
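The intended meaning of ORDER= can be illustrated with a small sketch. It is not the runtime implementation, and the name `ReshapeSketch` is hypothetical. It operates on `int` elements in column-major storage and omits PAD=.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a column-major result of the given shape from source elements taken
// in array element order. order[k] (1-based) names the result dimension that
// varies k-th fastest; without ORDER= it would be 1, 2, ..., rank.
// PAD= is not modeled, so the source must hold at least as many elements
// as the result.
inline std::vector<int> ReshapeSketch(const std::vector<int> &source,
    const std::vector<std::size_t> &shape,
    const std::vector<std::size_t> &order) {
  const std::size_t rank = shape.size();
  assert(order.size() == rank);
  std::size_t total = 1;
  for (std::size_t extent : shape) {
    total *= extent;
  }
  assert(source.size() >= total);
  // Column-major stride (in elements) of each result dimension.
  std::vector<std::size_t> stride(rank);
  std::size_t s = 1;
  for (std::size_t d = 0; d < rank; ++d) {
    stride[d] = s;
    s *= shape[d];
  }
  std::vector<int> result(total);
  std::vector<std::size_t> subscript(rank, 0); // zero-based result subscripts
  for (std::size_t k = 0; k < total; ++k) {
    std::size_t offset = 0;
    for (std::size_t d = 0; d < rank; ++d) {
      offset += subscript[d] * stride[d];
    }
    result[offset] = source[k];
    // Advance the subscripts, fastest dimension first, as ORDER= dictates.
    for (std::size_t j = 0; j < rank; ++j) {
      const std::size_t d = order[j] - 1;
      if (++subscript[d] < shape[d]) {
        break;
      }
      subscript[d] = 0;
    }
  }
  return result;
}
```

For rank 2 the only non-identity permutation is its own inverse, so an inverted interpretation gives the same answer. That is consistent with the note above that the bug was detectable only with arrays of rank 3 or higher.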
|
#
6544d9a4 |
| 12-Nov-2021 |
Jean Perier <jperier@nvidia.com> |
[flang] Fix vector cshift runtime with non zero lower bounds
The source index should not be compared to zero after applying the shift with the modulo; it must be compared to the lower bound. Otherwise, the extent is not added when it should be, and the computed source index may be less than the lower bound, causing invalid results.
Differential Revision: https://reviews.llvm.org/D113659
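A sketch of a source-subscript computation for a vector CSHIFT that is independent of the lower bound. The function name and the choice of a zero-based result position are assumptions made for illustration. It is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Returns the subscript in the source array that supplies the element at
// zero-based position j of a vector CSHIFT result. The source array has
// lower bound lb and the given extent.
inline std::int64_t CshiftSourceSubscript(
    std::int64_t j, std::int64_t shift, std::int64_t lb, std::int64_t extent) {
  assert(extent > 0 && j >= 0 && j < extent);
  // Wrap in a zero-based offset space so the result does not depend on lb.
  std::int64_t offset = (j + shift % extent) % extent;
  if (offset < 0) {
    offset += extent; // C++ % takes the sign of the dividend
  }
  return lb + offset;
}
```

Adding the lower bound only after the wrap-around is what keeps the negative-remainder correction correct for any `lb`.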
|
#
830c0b90 |
| 01-Sep-2021 |
Peter Klausler <pklausler@nvidia.com> |
[flang] Move runtime API headers to flang/include/flang/Runtime
Move the closure of the subset of flang/runtime/*.h header files that are referenced by source files outside flang/runtime (apart from unit tests) into a new directory (flang/include/flang/Runtime) so that relative include paths into ../runtime need not be used.
flang/runtime/pgmath.h.inc is moved to flang/include/flang/Evaluate; it's not used by the runtime.
Differential Revision: https://reviews.llvm.org/D109107
|
#
b8ecdcdd |
| 17-Aug-2021 |
Peter Steinfeld <psteinfeld@nvidia.com> |
[flang] Fix the vector version of EOSHIFT with a BOUNDARY argument
When the vector version of EOSHIFT was called, the BOUNDARY argument was being ignored. I fixed that and added a test that would not pass without this fix.
Differential Revision: https://reviews.llvm.org/D108249
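The expected behavior of a vector EOSHIFT with a BOUNDARY value can be sketched as below. The function name is hypothetical and the sketch is limited to `int` elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// End-off shift of a vector: result[j] = array[j + shift] when that position
// exists; every vacated position receives the boundary value instead.
inline std::vector<int> EoshiftVectorSketch(
    const std::vector<int> &array, long long shift, int boundary) {
  const long long extent = static_cast<long long>(array.size());
  // A shift whose magnitude reaches the extent vacates every position.
  if (shift > extent) {
    shift = extent;
  } else if (shift < -extent) {
    shift = -extent;
  }
  std::vector<int> result(array.size(), boundary);
  const long long lo = shift < 0 ? -shift : 0;               // first copied j
  const long long hi = shift > 0 ? extent - shift : extent;  // one past last
  for (long long j = lo; j < hi; ++j) {
    const long long src = j + shift;
    assert(src >= 0 && src < extent);
    result[static_cast<std::size_t>(j)] = array[static_cast<std::size_t>(src)];
  }
  return result;
}
```

Pre-filling the result with the boundary value means that value cannot be skipped for any vacated position.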
|
#
7898e7c8 |
| 19-Jul-2021 |
Peter Steinfeld <psteinfeld@nvidia.com> |
[flang] Implement the runtime portion of the CSHIFT intrinsic
This change fixes a bug in the runtime portion of the CSHIFT intrinsic that happens when the value of the SHIFT argument is negative.
Differential Revision: https://reviews.llvm.org/D106292
|
#
a1034022 |
| 24-Jun-2021 |
Mark Leair <leairmark@gmail.com> |
Change the flang reshape runtime routine interface to use a result argument instead of a result object.
Change the reshape flang unit test to use the new interface. Also, add an order argument to exercise the order subscript code in the reshape runtime routine.
Differential Revision: https://reviews.llvm.org/D104586
|