#
ccc31278 |
| 09-Aug-2024 |
Alex MacLean <amaclean@nvidia.com> |
[NVPTX] support switch statement with brx.idx (reland) (#102550)
Add custom lowering for `BR_JT` DAG nodes to the `brx.idx` PTX
instruction ([PTX ISA 9.7.13.4. Control Flow Instructions: brx.idx]
(https://docs.nvidia.com/cuda/parallel-thread-execution/#control-flow-instructions-brx-idx)).
Depending on the heuristics in DAG selection, `switch` statements may
now be lowered using `brx.idx`.
Note: this fixes the previous issue in #102400 by adding the isBarrier
attribute to BRX_END
|
#
27568790 |
| 08-Aug-2024 |
Artem Belevich <tra@google.com> |
Revert "[NVPTX] support switch statement with brx.idx" (#102530)
Reverts llvm/llvm-project#102400
Causes LLVM to crash on some tests.
|
#
ba976971 |
| 08-Aug-2024 |
Alex MacLean <amaclean@nvidia.com> |
[NVPTX] support switch statement with brx.idx (#102400)
Add custom lowering for `BR_JT` DAG nodes to the `brx.idx` PTX
instruction ([PTX ISA 9.7.13.4. Control Flow Instructions: brx.idx]
(https://docs.nvidia.com/cuda/parallel-thread-execution/#control-flow-instructions-brx-idx)).
Depending on the heuristics in DAG selection, `switch` statements may
now be lowered using `brx.idx`
|
#
0564d066 |
| 06-Aug-2024 |
Nikita Popov <npopov@redhat.com> |
[SDAG] Transfer gep nusw/nuw to SDAG
The resulting add is nuw if either the gep was nuw or it was nusw+nneg. Previously only inbounds+nneg was handled.
Test via wasm load offsets, which seems to most directly expose these SDAG flags.
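For illustration only (not part of the commit), the flag rule stated above can be written as a small predicate. This is a minimal Python sketch; the function and parameter names are hypothetical and do not correspond to LLVM APIs.

```python
def add_is_nuw(gep_nuw: bool, gep_nusw: bool, index_nneg: bool) -> bool:
    """Model of the rule quoted in the commit message.

    The add produced for a gep is nuw if the gep itself was nuw, or if the
    gep was nusw and the index is known non-negative (nneg).
    All names here are hypothetical, chosen only for this sketch.
    """
    return gep_nuw or (gep_nusw and index_nneg)


# A nusw gep with a possibly negative index does not yield a nuw add.
print(add_is_nuw(gep_nuw=False, gep_nusw=True, index_nneg=False))
```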
|
Revision tags: llvmorg-19.1.0-rc2 |
|
#
da0e66e6 |
| 04-Aug-2024 |
Alexis Engelke <engelke@in.tum.de> |
[CodeGen][NFC] Add wrapper method for MBBMap (#101893)
This is a preparation for changing the data structure of MBBMap.
|
#
ae6dc64e |
| 01-Aug-2024 |
Zequan Wu <zequanwu@google.com> |
Reapply "[Clang] Fix nomerge attribute not working with __builtin_trap(), __debugbreak(), __builtin_verbose_trap() (#101549)"
This reverts commit 667598d84b16d1789ce90b231565e9e7bfdbe77d and fixes failed tests: llvm/test/CodeGen/X86/nomerge.ll and llvm/test/MC/AArch64/local-bounds-single-trap.ll.
|
#
667598d8 |
| 01-Aug-2024 |
Haowei Wu <haowei@google.com> |
Revert "[Clang] Fix nomerge attribute not working with __builtin_trap(), __debugbreak(), __builtin_verbose_trap() (#101549)"
This reverts commit 5e84646982d1ec9bc94e48dde4b47f03c044a156, which broke 'nomerge.ll' test on llvm bots.
|
#
5e846469 |
| 01-Aug-2024 |
Zequan Wu <zequanwu@google.com> |
[Clang] Fix nomerge attribute not working with __builtin_trap(), __debugbreak(), __builtin_verbose_trap() (#101549)
1. It fixes the problem of `llvm.trap()` not getting the nomerge
attribute.
2. It sets the nomerge flag for the node if the instruction has the nomerge
attribute.
This is a copy of https://reviews.llvm.org/D146164. This only attempts
to fix `nomerge` not working for `__builtin_trap()`, `__debugbreak()`, and
`__builtin_verbose_trap()`; it does not address non-trap builtins.
Fixes #53011
|
#
34d48279 |
| 29-Jul-2024 |
Kazu Hirata <kazu@google.com> |
[llvm] Initialize SmallVector with ranges (NFC) (#100948)
|
Revision tags: llvmorg-19.1.0-rc1 |
|
#
6f83a031 |
| 26-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Move current call site out of MachineModuleInfo (#100369)
I do not understand what this is for, but it's only used in SelectionDAGBuilder, so move it to FunctionLoweringInfo like other function scope DAG builder state. The intrinsics are not documented in the LangRef or Intrinsics.td.
This removes the last piece of codegen state from MachineModuleInfo.
|
#
455990d1 |
| 24-Jul-2024 |
Vitaly Buka <vitalybuka@google.com> |
Reland "SelectionDAG: Avoid using MachineFunction::getMMI" (#99779)
Reverts llvm/llvm-project#99777
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
|
#
b8d2b775 |
| 23-Jul-2024 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAGBuilder] Avoid const_cast on call to matchSelectPattern. NFC (#100053)
By making the LHS and RHS const pointers, we can use the const signature
of matchSelectPattern.
|
Revision tags: llvmorg-20-init |
|
#
1c798e0b |
| 22-Jul-2024 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAGBuilder][RISCV] Fix crash when using a memory constraint with scalable vector type. (#99821)
We need to use the minimum size of the scalable type and the correct
stack ID.
The code in the PR is still invalid because the instruction used doesn't
have a pointer operand. This is diagnosed later when the assembler
parses it.
Fixes #99782
|
#
98c0e55d |
| 20-Jul-2024 |
Vitaly Buka <vitalybuka@google.com> |
Revert "SelectionDAG: Avoid using MachineFunction::getMMI" (#99777)
Reverts llvm/llvm-project#99696
https://lab.llvm.org/buildbot/#/builders/164/builds/1262
|
#
615b7eea |
| 20-Jul-2024 |
Joseph Huber <huberjn@outlook.com> |
Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.
I moved the `ISD` dependencies into the CodeGen portion of the handling. It's a little awkward, but it's the easiest solution I can think of for now.
|
#
c2019a37 |
| 20-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
SelectionDAG: Avoid using MachineFunction::getMMI (#99696)
|
#
740161a9 |
| 20-Jul-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69. (llvmorg-19-init-17714-gc05126bdfc3b) See #99610
|
#
0f0cfcff |
| 19-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Avoid some references to MachineFunction's getMMI (#99652)
MachineFunction probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.
|
#
177ce190 |
| 17-Jul-2024 |
Lawrence Benson <github@lawben.com> |
[LLVM] Add `llvm.experimental.vector.compress` intrinsic (#92289)
This PR adds a new vector intrinsic `@llvm.experimental.vector.compress`
to "compress" data within a vector based on a selection mask, i.e., it
moves all selected values (i.e., where `mask[i] == 1`) to consecutive
lanes in the result vector. A `passthru` vector can be provided, from
which remaining lanes are filled.
The main reason for this is that the existing
`@llvm.masked.compressstore` has very strong constraints in that it can
only write values that were selected, resulting in guard branches for
all targets except AVX-512 (and even there the AMD implementation is
_very_ slow). More instruction sets support "compress" logic, but only
within registers. So to store the values, an additional store is needed.
But this combination is likely significantly faster on many targets as it
avoids branches.
In follow up PRs, my plan is to add target-specific lowerings for x86,
SVE, and possibly RISCV. I also want to combine this with a store
instruction, as this is probably a common case and we can avoid some
memory writes in that case.
See [discussion in
forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663)
for initial discussion on the design.
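For illustration only (not part of the commit), the "compress" behavior described above can be modeled on plain Python lists. This is a minimal sketch, not the LLVM implementation. The function name is hypothetical, and two details are assumptions of the sketch rather than statements from the message: remaining lanes take the value of the corresponding `passthru` lane, and lanes left unfilled when no `passthru` is given are modeled as `None`.

```python
from typing import List, Optional, Sequence


def vector_compress(vec: Sequence[int],
                    mask: Sequence[int],
                    passthru: Optional[Sequence[int]] = None) -> List[Optional[int]]:
    """Move the selected lanes of `vec` to consecutive lanes of the result.

    Hypothetical reference model: a lane i is selected when mask[i] == 1.
    """
    if len(vec) != len(mask):
        raise ValueError("vec and mask must have the same number of lanes")
    if passthru is not None and len(passthru) != len(vec):
        raise ValueError("passthru must have the same number of lanes as vec")

    # Selected values keep their relative order and are packed to the front.
    selected = [v for v, m in zip(vec, mask) if m == 1]

    # Assumption: the remaining lanes come from the same lane of passthru,
    # or are left unspecified (None here) when no passthru is provided.
    fill = list(passthru) if passthru is not None else [None] * len(vec)
    return selected + fill[len(selected):]


print(vector_compress([1, 2, 3, 4], [0, 1, 0, 1], [10, 20, 30, 40]))
```

With the inputs shown, lanes 1 and 3 are selected, so the sketch returns `[2, 4, 30, 40]`.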
|
#
f270a4dd |
| 17-Jul-2024 |
Amara Emerson <amara@apple.com> |
[AArch64] Don't tail call memset if it would convert to a bzero. (#98969)
Well, not quite that simple. We can tail call memset since it returns the first
argument, but bzero doesn't do that, and therefore we can end up
miscompiling.
This patch also refactors the logic out of isInTailCallPosition() into the callers.
As a result memcpy and memmove are also modified to do the same thing
for consistency.
rdar://131419786
|
#
c05126bd |
| 16-Jul-2024 |
Joseph Huber <huberjn@outlook.com> |
[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)
Summary: The LTO pass and LLD linker have logic in them that forces extraction and prevents internalization of needed runtime calls. However, these currently take all RTLibcalls into account, even if the target does not support them. The target opts out of a libcall if it sets its name to nullptr. This patch pulls this logic out into a class in the header so that LTO / lld can use it to determine if a symbol actually needs to be kept.
This is important for targets like AMDGPU that want to be able to use `lld` to perform the final link step, but do not want the overhead of uncalled functions. (This adds about a second to the link time trivially.)
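For illustration only (not part of the commit), the opt-out rule above amounts to a filter over a libcall name table. This is a minimal Python sketch in which `None` stands in for `nullptr`; the function name, the table layout, and the example entries are all hypothetical and do not reflect the actual class added by the patch.

```python
from typing import Dict, Optional, Set


def symbols_to_keep(libcall_names: Dict[str, Optional[str]]) -> Set[str]:
    """Return the runtime symbols that the link step would need to preserve.

    Hypothetical model: a target opts out of a libcall by leaving its name
    unset (None here, nullptr in the commit message), so only named entries
    are kept.
    """
    return {name for name in libcall_names.values() if name is not None}


# Hypothetical table for a target that supports only one of three libcalls.
example_table = {"MEMCPY": "memcpy", "SIN_F32": None, "SDIV_I64": None}
print(sorted(symbols_to_keep(example_table)))
```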
|
#
d286efeb |
| 15-Jul-2024 |
Ahmed Bougacha <ahmed@bougacha.org> |
[AArch64][PAC] Lower direct authenticated calls to ptrauth constants. (#97664)
This tries to turn indirect ptrauth calls into direct calls, using
`ConstantPtrAuth::isKnownEquivalent` to compare the `ConstantPtrAuth`
target with the ptrauth call bundle.
This should be straightforward, other than the somewhat awkward GISel
handling, which has a handshake between CallLowering and IRTranslator to
elide the ptrauth when possible.
|
#
0b58f34c |
| 11-Jul-2024 |
Farzon Lotfi <1802579+farzonl@users.noreply.github.com> |
[X86][CodeGen] Add base trig intrinsic lowerings (#96222)
This change is an implementation of
https://github.com/llvm/llvm-project/issues/87367's investigation on
supporting IEEE math operations as intrinsics.
This was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
This change adds constrained intrinsics and some lowering cases for
`acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`.
The only x86 specific change was for f80.
https://github.com/llvm/llvm-project/issues/70079
https://github.com/llvm/llvm-project/issues/70080
https://github.com/llvm/llvm-project/issues/70081
https://github.com/llvm/llvm-project/issues/70083
https://github.com/llvm/llvm-project/issues/70084
https://github.com/llvm/llvm-project/issues/95966
The x86 lowering is going to be done in three PRs, with this being
the first.
A second PR will be put up for Loop Vectorizing and then SLPVectorizer.
The constrained intrinsics work is also going to be in multiple parts, but
just two.
This part covers just the LLVM-specific changes; part 2 will cover the
Clang-specific changes and legalization for backends that have special
legalization requirements, like aarch64 and wasm.
|
#
1782810b |
| 10-Jul-2024 |
Daniel Kiss <daniel.kiss@arm.com> |
[Clang][ARM][AArch64] Alway emit protection attributes for functions. (#82819)
So far, the branch protection, sign return address, and guarded control stack attributes have only been emitted as module flags to indicate that functions need to be generated with those features. The problem is that in an LTO build the module flags are merged with the `min` rule, which means that if one of the modules is not built with sign return address, the features are turned off for all functions, because the functions take the branch-protection and sign-return-address features from the module flags. sign-return-address is a function-level option, so functions from files compiled with -mbranch-protection=pac-ret are expected to be protected. The inliner might also inline functions with different sets of flags, as it doesn't consider the module flags.
This patch adds the attributes to all functions and drops the module flag checks from code generation. The module flag is still used for generating the ELF markers. It also drops the "true"/"false" values from the branch-protection-enforcement, branch-protection-pauth-lr, and guarded-control-stack attributes: presence of the attribute means on, absence means off, and there is no other option.
Relanded with test fixes.
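For illustration only (not part of the commit), the LTO problem described above can be modeled in a few lines. This is a minimal Python sketch: the data layout and function names are hypothetical, and the per-function variant simply assumes each function carries the setting of the file it was compiled from.

```python
from typing import Dict, List


def protection_from_module_flags(modules: List[dict]) -> Dict[str, bool]:
    """Old behavior (modeled): every function reads the merged module flag.

    The merged flag is the minimum across modules, so a single module built
    without the feature turns it off for all functions.
    """
    merged = min(m["sign_return_address"] for m in modules)
    return {f: bool(merged) for m in modules for f in m["functions"]}


def protection_from_function_attrs(modules: List[dict]) -> Dict[str, bool]:
    """New behavior (modeled): each function keeps its own attribute."""
    return {f: bool(m["sign_return_address"])
            for m in modules for f in m["functions"]}


# Hypothetical LTO link of one protected and one unprotected module.
modules = [
    {"sign_return_address": 1, "functions": ["a", "b"]},
    {"sign_return_address": 0, "functions": ["c"]},
]
print(protection_from_module_flags(modules))
print(protection_from_function_attrs(modules))
```

In the modeled old behavior all three functions end up unprotected, while with per-function attributes `a` and `b` stay protected and only `c` does not.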
|
#
4b2daecc |
| 10-Jul-2024 |
Daniel Kiss <daniel.kiss@arm.com> |
Revert "[Clang][ARM][AArch64] Alway emit protection attributes for functions." (#98284)
Reverts llvm/llvm-project#82819
|