#
cc548ec4 |
| 31-May-2024 |
Ahmed Bougacha <ahmed@bougacha.org> |
[AArch64][PAC] Lower authenticated calls with ptrauth bundles. (#85736)
This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent
[AArch64][PAC] Lower authenticated calls with ptrauth bundles. (#85736)
This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent of an
`@llvm.ptrauth.auth` intrinsic call on the call target (possibly
preceded by an `@llvm.ptrauth.blend` on the auth discriminator if
applicable.)
This allows the generation of combined authenticating calls
on AArch64 (in the BLRA* PAuth instructions), while avoiding
the raw just-authenticated function pointer from being
exposed to attackers.
This is done by threading a PtrAuthInfo descriptor through
the call lowering infrastructure, eventually selecting a BLRA
pseudo. The pseudo encapsulates the safe discriminator
computation, which together with the real BLRA* call get emitted
in late pseudo expansion in AsmPrinter.
Note that this also applies to the other forms of indirect calls,
notably invokes, rvmarker, and tail calls. Tail-calls in particular
bring some additional complexity, with the intersecting register
constraints of BTI and PAC discriminator computation.
However this doesn't currently support PAuth_LR tail-call variants.
This also adopts an x8+ allocation order for GPR64noip, matching
GPR64.
show more ...
|
#
05e6bb40 |
| 30-May-2024 |
Roger Ferrer Ibáñez <rofirrim@gmail.com> |
[SelectionDAG] Add an ISD::CLEAR_CACHE node to lower llvm.clear_cache (#93795)
The current way of lowering `llvm.clear_cache` is a bit unusual. As
suggested by Matt Arsenault we are better off usin
[SelectionDAG] Add an ISD::CLEAR_CACHE node to lower llvm.clear_cache (#93795)
The current way of lowering `llvm.clear_cache` is a bit unusual. As
suggested by Matt Arsenault we are better off using an ISD node.
This change introduces a new `ISD::CLEAR_CACHE`, registers a new libcall
by default named `__clear_cache` and the default legalisation is a
libcall.
This is preparatory work for a custom lowering of `ISD::CLEAR_CACHE`
needed by RISC-V on some platforms.
show more ...
|
Revision tags: llvmorg-18.1.6 |
|
#
fbb37e96 |
| 13-May-2024 |
Graham Hunter <graham.hunter@arm.com> |
[AArch64] Add an all-in-one histogram intrinsic
Based on discussion from
https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788
Current interface is:
llvm
[AArch64] Add an all-in-one histogram intrinsic
Based on discussion from
https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788
Current interface is:
llvm.experimental.histogram(<vecty> ptrs, <intty> inc_amount, <vecty> mask)
The integer type used by 'inc_amount' needs to match the type of the buckets in memory.
The intrinsic covers the following operations:
* Gather load
* histogram on the elements of 'ptrs'
* multiply the histogram results by 'inc_amount'
* add the result of the multiply to the values loaded by the gather
* scatter store the results of the add
Supports lowering to histcnt instructions for AArch64 targets, and scalarization for all others at present.
show more ...
|
#
b277bf56 |
| 10-May-2024 |
Paul Walker <paul.walker@arm.com> |
[LLVM][CodeGen][SVE] Clean up lowering of VECTOR_SPLICE operations. (#91330)
Remove DAG combine that is performing type legalisation and instead
add isel patterns for all legal types.
|
#
df21ee4c |
| 09-May-2024 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[DAG] Add clang-format off/on wrappers around compact switch handlers. NFC.
Avoids a problem identified in #90503
|
#
b52fa946 |
| 09-May-2024 |
David Sherwood <57997763+david-arm@users.noreply.github.com> |
[Analysis] Add cost model for experimental.cttz.elts intrinsic (#90720)
In PR #88385 I've added support for auto-vectorisation of some early
exit loops, which requires using the experimental.cttz.e
[Analysis] Add cost model for experimental.cttz.elts intrinsic (#90720)
In PR #88385 I've added support for auto-vectorisation of some early
exit loops, which requires using the experimental.cttz.elts to calculate
final indices in the early exit block. We need a more accurate cost
model for this intrinsic to better reflect the cost of work required in
the early exit block. I've tried to accurately represent the expansion
code for the intrinsic when the target does not have efficient lowering
for it. It's quite tricky to model because you need to first figure out
what types will actually be used in the expansion. The type used can
have a significant effect on the cost if you end up using illegal vector
types.
Tests added here:
Analysis/CostModel/AArch64/cttz_elts.ll
Analysis/CostModel/RISCV/cttz_elts.ll
show more ...
|
#
df311a27 |
| 08-May-2024 |
Aleksandr Popov <42888396+aleks-tmb@users.noreply.github.com> |
Add interface to check if a call has a deopt bundle (NFC) (#91348)
Encapsulate check that a call has a deopt bundle to make it easier to
change the deopt scheme.
|
#
235cea72 |
| 07-May-2024 |
Paul Walker <paul.walker@arm.com> |
[NFC][LLVM] Refactor rounding mode detection of constrained fp intrinsic IDs (#90854)
I've refactored the code to genericise the implementation to better
allow for target specific constrained fp in
[NFC][LLVM] Refactor rounding mode detection of constrained fp intrinsic IDs (#90854)
I've refactored the code to genericise the implementation to better
allow for target specific constrained fp intrinsics.
show more ...
|
Revision tags: llvmorg-18.1.5 |
|
#
539f626e |
| 30-Apr-2024 |
Min-Yih Hsu <min.hsu@sifive.com> |
[VP][RISCV] Add vp.cttz.elts intrinsic and its RISC-V codegen (#90502)
This intrinsic is the VP version of `experimental.cttz.elts`.
|
#
bfc03171 |
| 29-Apr-2024 |
Maciej Gabka <maciej.gabka@arm.com> |
Move several vector intrinsics out of experimental namespace (#88748)
This patch is moving out following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice
from th
Move several vector intrinsics out of experimental namespace (#88748)
This patch is moving out following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice
from the experimental namespace.
All these intrinsics exist in LLVM for more than a year now, and are
widely used, so should not be considered as experimental.
show more ...
|
#
cf328ff9 |
| 24-Apr-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[IR] Memory Model Relaxation Annotations (#78569)
Implements the core/target-agnostic components of Memory Model
Relaxation Annotations.
RFC:
https://discourse.llvm.org/t/rfc-mmras-memory-model
[IR] Memory Model Relaxation Annotations (#78569)
Implements the core/target-agnostic components of Memory Model
Relaxation Annotations.
RFC:
https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5
show more ...
|
#
d8b253be |
| 23-Apr-2024 |
Björn Pettersson <bjorn.a.pettersson@ericsson.com> |
[SelectionDAG] Mark frame index as "aliased" at argument copy elison (#89712)
This is a fix for miscompiles reported in
https://github.com/llvm/llvm-project/issues/89060
After argument copy el
[SelectionDAG] Mark frame index as "aliased" at argument copy elison (#89712)
This is a fix for miscompiles reported in
https://github.com/llvm/llvm-project/issues/89060
After argument copy elison the IR value for the eliminated alloca
is aliasing with the fixed stack object. This patch is making sure
that we mark the fixed stack object as being aliased with IR values
to avoid that for example schedulers are reordering accesses to
the fixed stack object. This could otherwise happen when there is a
mix of MemOperands refering the shared fixed stack slow via both
the IR value for the elided alloca, and via a fixed stack pseudo
source value (as would be the case when lowering the arguments).
show more ...
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
70136389 |
| 27-Mar-2024 |
Noah Goldstein <goldstein.w.n@gmail.com> |
[DAG] Add support for `nneg` flag with `uitofp`
Copy `nneg` flag when building `UINT_TO_FP` from `uitofp` and use `nneg` flag in the one place we transform `UINT_TO_FP` -> `SINT_TO_FP` if the operan
[DAG] Add support for `nneg` flag with `uitofp`
Copy `nneg` flag when building `UINT_TO_FP` from `uitofp` and use `nneg` flag in the one place we transform `UINT_TO_FP` -> `SINT_TO_FP` if the operand is non-negative.
show more ...
|
#
9fd2e2c2 |
| 08-Apr-2024 |
David Green <david.green@arm.com> |
[DAG][AArch64] Support masked loads/stores with nontemporal flags (#87608)
SVE has some non-temporal masked loads and stores. The metadata coming
from the nodes is not copied to the MMO at the mome
[DAG][AArch64] Support masked loads/stores with nontemporal flags (#87608)
SVE has some non-temporal masked loads and stores. The metadata coming
from the nodes is not copied to the MMO at the moment though, meaning it
will generate a normal instruction. This patch ensures that the right
flags are set if the instruction has non-temporal metadata.
show more ...
|
#
42155797 |
| 01-Apr-2024 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[AMDGPU] Use glue for convergence tokens at call-like operations (#86766)
The earlier implementation on AMDGPU used explicit token operands at
SI_CALL and SI_CALL_ISEL. This is now replaced with CO
[AMDGPU] Use glue for convergence tokens at call-like operations (#86766)
The earlier implementation on AMDGPU used explicit token operands at
SI_CALL and SI_CALL_ISEL. This is now replaced with CONVERGENCECTRL_GLUE
operands, with the following effects:
- The treatment of tokens at call-like operations is now consistent with
the treatment at intrinsics.
- Support for tail calls using implicit tokens at SI_TCRETURN "just
works".
- The extra parameter at call-like instructions is eliminated, thus
restoring those instructions and their handling to the original state.
The new glue node is placed after the existing glue node for the
outgoing call parameters, which seems to not interfere with selection of
the call-like nodes.
show more ...
|
#
20f56e1f |
| 01-Apr-2024 |
Vitaly Buka <vitalybuka@google.com> |
[CodeGen] Add default lowering for llvm.allow.{runtime,ubsan}.check() (#86049)
RFC:
https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641
|
#
0e5c504d |
| 26-Mar-2024 |
Emil Pedersen <3mille.prenom.nom@gmail.com> |
[DebugInfo] [SelectionDAG] Fix handling of duplicate dbg values (#86598)
Before this fix, a duplicate llvm.dbg.value intrinsic referring to an
argument, after an alloca, would be generated with `$n
[DebugInfo] [SelectionDAG] Fix handling of duplicate dbg values (#86598)
Before this fix, a duplicate llvm.dbg.value intrinsic referring to an
argument, after an alloca, would be generated with `$noreg`, losing
debug information. Instead, we silently drop the second debug info, so
it doesn't break the first one.
rdar://125375717
show more ...
|
#
308ed023 |
| 26-Mar-2024 |
Il-Capitano <52455591+Il-Capitano@users.noreply.github.com> |
[Intrinsics] Make `patchpoint.i64` generic on its return type (#85911)
Currently patchpoints can only have two result types, `void` and `i64`.
This limits the result to general purpose registers.
[Intrinsics] Make `patchpoint.i64` generic on its return type (#85911)
Currently patchpoints can only have two result types, `void` and `i64`.
This limits the result to general purpose registers.
This patch makes `patchpoint.i64` an overloadable intrinsic, allowing
result values that can fit in a single register (e.g. integers,
pointers, floats).
show more ...
|
#
57146dae |
| 23-Mar-2024 |
Harvin Iriawan <25712785+harviniriawan@users.noreply.github.com> |
[CodeGen] Update for scalable MemoryType in MMO (#70452)
Remove getSizeOrUnknown call when MachineMemOperand is created. For Scalable
TypeSize, the MemoryType created becomes a scalable_vector.
[CodeGen] Update for scalable MemoryType in MMO (#70452)
Remove getSizeOrUnknown call when MachineMemOperand is created. For Scalable
TypeSize, the MemoryType created becomes a scalable_vector.
2 MMOs that have scalable memory access can then use the updated BasicAA that
understands scalable LocationSize.
Original Patch by Harvin Iriawan
Co-authored-by: David Green <david.green@arm.com>
show more ...
|
#
c67ed2f1 |
| 22-Mar-2024 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAG][RISCV] Use TypeSize version of ComputeValueVTs in TargetLowering::LowerCallTo. (#86166)
This is needed to support non-intrinsic functions returning tuple types
which are represented
[SelectionDAG][RISCV] Use TypeSize version of ComputeValueVTs in TargetLowering::LowerCallTo. (#86166)
This is needed to support non-intrinsic functions returning tuple types
which are represented as structs with scalable vector types in IR.
I suspect this may have been broken since
https://reviews.llvm.org/D158115
show more ...
|
#
bdc77d1e |
| 20-Mar-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[RemoveDIs][NFC] Rename DPLabel->DbgLabelRecord (#85918)
This patch renames DPLabel to DbgLabelRecord, in accordance with the
ongoing DbgRecord rename. This rename was fairly trivial, since DPLabel
[RemoveDIs][NFC] Rename DPLabel->DbgLabelRecord (#85918)
This patch renames DPLabel to DbgLabelRecord, in accordance with the
ongoing DbgRecord rename. This rename was fairly trivial, since DPLabel
isn't as widely used as DPValue and has no real conflicts in either its
full or abbreviated name. As usual, the entire replacement was done
automatically, with `s/DPLabel/DbgLabelRecord/` and `s/DPL/DLR/`.
show more ...
|
Revision tags: llvmorg-18.1.2 |
|
#
ffd08c77 |
| 19-Mar-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which re
[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
show more ...
|
#
18da51b2 |
| 18-Mar-2024 |
David Green <david.green@arm.com> |
[CodeGen] More uses of LocationSize::beforeOrAfterPointer().
As an extension to #84751, this adds some extra uses of beforeOrAfterPointer() instead of UnknownSize.
|
#
601e102b |
| 17-Mar-2024 |
David Green <david.green@arm.com> |
[CodeGen] Use LocationSize for MMO getSize (#84751)
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
const
[CodeGen] Use LocationSize for MMO getSize (#84751)
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.
This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.
Global ISel is still expected to use the underlying LLT as it needs, and
are not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
show more ...
|
#
15f3f446 |
| 12-Mar-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793)
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate
[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793)
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate on DbgRecords but refer
to DbgValues or DPValues in their names to refer to DbgRecords instead;
all such functions are defined in one of `BasicBlock.h`,
`Instruction.h`, and `DebugProgramInstruction.h`.
This patch explicitly does not change the names of any comments or
variables, except for where they use the exact name of one of the
renamed functions. The reason for this is reviewability; this patch can
be trivially examined to determine that the only changes are direct
string substitutions and any results from clang-format responding to the
changed line lengths. Future patches will cover renaming variables and
comments, and then renaming the classes themselves.
show more ...
|