#
cf5ecd56 |
| 06-Mar-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Fix off by one in finding explicit byval alignment
For attribute sets, the return index is at 0, and arguments start at 1. getParamAlignment adds the offset of 1, so we need to convert from attribute index back to IR index.
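The index conversion described above can be sketched with a toy model. This is only an illustration, not the LLVM `AttributeList` class: `ToyAttributeSet` and `byValAlignForAttrIdx` are hypothetical names, and an alignment of 0 is assumed to mean "no explicit alignment".

```cpp
#include <cassert>
#include <map>

// Toy stand-in for an attribute set (hypothetical, not LLVM's AttributeList).
// Slot 0 holds return attributes; argument attributes start at slot 1.
struct ToyAttributeSet {
  enum : unsigned { ReturnIndex = 0, FirstArgIndex = 1 };
  std::map<unsigned, unsigned> AlignByAttrIdx;

  // Takes a zero-based IR argument number and adds the offset of 1 itself.
  // Returns 0 when no explicit alignment is recorded.
  unsigned getParamAlignment(unsigned ArgNo) const {
    auto It = AlignByAttrIdx.find(ArgNo + FirstArgIndex);
    return It == AlignByAttrIdx.end() ? 0 : It->second;
  }
};

// The caller holds an attribute index, so it must convert back to an IR
// argument number first; passing the attribute index directly would query
// the following argument.
unsigned byValAlignForAttrIdx(const ToyAttributeSet &Attrs, unsigned AttrIdx) {
  assert(AttrIdx >= ToyAttributeSet::FirstArgIndex && "not an argument index");
  return Attrs.getParamAlignment(AttrIdx - ToyAttributeSet::FirstArgIndex);
}
```

In this model, passing the attribute index straight to `getParamAlignment` returns the alignment of the next argument, which is the off-by-one the commit describes.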
|
#
78dcff48 |
| 02-Mar-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Add default implementation of assignValueToReg
Refactor insertion of the asserting ops. This enables using them for AMDGPU.
This code should essentially be the same for every target. Mips, X86 and ARM all have different code there now, but this seems to be an accident. The assignment functions are called with different types than they would be in the DAG, so this is all likely an assortment of hacks to get around that.
|
#
fd82cbcf |
| 09-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Merge and cleanup more AMDGPU call lowering code
This merges more AMDGPU ABI lowering code into the generic call lowering. Start cleaning up by factoring away more of the pack/unpack logic into the buildCopy{To|From}Parts functions. These could use more improvement, and the SelectionDAG versions are significantly more complex, and we'll eventually have to emulate all of those cases too.
This is mostly NFC, but does result in some minor instruction reordering. It also removes some of the limitations with mismatched sizes the old code had. However, similarly to the merge on the input, this is forcing gfx6/gfx7 to use the gfx8+ ABI (which is what we actually want, but SelectionDAG is stuck using the weird emergent ABI).
This also changes the load/store size for stack passed EVTs for AArch64, which makes it consistent with the DAG behavior.
|
#
01314984 |
| 01-Mar-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Remove dead code
Generic code should probably not introduce G_INSERT/G_EXTRACT. The mirror unpackRegs should also be removed, but AMDGPU still has a use remaining which needs to be fixed.
|
#
6c260d3b |
| 28-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Move splitToValueTypes to generic code
I copied the nearly identical function from AArch64 into AMDGPU, so fix this duplication.
Mips and X86 have their own more exotic versions which should be removed. However replacing those is better left for a separate patch since it requires other changes to avoid regressions.
|
#
212d6a95 |
| 19-Feb-2021 |
Amara Emerson <amara@apple.com> |
[GloblalISel] Support lowering <3 x i8> arguments in multiple parts.
Differential Revision: https://reviews.llvm.org/D97086
|
#
69ce291b |
| 19-Feb-2021 |
Amara Emerson <amara@apple.com> |
[AArch64][GlobalISel] Support lowering <1 x i8> arguments.
We don't yet have working codegen for the resulting unmerges, and if we did it would probably be horrible.
Differential Revision: https://reviews.llvm.org/D97035
|
#
62d946e1 |
| 07-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Merge some AMDGPU ABI lowering code to generic code
AMDGPU currently has a lot of pre-processing code to pre-split argument types into 32-bit pieces before passing it to the generic code in handleAssignments. This is a bit sloppy and also requires some overly fancy iterator work when building the calls. It's better if all argument marshalling code is handled directly in handleAssignments. This handles more situations like decomposing large element vectors into sub-element sized pieces.
This should mostly be NFC, but does change the generated code by shifting where the initial argument packing instructions are placed. I think this is nicer looking, since it now emits the packing code directly after the relevant copies, rather than after the copies for the remaining arguments.
This doubles down on gfx6/gfx7 using the gfx8+ ABI for 16-bit types. This is ultimately the better option, but incompatible with the DAG. Fixing this requires more work, especially for f16.
|
#
392e0fcf |
| 07-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Handle arguments partially passed on the stack
The API is a bit awkward since you need to index into an array in the passed struct. I guess an alternative would be to pass all of the individual fields.
|
#
b72a2365 |
| 08-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Fix using wrong calling convention for callees
This was taking the calling convention from the parent function, instead of the callee. Avoids regressions in a future patch when the caller and callee have different type breakdowns.
For some reason AArch64's lowerFormalArguments seems to intentionally ignore the parent isVarArg.
|
#
87e28011 |
| 08-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Use correct calling convention in handleAssignments
This was using the calling convention of the calling function, not the callee. Avoids regressions in a future patch.
|
#
ec41ed5b |
| 03-Feb-2021 |
Amara Emerson <amara@apple.com> |
[AArch64][GlobalISel] Support the 'returned' parameter attribute.
On AArch64 (which seems to be the only target that supports it), this attribute allows codegen to avoid saving/restoring the value in x0 across a call.
Gives a 0.1% geomean -Os code size improvement on CTMark.
Differential Revision: https://reviews.llvm.org/D96099
|
#
35c535a7 |
| 12-Jan-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AArch64/GlobalISel: Factor out parametersInCSRMatch
Make this look more like the DAG handling and move to common code.
I also noticed AArch64 seems to not be properly adding the physreg:virtreg mapping to the function live ins.
|
Revision tags: llvmorg-11.1.0-rc1 |
|
#
ae25a397 |
| 06-Jan-2021 |
Christudasan Devadasan <Christudasan.Devadasan@amd.com> |
AMDGPU/GlobalISel: Enable sret demotion
|
#
d68458bd |
| 23-Dec-2020 |
Christudasan Devadasan <Christudasan.Devadasan@amd.com> |
[GlobalISel] Base implementation for sret demotion.
If the return values can't be lowered to registers, SelectionDAG performs sret demotion. This patch contains the basic implementation of the same in the GlobalISel pipeline.
Targets still need to make the relevant changes in lowerFormalArguments, lowerReturn and lowerCall to make use of this feature.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D92953
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
#
e7e7d371 |
| 15-Dec-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Fix generic handling of single outgoing call arguments
Simply call the argument handler, as is done for the incoming case. This will allow removal of hacks in the AMDGPU call lowering in a future change.
|
Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5 |
|
#
35a531fb |
| 29-Sep-2020 |
David Sherwood <david.sherwood@arm.com> |
[SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents
In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. I've changed some of these places to use the equivalent scalar operator.
Differential Revision: https://reviews.llvm.org/D88482
|
#
43d239d0 |
| 30-Sep-2020 |
Gabriel Hjort Åkerlund <gabriel.hjort.akerlund@ericsson.com> |
[GlobalISel] Fix incorrect setting of ValNo when splitting
Before, for each original argument i, ValNo was set to i + PartIdx, but ValNo is intended to reflect the index of the value before splitting. Hence, ValNo should always be set to i and not consider the PartIdx.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D86511
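The ValNo rule from this commit can be illustrated with a small standalone sketch. `ToyPart` and `assignValNos` are hypothetical names used only for this example; they do not come from the LLVM sources.

```cpp
#include <cassert>
#include <vector>

// One piece of a split argument (hypothetical type for illustration).
struct ToyPart {
  unsigned ValNo;   // index of the original, unsplit argument
  unsigned PartIdx; // position of this piece within that argument
};

// PartsPerArg[i] is the number of pieces original argument i is split into.
std::vector<ToyPart> assignValNos(const std::vector<unsigned> &PartsPerArg) {
  std::vector<ToyPart> Out;
  for (unsigned i = 0; i < PartsPerArg.size(); ++i)
    for (unsigned PartIdx = 0; PartIdx < PartsPerArg[i]; ++PartIdx)
      // ValNo is always i. Using i + PartIdx would give the second piece of
      // argument 0 the same ValNo as the first piece of argument 1.
      Out.push_back({i, PartIdx});
  return Out;
}
```

With this numbering, every piece of a split argument reports the same ValNo, so pieces can be grouped back to their original value.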
|
Revision tags: llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2 |
|
#
bf36e902 |
| 18-Aug-2020 |
Jessica Paquette <jpaquette@apple.com> |
[GlobalISel][CallLowering] NFC: Unify flag-setting from CallBase + AttributeList
It's annoying to have to maintain multiple, nearly identical chains of if statements which all set the same attributes.
Add a helper function, `addFlagsUsingAttrFn` which performs the attribute setting.
Then, use wrappers for that function in `lowerCall` and `setArgFlags`.
(Note that the flag-setting code in `setArgFlags` was missing the returned attribute. There's no selection for this yet, so no test. It's an example of the kind of thing this lets us avoid, though.)
Differential Revision: https://reviews.llvm.org/D86159
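The refactoring pattern described here (one helper that sets flags from a predicate, with thin wrappers per attribute source) might look roughly like the following. This is a self-contained toy, not the LLVM code: `ToyAttr`, `ToyArgFlags`, `toyAddFlagsUsingAttrFn` and both wrappers are hypothetical, and the real helper's signature and attribute list may differ.

```cpp
#include <cassert>
#include <functional>
#include <set>

// Hypothetical attribute kinds and flag struct for illustration only.
enum class ToyAttr { ZExt, SExt, InReg, SRet, Returned };

struct ToyArgFlags {
  bool ZExt = false, SExt = false, InReg = false, SRet = false,
       Returned = false;
};

// The single chain of checks. AttrFn answers "is this attribute present?"
// for whatever source the caller is looking at.
void toyAddFlagsUsingAttrFn(ToyArgFlags &Flags,
                            const std::function<bool(ToyAttr)> &AttrFn) {
  if (AttrFn(ToyAttr::ZExt)) Flags.ZExt = true;
  if (AttrFn(ToyAttr::SExt)) Flags.SExt = true;
  if (AttrFn(ToyAttr::InReg)) Flags.InReg = true;
  if (AttrFn(ToyAttr::SRet)) Flags.SRet = true;
  if (AttrFn(ToyAttr::Returned)) Flags.Returned = true;
}

// Wrapper for a formal parameter: consult the parameter's own attributes.
ToyArgFlags toyFlagsForParam(const std::set<ToyAttr> &ParamAttrs) {
  ToyArgFlags Flags;
  toyAddFlagsUsingAttrFn(
      Flags, [&](ToyAttr K) { return ParamAttrs.count(K) != 0; });
  return Flags;
}

// Wrapper for a call argument: an attribute counts if it appears either at
// the call site or on the callee's declaration.
ToyArgFlags toyFlagsForCallArg(const std::set<ToyAttr> &CallSiteAttrs,
                               const std::set<ToyAttr> &CalleeParamAttrs) {
  ToyArgFlags Flags;
  toyAddFlagsUsingAttrFn(Flags, [&](ToyAttr K) {
    return CallSiteAttrs.count(K) != 0 || CalleeParamAttrs.count(K) != 0;
  });
  return Flags;
}
```

Because both wrappers share one chain of checks, an attribute added to the helper is picked up by both paths, which avoids the kind of omission the commit message mentions for the returned attribute.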
|
#
f29e6277 |
| 18-Aug-2020 |
Jessica Paquette <jpaquette@apple.com> |
[GlobalISel][CallLowering] Don't tail call with non-forwarded explicit sret
Similar to this commit:
faf8065a99817bcb10e6f09b558fe3e0972c35ce
The testcase is pretty much the same as test/CodeGen/AArch64/tailcall-explicit-sret.ll, except that it uses i64 (since we don't handle the i1024 return values yet) and doesn't have indirect tail call testcases (because we can't translate those yet).
Differential Revision: https://reviews.llvm.org/D86148
|
#
224a8c63 |
| 17-Aug-2020 |
Jessica Paquette <jpaquette@apple.com> |
[GlobalISel][CallLowering] Look through call parameters for flags
We weren't looking through the parameters on calls at all.
E.g., say you had
```
declare i32 @zext(i32 zeroext %x)
...
%y = call i32 @zext(i32 %something)
...
```
At the point of the call, we wouldn't know that the %something should have the zeroext attribute.
This sets flags in about the same way as TargetLoweringBase::ArgListEntry::setAttributes.
Differential Revision: https://reviews.llvm.org/D86125
|
#
a275acc4 |
| 17-Aug-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Early continue to reduce loop indentation
|
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init |
|
#
b98f902f |
| 08-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Restructure argument lowering loop in handleAssignments
This was structured in a way that implied every split argument is in memory, or in registers. It is possible to pass an original argument partially in registers, and partially in memory. Transpose the logic here to only consider a single piece at a time. Every individual CCValAssign should be treated independently, and any merge to original value needs to be handled later.
This is in preparation for merging some preprocessing hacks in the AMDGPU calling convention lowering into the generic code.
I'm also not sure what the correct behavior is for memlocs where the promoted size is larger than the original value. I've opted to clamp the memory access size so that it does not exceed the value register, which avoids the explicit trunc/extend/vector widen/vector extract instruction. This happens for AMDGPU for i8 arguments that end up stack passed, which are promoted to i16 (I think this is a preexisting DAG bug though, and they should not really be promoted when in memory).
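The clamping choice described above amounts to taking the smaller of the two sizes. A minimal sketch, assuming sizes are given in bits and rounded up to whole bytes; `clampedMemAccessBytes` is a hypothetical helper, not a function from the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Returns the number of bytes to load or store for a stack-passed piece:
// the promoted location size, clamped so it never exceeds the size of the
// value register being read or written.
uint64_t clampedMemAccessBytes(uint64_t LocSizeBits, uint64_t ValueSizeBits) {
  uint64_t LocBytes = (LocSizeBits + 7) / 8;   // round up to whole bytes
  uint64_t ValBytes = (ValueSizeBits + 7) / 8; // round up to whole bytes
  return std::min(LocBytes, ValBytes);
}
```

For the i8-promoted-to-i16 case mentioned in the message, this gives a 1-byte access rather than a 2-byte access followed by a truncate.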
|
#
23157f3b |
| 07-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Handle EVT argument lowering correctly
handleAssignments was assuming every argument type is an MVT, and assignArg would always fail. This fixes one of the hacks in the current AMDGPU calling convention code that pre-processes the arguments.
|
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3 |
|
#
d3085c25 |
| 01-Jul-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Differential Revision: https://reviews.llvm.org/D82956
|