|
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3 |
|
| #
c85611e8 |
| 17-Oct-2024 |
goldsteinn <35538541+goldsteinn@users.noreply.github.com> |
[SimplifyLibCall][Attribute] Fix bug where we may keep `range` attr with incompatible type (#112649)
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.
[SimplifyLibCall][Attribute] Fix bug where we may keep `range` attr with incompatible type (#112649)
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.
The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr assosiated it will cause an
error.
Fixes #112633
show more ...
|
|
Revision tags: llvmorg-19.1.2 |
|
| #
c4d89203 |
| 07-Oct-2024 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Support preloading hidden kernel arguments (#98861)
Adds hidden kernel arguments to the function signature and marks them
inreg if they should be preloaded into user SGPRs. The normal kern
[AMDGPU] Support preloading hidden kernel arguments (#98861)
Adds hidden kernel arguments to the function signature and marks them
inreg if they should be preloaded into user SGPRs. The normal kernarg
preloading logic then takes over with some additional checks for the
correct implicitarg_ptr alignment.
Special care is needed so that metadata for the hidden arguments is not
added twice when generating the code object.
show more ...
|
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3 |
|
| #
e66cfebb |
| 20-Mar-2024 |
Andreas Jonson <andjo403@hotmail.com> |
[ValueTracking] Handle range attributes (#85143)
Handle the range attribute in ValueTracking.
|
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
5da67449 |
| 06-Dec-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Add nofpclass parameter attribute
This carries a bitmask indicating forbidden floating-point value kinds in the argument or return value. This will enable interprocedural -ffinite-math-only opti
IR: Add nofpclass parameter attribute
This carries a bitmask indicating forbidden floating-point value kinds in the argument or return value. This will enable interprocedural -ffinite-math-only optimizations. This is primarily to cover the no-nans and no-infinities cases, but also covers the other floating point classes for free. Textually, this provides a number of names corresponding to bits in FPClassTest, e.g.
call nofpclass(nan inf) @must_be_finite() call nofpclass(snan) @cannot_be_snan()
This is more expressive than the existing nnan and ninf fast math flags. As an added bonus, you can represent fun things like nanf:
declare nofpclass(inf zero sub norm) float @only_nans()
Compared to nnan/ninf: - Can be applied to individual call operands as well as the return value - Can distinguish signaling and quiet nans - Distinguishes the sign of infinities - Can be safely propagated since it doesn't imply anything about other operands. - Does not apply to FP instructions; it's not a flag
This is one step closer to being able to retire "no-nans-fp-math" and "no-infs-fp-math". The one remaining situation where we have no way to represent no-nans/infs is for loads (if we wanted to solve this we could introduce !nofpclass metadata, following along with noundef/!noundef).
This is to help simplify the GPU builtin math library distribution. Currently the library code has explicit finite math only checks, read from global constants the compiler driver needs to set based on the compiler flags during linking. We end up having to internalize the library into each translation unit in case different linked modules have different math flags. By propagating known-not-nan and known-not-infinity information, we can automatically prune the edge case handling in most functions if the function is only reached from fast math uses.
show more ...
|
| #
80bb57fd |
| 16-Jan-2023 |
Guillaume Chatelet <gchatelet@google.com> |
Deprecate Argument::getParamAlignment()
|
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
a5bbc6ef |
| 23-Feb-2022 |
Bill Wendling <isanbard@gmail.com> |
[NFC] Remove unnecessary "#include"s from header files
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
9290ccc3 |
| 04-Jan-2022 |
serge-sans-paille <sguelton@redhat.com> |
Introduce the AttributeMask class
This class is solely used as a lightweight and clean way to build a set of attributes to be removed from an AttrBuilder. Previously AttrBuilder was used both for bu
Introduce the AttributeMask class
This class is solely used as a lightweight and clean way to build a set of attributes to be removed from an AttrBuilder. Previously AttrBuilder was used both for building and removing, which introduced odd situation like creation of Attribute with dummy value because the only relevant part was the attribute kind.
Differential Revision: https://reviews.llvm.org/D116110
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
05392466 |
| 24-Sep-2021 |
Arthur Eubanks <aeubanks@google.com> |
Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bi
Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
5b6cae55 |
| 20-May-2021 |
Steven Wu <stevenwu@apple.com> |
[IR][AutoUpgrade] Drop alignment from non-pointer parameters and returns
This is a follow-up of D102201. After some discussion, it is a better idea to upgrade all invalid uses of alignment attribute
[IR][AutoUpgrade] Drop alignment from non-pointer parameters and returns
This is a follow-up of D102201. After some discussion, it is a better idea to upgrade all invalid uses of alignment attributes on function return values and parameters, not just limited to void function return types.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102726
show more ...
|
| #
f9d932e6 |
| 15-Apr-2021 |
Momchil Velikov <momchil.velikov@arm.com> |
[clang][AArch64] Correctly align HFA arguments when passed on the stack
When we pass a AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example
[clang][AArch64] Correctly align HFA arguments when passed on the stack
When we pass a AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example
struct S { __attribute__ ((__aligned__(16))) double v[4]; };
Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)
Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e..g byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types, which naturally have the needed alignment. We don't have enough types to cover all the cases, though.
This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory.
The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute.
For byval arguments, the stackalign attribute assumes the role, previously perfomed by align, falling back to align if stackalign` is absent.
On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute.
Patch by Momchil Velikov and Lucas Prates.
Differential Revision: https://reviews.llvm.org/D98794
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
9a0c9402 |
| 29-Mar-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 07e46367baeca96d84b03fa215b41775f69d5989.
|
| #
07e46367 |
| 29-Mar-2021 |
Oliver Stannard <oliver.stannard@linaro.org> |
Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute""
Reverting because test 'Bindings/Go/go.test' is failing on most buildbots.
This reverts commit fc9df309917e57de704f3ce4372138a8d4a2
Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute""
Reverting because test 'Bindings/Go/go.test' is failing on most buildbots.
This reverts commit fc9df309917e57de704f3ce4372138a8d4a23d7a.
show more ...
|
| #
fc9df309 |
| 28-Mar-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 20d5c42e0ef5d252b434bcb610b04f1cb79fe771.
|
| #
20d5c42e |
| 28-Mar-2021 |
Nico Weber <thakis@chromium.org> |
Revert "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 4fefed65637ec46c8c2edad6b07b5569ac61e9e5. Broke check-clang everywhere.
|
|
Revision tags: llvmorg-12.0.0-rc3 |
|
| #
4fefed65 |
| 06-Mar-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
OpaquePtr: Turn inalloca into a type attribute
I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inall
OpaquePtr: Turn inalloca into a type attribute
I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inalloca, so catch this up to the others so that can happen.
show more ...
|
| #
fa26da05 |
| 19-Mar-2021 |
Philip Reames <listmail@philipreames.com> |
Add a couple of missing attribute query methods [NFC]
|
|
Revision tags: llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
4479c0c2 |
| 13-Jan-2021 |
Juneyoung Lee <aqjune@gmail.com> |
Allow nonnull/align attribute to accept poison
Currently LLVM is relying on ValueTracking's `isKnownNonZero` to attach `nonnull`, which can return true when the value is poison. To make the semantic
Allow nonnull/align attribute to accept poison
Currently LLVM is relying on ValueTracking's `isKnownNonZero` to attach `nonnull`, which can return true when the value is poison. To make the semantics of `nonnull` consistent with the behavior of `isKnownNonZero`, this makes the semantics of `nonnull` to accept poison, and return poison if the input pointer isn't null. This makes many transformations like below legal:
``` %p = gep inbounds %x, 1 ; % p is non-null pointer or poison call void @f(%p) ; instcombine converts this to call void @f(nonnull %p) ```
Instead, this semantics makes propagation of `nonnull` to caller illegal. The reason is that, passing poison to `nonnull` does not immediately raise UB anymore, so such program is still well defined, if the callee does not use the argument. Having `noundef` attribute there re-allows this.
``` define void @f(i8* %p) { ; functionattr cannot mark %p nonnull here anymore call void @g(i8* nonnull %p) ; .. because @g never raises UB if it never uses %p. ret void } ```
Another attribute that needs to be updated is `align`. This patch updates the semantics of align to accept poison as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90529
show more ...
|
|
Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4 |
|
| #
d65a7003 |
| 23-Sep-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
OpaquePtr: Add helpers for sret to mirror byval
Sret should really have a type parameter like byval does.
|
|
Revision tags: llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
5e999cbe |
| 05-Jun-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some
IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some of the stack-copy implications in its definition.
This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch.
My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway.
This is intended avoid some optimization issues with the current handling of aggregate arguments, as well as fixes inflexibilty in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw, SSA Argument values and codegen is responsible for turning these into loads.
Background:
There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and this is only ever invoked by a driver of some kind.
It does not make sense to have a stack passed parameter in this context as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These argumets are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible. Semantically, a copy is always required from the constant argument memory to a mutable variable.
The current clang calling convention lowering emits raw values, including aggregates into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage.
This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's real no advantage to an aligment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments.
Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel, metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits betewen arguments are meaningless. Another family of problems is there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads.
The immediate plan is to start using this new attribute to handle all aggregate argumets for kernels. Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner.
Additional context is in D79744.
show more ...
|
| #
023883a8 |
| 29-Jun-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref
When the byref attribute is added, there will need to be two similar functions for the existing cases which have an associate valu
IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref
When the byref attribute is added, there will need to be two similar functions for the existing cases which have an associate value copy, and byref which does not. Most, but not all of the existing uses will use the existing version.
The associated size function added by D82679 also needs to contextually differ, and will help eliminate a few places still relying on pointee element types.
show more ...
|
| #
6f5d9136 |
| 26-Jun-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
OpaquePtr: Don't check pointee type for byval/preallocated
Since none of these users really care about the actual type, hide the type under a new size-getting attribute to go along with hasPassPoint
OpaquePtr: Don't check pointee type for byval/preallocated
Since none of these users really care about the actual type, hide the type under a new size-getting attribute to go along with hasPassPointeeByValueAttr. This will work better for the future byref attribute, which may end up only tracking the byte size and not the IR type.
We currently have 3 parameter attributes that should carry the type (technically inalloca does not yet). The APIs are somewhat awkward since preallocated/inalloca piggyback on byval in some places, but in others are treated as distinct attributes. Since these are all mutually exclusive, we should probably just merge all the attribute infrastructure treating these as totally distinct attributes.
show more ...
|
|
Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5 |
|
| #
8a887556 |
| 16-Mar-2020 |
Arthur Eubanks <aeubanks@google.com> |
Reland [X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes.
In X86TargetLowering::LowerCall(), if a call is preallocated, recor
Reland [X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes.
In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key.
This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.
The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key.
The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer fo the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key.
Force any function containing a preallocated call to use the frame pointer.
Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first.
Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases).
Aside from the tests added here, I checked that this codegen produces correct code for something like
``` struct A { A(); A(A&&); ~A(); };
void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ```
by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation.
Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77689
show more ...
|
| #
b8cbff51 |
| 20-May-2020 |
Arthur Eubanks <aeubanks@google.com> |
Revert "[X86] Codegen for preallocated"
This reverts commit 810567dc691a57c8c13fef06368d7549f7d9c064.
Some tests are unexpectedly passing
|
| #
810567dc |
| 16-Mar-2020 |
Arthur Eubanks <aeubanks@google.com> |
[X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes.
In X86TargetLowering::LowerCall(), if a call is preallocated, record each
[X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes.
In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key.
This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.
The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key.
The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer fo the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key.
Force any function containing a preallocated call to use the frame pointer.
Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first.
Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases).
Aside from the tests added here, I checked that this codegen produces correct code for something like
``` struct A { A(); A(A&&); ~A(); };
void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ```
by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77689
show more ...
|
| #
a90948fd |
| 30-Apr-2020 |
Arthur Eubanks <aeubanks@google.com> |
[NFC] Rename *ByValOrInalloca* to *PassPointeeByValue*
Summary: In preparation for preallocated.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D
[NFC] Rename *ByValOrInalloca* to *PassPointeeByValue*
Summary: In preparation for preallocated.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79152
show more ...
|