History log of /llvm-project/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp (Results 1 – 25 of 136)
Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# de12836f 09-Jan-2025 choikwa <5455710+choikwa@users.noreply.github.com>

[AMDGPU] Rework getDivNumBits API (#119768)

Rework involves below:
- Return unsigned value, the number of div/rem bits actually needed.
- Change from AtLeast(SignBits) to MaxDivBits hint.
- Use MaxDivBits hint for unsigned case.
- Remove unnecessary second early exit.

Mostly NFC changes.


# 8d2e6118 07-Jan-2025 choikwa <5455710+choikwa@users.noreply.github.com>

[AMDGPU] Calculate getDivNumBits' AtLeast using bitwidth (#121758)

Previously, shrinkDivRem64 used a fixed value of 32 for AtLeast, which
meant that divisions narrower than 64 bits were rejected from shrinking,
since the logic depended only on the number of sign bits. For example,
'idiv i48 %0, %1' would return 24 for the number of sign bits if %0 and
%1 both had 24 division bits, and was rejected.
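
The i48 case described above can be modelled with a small standalone C++ sketch. This is not the code in AMDGPUCodeGenPrepare.cpp: `numSignBits`, `passesFixedThreshold` and `passesWidthThreshold` are hypothetical helpers, the sign-bit count is taken from a concrete value rather than from static analysis, and the width-relative threshold (`width - 32`) is only an assumption about what "using bitwidth" might look like.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading bits equal to the sign bit of a `width`-bit
// two's-complement value held in the low bits of `v` (always >= 1).
unsigned numSignBits(std::uint64_t v, unsigned width) {
  assert(width >= 1 && width <= 64);
  const unsigned sign = static_cast<unsigned>((v >> (width - 1)) & 1u);
  unsigned n = 1;
  for (int i = static_cast<int>(width) - 2; i >= 0; --i) {
    if (static_cast<unsigned>((v >> i) & 1u) != sign)
      break;
    ++n;
  }
  return n;
}

// The old check as described in the message: a fixed threshold of 32.
bool passesFixedThreshold(unsigned signBits) { return signBits >= 32; }

// Hypothetical width-relative check (an assumption, not the actual patch):
// require only as many sign bits as the type has bits beyond 32.
bool passesWidthThreshold(unsigned signBits, unsigned width) {
  const unsigned atLeast = width > 32 ? width - 32 : 0;
  return signBits >= atLeast;
}
```

Under these assumptions, a 48-bit operand with 24 significant bits such as `0xFFFFFF` has 24 sign bits, so it fails the fixed threshold of 32 but passes the width-relative one.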


Revision tags: llvmorg-19.1.6
# 4a0d53a0 13-Dec-2024 Ramkumar Ramachandra <ramkumar.ramachandra@codasip.com>

PatternMatch: migrate to CmpPredicate (#118534)

With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.


# 463e93b9 12-Dec-2024 choikwa <5455710+choikwa@users.noreply.github.com>

Reapply [AMDGPU] prevent shrinking udiv/urem if either operand exceeds signed max (#119325)

This reverts commit 254d206ee2a337cb38ba347c896f7c6a14c7f218.

Also added a fix in ExpandDivRem24 to disqualify the transform if DivNumBits exceeds 24.

Original commit & msg:
ce6e955ac374f2b86cbbb73b2f32174dffd85f25.
Handle signed and unsigned path differently in getDivNumBits. Using
computeKnownBits, this rejects shrinking unsigned div/rem if operands
exceed signed max since we know NumSignBits will be always 0.


# 254d206e 09-Dec-2024 Joseph Huber <huberjn@outlook.com>

Revert "Reapply "[AMDGPU] prevent shrinking udiv/urem if either operand is in… (#118928)"

This reverts commit 509893b58ff444a6f080946bd368e9bde7668f13.

This broke the libc build again https://lab.llvm.org/buildbot/#/builders/73/builds/9787.


# 509893b5 07-Dec-2024 choikwa <5455710+choikwa@users.noreply.github.com>

Reapply "[AMDGPU] prevent shrinking udiv/urem if either operand is in… (#118928)

… (SignedMax,UnsignedMax] (#116733)"

This reverts commit 905e831f8c8341e53e7e3adc57fd20b8e08eb999.

Handle signed and unsigned path differently in getDivNumBits. Using
computeKnownBits, this rejects shrinking unsigned div/rem if operands
exceed signed max since we know NumSignBits will be always 0.

Rebased and reattempted after the first attempt was reverted due to an
unrelated failure in libc (which reportedly should be fixed by now).


# 9ad09b29 03-Dec-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Refine AMDGPUCodeGenPrepareImpl class. NFC. (#118461)

Use references instead of pointers for most state, initialize it all in
the constructor, and common up some of the initialization between the
legacy and new pass manager paths.


Revision tags: llvmorg-19.1.5
# 3923e045 28-Nov-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Preserve all analyses if nothing changed (#117994)


# 905e831f 21-Nov-2024 Joseph Huber <huberjn@outlook.com>

Revert "[AMDGPU] prevent shrinking udiv/urem if either operand is in (SignedMax,UnsignedMax] (#116733)"

This reverts commit b8e1d4dbea8905e48d51a70bf75cb8fababa4a60.

Causes failures on the `libc` test suite https://lab.llvm.org/buildbot/#/builders/73/builds/8871


# b8e1d4db 20-Nov-2024 choikwa <5455710+choikwa@users.noreply.github.com>

[AMDGPU] prevent shrinking udiv/urem if either operand is in (SignedMax,UnsignedMax] (#116733)

Do this by using ComputeKnownBits and checking for !isNonNegative and
isUnsigned. This rejects shrinking unsigned div/rem if operands exceed
smax_bitwidth since we know NumSignBits will be always 0.
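
Why operands above the signed maximum matter for unsigned division can be seen in a small standalone C++ sketch. It uses concrete 64-bit values in place of the static known-bits analysis the message refers to, and `isNonNegative64`, `udivNarrow32` and `passesUnsignedGuard` are hypothetical names chosen for illustration, not functions from the pass.

```cpp
#include <cstdint>

// True when the top bit is clear, i.e. the value does not exceed the
// signed maximum of the 64-bit type.
bool isNonNegative64(std::uint64_t v) { return (v >> 63) == 0; }

// What a (wrongly) shrunk 32-bit unsigned division would compute.
// Precondition: the low 32 bits of `b` are non-zero.
std::uint64_t udivNarrow32(std::uint64_t a, std::uint64_t b) {
  return static_cast<std::uint32_t>(a) / static_cast<std::uint32_t>(b);
}

// Hypothetical guard in the spirit of the message: a necessary (not
// sufficient) condition before reasoning about an unsigned operand
// in terms of sign bits.
bool passesUnsignedGuard(std::uint64_t a, std::uint64_t b) {
  return isNonNegative64(a) && isNonNegative64(b);
}
```

For example, `UINT64_MAX` reads as -1 when viewed as a signed value, so it looks as if it needs very few bits; yet `UINT64_MAX / 2` is `0x7FFFFFFFFFFFFFFF`, whereas the narrowed division yields `0x7FFFFFFF`. The guard rejects this operand pair.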


Revision tags: llvmorg-19.1.4, llvmorg-19.1.3
# 85c17e40 17-Oct-2024 Jay Foad <jay.foad@amd.com>

[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)

Convert many instances of:
  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.


Revision tags: llvmorg-19.1.2
# fa789dff 11-Oct-2024 Rahul Joshi <rjoshi@nvidia.com>

[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)

Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation for
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e., just lookup, no creation).


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 79516ddb 02-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix assert from wrong address space size assumption (#97267)

This was assuming the source address space was at least as large
as the destination of the cast. I'm not sure why this was casting
to begin with; the assumption seems to be the source
address space from the root addrspacecast matches the underlying
object so directly check that.

Fixes #97457


# d75f9dd1 24-Jun-2024 Stephen Tozer <stephen.tozer@sony.com>

Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"

Reverts the above commit, as it updates a common header function and
did not update all callsites:

https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.


# 6481dc57 24-Jun-2024 Stephen Tozer <stephen.tozer@sony.com>

[IR][NFC] Update IRBuilder to use InsertPosition (#96497)

Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3
# 0a43ca73 27-Mar-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Fix missing `IsExact` flag when expanding vector binary operator (#86712)


# 4a026b50 20-Mar-2024 Peter Rong <peterrong96@gmail.com>

[AMDGCN] Use ZExt when handling indices in insertment element (#85718)

When i1 true is used as an index, SExt extends it to i32 -1. This would
cause BitVector to overflow.
The language reference specifies that the index shall be treated as an
unsigned number; this patch fixes that.
(https://llvm.org/docs/LangRef.html#insertelement-instruction)
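
The difference between the two extensions for a 1-bit index can be reproduced in a standalone C++ sketch. The helpers `zext32` and `sext32` are hypothetical stand-ins for the IR `zext` and `sext` operations, written here only to show the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend the low `bits` bits of `v` to 32 bits.
std::uint32_t zext32(std::uint32_t v, unsigned bits) {
  assert(bits >= 1 && bits <= 32);
  return bits == 32 ? v : (v & ((1u << bits) - 1u));
}

// Sign-extend the low `bits` bits of `v` to 32 bits.
std::int32_t sext32(std::uint32_t v, unsigned bits) {
  assert(bits >= 1 && bits <= 32);
  const std::uint32_t signMask = 1u << (bits - 1);
  const std::uint32_t low = zext32(v, bits);
  // Flip the sign bit, then subtract it back to propagate it upward.
  return static_cast<std::int32_t>((low ^ signMask) - signMask);
}
```

With a 1-bit input of `1` (i1 true), `sext32` gives -1, which is far out of range when later used as an unsigned index, while `zext32` gives 1.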

This patch fixes #85717

---------

Signed-off-by: Peter Rong <PeterRong96@gmail.com>


Revision tags: llvmorg-18.1.2
# 3ab1481f 18-Mar-2024 Orlando Cazalet-Hyams <orlando.hyams@sony.com>

[RemoveDIs] Use getFirstNonPHIIt to fix crash #85472 (#85618)


Revision tags: llvmorg-18.1.1
# 756166e3 01-Mar-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Improve detection of non-null addrspacecast operands (#82311)

Use IR analysis to infer when an addrspacecast operand is nonnull, then
lower it to an intrinsic that the DAG can use to skip the null check.

I did this using an intrinsic as it's non-intrusive. An alternative
would have been to allow something like `!nonnull` on `addrspacecast`
then lower that to a custom opcode (or add an operand to the
addrspacecast MIR/DAG opcodes), but it's a lot of boilerplate for just
one target's use case IMO.

I'm hoping that when we switch to GISel that we can move all this logic
to the MIR level without losing info, but currently the DAG doesn't see
enough so we need to act in CGP.

Fixes: SWDEV-316445


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2
# e5638c5a 06-Feb-2024 choikwa <5455710+choikwa@users.noreply.github.com>

[AMDGPU] Use correct number of bits needed for div/rem shrinking (#80622)

There was an error where a dividend of type i64 that actually uses 32
bits fell into the path that assumes only 24 bits are used. Check
that the AtLeast field is used correctly when using computeNumSignBits and
add the necessary extend/trunc for the 32-bit path.

Regolden and update testcases.

@jrbyrnes @bcahoon @arsenm @rampitec


# 930996e9 05-Feb-2024 Yingwei Zheng <dtcxzyw2333@gmail.com>

[ValueTracking][NFC] Pass `SimplifyQuery` to `computeKnownFPClass` family (#80657)

This patch refactors the interface of the `computeKnownFPClass` family
to pass `SimplifyQuery` directly.
The motivation of this patch is to compute known fpclass with
`DomConditionCache`, which was introduced by
https://github.com/llvm/llvm-project/pull/73662. With
`DomConditionCache`, we can do more optimization with context-sensitive
information.

Example (extracted from
[fmt/format.h](https://github.com/fmtlib/fmt/blob/e17bc67547a66cdd378ca6a90c56b865d30d6168/include/fmt/format.h#L3555-L3566)):
```
define float @test(float %x, i1 %cond) {
  %i32 = bitcast float %x to i32
  %cmp = icmp slt i32 %i32, 0
  br i1 %cmp, label %if.then1, label %if.else

if.then1:
  %fneg = fneg float %x
  br label %if.end

if.else:
  br i1 %cond, label %if.then2, label %if.end

if.then2:
  br label %if.end

if.end:
  %value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
  %ret = call float @llvm.fabs.f32(float %value)
  ret float %ret
}
```
We can prove the signbit of `%value` is always zero. Then the fabs can
be eliminated.
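
The claim can be checked at the value level with a standalone C++ sketch. It assumes IEEE-754 single-precision `float`, and `valueBeforeFabs` is a hypothetical function that mirrors only the data flow of the IR (the `%cond` branch is omitted because both of its arms carry `%x`).

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Mirrors the data flow of the IR: negate x when its sign bit is set,
// otherwise pass x through unchanged.
float valueBeforeFabs(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);   // bitcast float %x to i32
  const bool signSet = (bits >> 31) != 0; // same as: icmp slt i32 %i32, 0
  return signSet ? -x : x;                // fneg on one path, x on the others
}
```

On every path the result has a clear sign bit, including for negative zero, so applying `std::fabs` to it changes nothing.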


Revision tags: llvmorg-18.1.0-rc1, llvmorg-19-init
# fac093dd 13-Dec-2023 Piotr Sobczak <piotr.sobczak@amd.com>

[AMDGPU] Update IEEE and DX10_CLAMP for GFX12 (#75030)

Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>


# 8a66510f 30-Nov-2023 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Don't create mulhi_24 in CGP (#72983)

Instead, create a mul24 with a 64 bit result and let ISel take care of
it.

This allows patterns to simply match mul24 even for 64-bit muls instead of having to match both mul/mulhi and a buildvector/bitconvert/etc.


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0
# 231aa0f2 13-Sep-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Avoid creating vector extracts if we aren't going to do anything

Try to avoid expensive checks failures from reporting no changes
when some dead instructions were introduced.


Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# 72a7024a 16-Aug-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Correctly lower llvm.sqrt.f32

Make codegen emit correctly rounded sqrt by default.

Emit the fast but only kind of fast expansion in AMDGPUCodeGenPrepare
based on !fpmath, like the fdiv case. Hack around visitation ordering
problems from AMDGPUCodeGenPrepare using forward iteration instead of
a well behaved combiner.

https://reviews.llvm.org/D158129

