History log of /llvm-project/llvm/lib/Target/X86/X86FastPreTileConfig.cpp (Results 1 – 10 of 10)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# dfe43bd1 09-Nov-2024 Kazu Hirata <kazu@google.com>

[X86] Remove unused includes (NFC) (#115593)

Identified with misc-include-cleaner.


# c72a751d 01-Nov-2024 Phoebe Wang <phoebe.wang@intel.com>

[X86][AMX] Support AMX-TRANSPOSE (#113532)

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8
# 9a2c8418 12-Jun-2024 aengelke <engelke@in.tum.de>

[X86] Early exit MIR AMX passes when AMX is unused (#94989)

Follow-up of #94358. Do the checks even before calling getRegisterInfo
etc., because some of these are virtual function calls.


Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1
# cfd91199 26-Jan-2024 Evgenii Kudriashov <evgenii.kudriashov@intel.com>

[X86] Skip unused VRegs traverse (#78229)

Almost all loops with getNumVirtRegs skip unused registers by means
of reg_nodbg_empty or empty live interval. Except for these two cases
that are reveale

[X86] Skip unused VRegs traverse (#78229)

Almost all loops with getNumVirtRegs skip unused registers by means
of reg_nodbg_empty or empty live interval. Except for these two cases
that are revealed by GlobalISel since it can skip RegClass assignment
for unused registers.

Closes #64452, closes #71926

show more ...


Revision tags: llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4
# 3af0ff99 22-Oct-2023 Kazu Hirata <kazu@google.com>

[llvm] Stop including llvm/ADT/DepthFirstIterator.h (NFC)

Identified with misc-include-cleaner.


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6
# b5efec4b 24-Nov-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlot

With D134950, targets get notified when a virtual register is created and/or
cloned. Targets can do the needful wit

[CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlot

With D134950, targets get notified when a virtual register is created and/or
cloned. Targets can do the needful with the delegate callback. AMDGPU propagates
the virtual register flags maintained in the target file itself. They are useful
to identify a certain type of machine operands while inserting spill stores and
reloads. Since RegAllocFast spills the physical register itself, there is no way
its virtual register can be mapped back to retrieve the flags. It can be solved
by passing the virtual register as an additional argument. This argument has no
use when the spill interfaces are called during the greedy allocator or even the
PrologEpilogInserter and can pass a null register in such cases.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138656

show more ...


Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# 54ec8e25 15-Jun-2022 Luo, Yuanke <yuanke.luo@intel.com>

[X86][AMX] Fix klockwork issue.


Revision tags: llvmorg-14.0.5
# aaaf9ced 27-May-2022 Luo, Yuanke <yuanke.luo@intel.com>

[X86][AMX] Replace LDTILECFG with PLDTILECFGV on auto-config.

There is intrinsic `@llvm.x86.ldtilecfg` which is lowered to LDTILECFG.
This intrinsic is open for user to configure tile registers by
t

[X86][AMX] Replace LDTILECFG with PLDTILECFGV on auto-config.

There is intrinsic `@llvm.x86.ldtilecfg` which is lowered to LDTILECFG.
This intrinsic is open for user to configure tile registers by
themselves. There is a chance that `@llvm.x86.ldtilecfg` would be mixed
with the new AMX intrinsics which depend on compiler to configure tile
registers. Separate pusedo instruction PLDTILECFGV would avoid
unexpected behavious when `@llvm.x86.ldtilecfg` is mixed with new AMX
intrinsics. Though user should not mix the two programming model,
compiler should avoid crash or UB when they are mixed.

Differential Revision: https://reviews.llvm.org/D126519

show more ...


Revision tags: llvmorg-14.0.4
# 3b1de7ab 24-May-2022 Luo, Yuanke <yuanke.luo@intel.com>

[X86][AMX] Reduce the compiling time for non-amx code.

Differential Revision: https://reviews.llvm.org/D126280


# 496156ac 03-May-2022 Luo, Yuanke <yuanke.luo@intel.com>

[X86][AMX] Multiple configure for AMX register.

The previous solution depends on variable name to record the shape
information. However it is not reliable, because in release build
compiler would no

[X86][AMX] Multiple configure for AMX register.

The previous solution depends on variable name to record the shape
information. However it is not reliable, because in release build
compiler would not set the variable name. It can be accomplished with an
additional option `fno-discard-value-names`, but it is not acceptable
for users.
This patch is to preconfigure the tile register with machine
instruction. It follow the same way what sigle configure does. In the
future we can fall back to multiple configure when single configure
fails due to the shape dependency issue.
The algorithm to configure the tile register is simple in the patch. We
may improve it in the future. It configure tile register based on basic
block. Compiler would spill the tile register if it live out the basic
block. After the configure there should be no spill across tile
confgiure in the register alloction. Just like fast register allocation
the algorithm walk the instruction in reverse order. When the shape
dependency doesn't meet, it insert ldtilecfg after the last instruction
that define the shape.
In post configuration compiler also walk the basic block to collect the
physical tile register number and generate instruction to fill the stack
slot for the correponding shape information.
TODO: There is some following work in D125602. The risk is modifying the
fast RA may cause regression as fast RA is usded for different targets.
We may create an independent RA for tile register.

Differential Revision: https://reviews.llvm.org/D125075

show more ...