History log of /llvm-project/llvm/lib/Target/X86/X86PreTileConfig.cpp (Results 26 – 37 of 37)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# a3b52a9d 14-Apr-2021 Wang, Pengfei <pengfei.wang@intel.com>

[X86][AMX] Refactor for PostRA ldtilecfg pass.

This is a follow up of D99010. We didn't consider the live range of shape registers when hoist ldtilecfg. There maybe risks, e.g. we happen to insert i

[X86][AMX] Refactor for PostRA ldtilecfg pass.

This is a follow up of D99010. We didn't consider the live range of shape registers when hoist ldtilecfg. There maybe risks, e.g. we happen to insert it to an invalid range of some registers and get unexpected error.

This patch fixes this problem by storing the value to corresponding stack place of ldtilecfg after all its definition immediately.

This patch also fix a problem in previous code: If we don't have a ldtilecfg which dominates all AMX instructions, we cannot initialize shapes for other ldtilecfg.

There're still some optimization points left. E.g. eliminate unused mov instructions, break the def-use dependency before RA etc.

Reviewed By: LuoYuanke, xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D99966

show more ...


# 4cbaaf4a 12-Apr-2021 Wang, Pengfei <pengfei.wang@intel.com>

[X86][AMX] Hoist ldtilecfg

The previous code calculated the first ldtilecfg by dominating all AMX registers' def. This may result in the ldtilecfg being inserted into a loop.

This patch try to calc

[X86][AMX] Hoist ldtilecfg

The previous code calculated the first ldtilecfg by dominating all AMX registers' def. This may result in the ldtilecfg being inserted into a loop.

This patch try to calculate the nearest point where all shapes of AMX registers are reachable.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D99010

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 2327513b 20-Mar-2021 Wang, Pengfei <pengfei.wang@intel.com>

[X86] Fix a bug when calculating the ldtilecfg insertion points.

The BB we initialized the ldtilecfg is special. We don't need to check
if its predecessor BBs need to insert ldtilecfg for calls.

We

[X86] Fix a bug when calculating the ldtilecfg insertion points.

The BB we initialized the ldtilecfg is special. We don't need to check
if its predecessor BBs need to insert ldtilecfg for calls.

We reused the flag HasCallBeforeAMX, so that the predecessors won't be
added to CfgNeedInsert.

This case happens only when the entry BB is in a loop. We need to hoist
the first tile config point out of the loop in future.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D98845

show more ...


# 0002d4bf 18-Mar-2021 Bing1 Yu <bing1.yu@intel.com>

[X86][AMX][NFC] Give correct Passname for Tile Register Pre-configure


Revision tags: llvmorg-12.0.0-rc3
# 4bc7c863 24-Feb-2021 Liu, Chen3 <chen3.liu@intel.com>

[X86] Support amx-bf16 intrinsic.

Adding support for intrinsics of AMX-BF16.
This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong
predicate.

Differential Revision: https

[X86] Support amx-bf16 intrinsic.

Adding support for intrinsics of AMX-BF16.
This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong
predicate.

Differential Revision: https://reviews.llvm.org/D97358

show more ...


Revision tags: llvmorg-12.0.0-rc2
# f8b9035a 23-Feb-2021 Liu, Chen3 <chen3.liu@intel.com>

[X86] Support amx-int8 intrinsic.

Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD.

Differential Revision: https://reviews.llvm.org/D97259


# e9c11c19 18-Feb-2021 Wang, Pengfei <pengfei.wang@intel.com>

[X86] Zero AMX config buffer for non AVX512 cases.

Zero AMX config buffer for non AVX512 cases.

Differential Revision: https://reviews.llvm.org/D96927


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3
# a5d9e0c7 30-Jan-2021 Wang, Pengfei <pengfei.wang@intel.com>

[X86] Fix tile config register spill issue.

This is an optimized approach for D94155.

Previous code build the model that tile config register is the user of
each AMX instruction. There is a problem

[X86] Fix tile config register spill issue.

This is an optimized approach for D94155.

Previous code build the model that tile config register is the user of
each AMX instruction. There is a problem for the tile config register
spill. When across function, the ldtilecfg instruction may be inserted
on each AMX instruction which use tile config register. This cause all
tile data register clobber.

To fix this issue, we remove the model of tile config register. Instead,
we analyze the AMX instructions between one call to another. We will
insert ldtilecfg after the first call if we find any AMX instructions.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D95136

show more ...


Revision tags: llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 64132f54 21-Jan-2021 Luo, Yuanke <yuanke.luo@intel.com>

Revert "[X86][AMX] Fix tile config register spill issue."

This reverts commit 20013d02f3352a88d0838eed349abc9a2b0e9cc0.


Revision tags: llvmorg-11.1.0-rc1
# 20013d02 05-Jan-2021 Luo, Yuanke <yuanke.luo@intel.com>

[X86][AMX] Fix tile config register spill issue.

Previous code build the model that tile config register is the user of
each AMX instruction. There is a problem for the tile config register
spill. W

[X86][AMX] Fix tile config register spill issue.

Previous code build the model that tile config register is the user of
each AMX instruction. There is a problem for the tile config register
spill. When across function, the ldtilecfg instruction may be inserted
on each AMX instruction which use tile config register. This cause all
tile data register clobber.
To fix this issue, we remove the model of tile config register. We
analyze the regmask of call instruction and insert ldtilecfg if there is
any tile data register live across the call. Inserting the sttilecfg
before the call is unneccessary, because the tile config doesn't change
and we can just reload the config.
Besides we also need check tile config register interference. Since we
don't model the config register we should check interference from the
ldtilecfg to each tile data register def.
ldtilecfg
/ \
BB1 BB2
/ \
call BB3
/ \
%1=tileload %2=tilezero
We can start from the instruction of each tile def, and backward to
ldtilecfg. If there is any call instruction, and tile data register is
not preserved, we should insert ldtilecfg after the call instruction.

Differential Revision: https://reviews.llvm.org/D94155

show more ...


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 08665b18 08-Dec-2020 Luo, Yuanke <yuanke.luo@intel.com>

Support tilezero intrinsic and c interface for AMX.

Differential Revision: https://reviews.llvm.org/D92837


Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# f80b2987 06-Sep-2020 Luo, Yuanke <yuanke.luo@intel.com>

[X86] AMX programming model.
This patch implements amx programming model that discussed in llvm-dev
(http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html).
Thank Hal for the good sugge

[X86] AMX programming model.
This patch implements amx programming model that discussed in llvm-dev
(http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html).
Thank Hal for the good suggestion in the RA. The fast RA is not in the patch yet.
This patch implemeted 7 components.

1. The c interface to end user.
2. The AMX intrinsics in LLVM IR.
3. Transform load/store <256 x i32> to AMX intrinsics or split the
type into two <128 x i32>.
4. The Lowering from AMX intrinsics to AMX pseudo instruction.
5. Insert psuedo ldtilecfg and build the def-use between ldtilecfg to amx
intruction.
6. The register allocation for tile register.
7. Morph AMX pseudo instruction to AMX real instruction.

Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0

Differential Revision: https://reviews.llvm.org/D87981

show more ...


12