Revision tags: llvmorg-21-init |
|
#
48803bc8 |
| 17-Jan-2025 |
Phoebe Wang <phoebe.wang@intel.com> |
[X86][AMX-AVX512][NFC] Remove P from intrinsic and instruction name (#123270)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
8f440137 |
| 09-Nov-2024 |
Phoebe Wang <phoebe.wang@intel.com> |
Reland "[X86][AMX] Support AMX-AVX512" (#115581)
Resolve compile fail without SSE2.
|
#
ff225154 |
| 09-Nov-2024 |
Alan Zhao <ayzhao@google.com> |
Revert "[X86][AMX] Support AMX-AVX512" (#115570)
Reverts llvm/llvm-project#114070
Reason: Causes `immintrin.h` to fail to compile if `-msse` and
`-mno-sse2` are passed to clang:
https://github.com/llvm/llvm-project/pull/114070#issuecomment-2465926700
|
#
58a17e1b |
| 08-Nov-2024 |
Phoebe Wang <phoebe.wang@intel.com> |
[X86][AMX] Support AMX-AVX512 (#114070)
|
#
c72a751d |
| 01-Nov-2024 |
Phoebe Wang <phoebe.wang@intel.com> |
[X86][AMX] Support AMX-TRANSPOSE (#113532)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
0f0cfcff |
| 19-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Avoid some references to MachineFunction's getMMI (#99652)
MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.
|
#
79d0de2a |
| 09-Jul-2024 |
paperchalice <liujunchang97@outlook.com> |
[CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793)
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
|
#
4169338e |
| 28-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Don't include Module.h in Analysis.h (NFC) (#97023)
Replace it with a forward declaration instead. Analysis.h is pulled in
by all passes, but not all passes need to access the module.
|
Revision tags: llvmorg-18.1.8 |
|
#
55bc04f6 |
| 12-Jun-2024 |
aengelke <engelke@in.tum.de> |
[X86] Replace hasVirtualTileReg with AMXProgModel (#95105)
Cleanup after AMXProgModel introduction. AMXProgModel is ManagedRA
whenever virtual tile registers exist at some point.
|
#
9a2c8418 |
| 12-Jun-2024 |
aengelke <engelke@in.tum.de> |
[X86] Early exit MIR AMX passes when AMX is unused (#94989)
Follow-up of #94358. Do the checks even before calling getRegisterInfo
etc., because some of these are virtual function calls.
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
4d0f1e32 |
| 11-Aug-2023 |
Elliot Goodrich <elliotgoodrich@gmail.com> |
[llvm] Remove SmallSet from MachineInstr.h
`MachineInstr.h` is a commonly included file and this includes `llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is used only in one place.
According to `ClangBuildAnalyzer` (run solely on building LLVM, no other projects) the second most expensive template to instantiate is the `SmallSet::insert` method used in the `inline` implementation in `getUsedDebugRegs()`:
```
**** Templates that took longest to instantiate:
554239 ms: std::unordered_map<int, int> (2826 times, avg 196 ms)
521187 ms: llvm::SmallSet<llvm::Register, 4>::insert (930 times, avg 560 ms)
...
```
By removing this method and putting its implementation in the one call site we greatly reduce the template instantiation time and reduce the number of includes.
When copying the implementation, I removed a check on `MO.getReg()` as this is checked within `MO.isVirtual()`.
Differential Revision: https://reviews.llvm.org/D157720
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
1e4de2d2 |
| 26-Apr-2023 |
Craig Topper <craig.topper@sifive.com> |
[X86] Remove unused SmallString. Fold a Twine local variable into call. NFC
|
Revision tags: llvmorg-16.0.2 |
|
#
caa9d6e2 |
| 15-Apr-2023 |
Akshay Khadse <akshayskhadse@gmail.com> |
Fix uninitialized pointer members in Target/X86
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148312
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5 |
|
#
aaaf9ced |
| 27-May-2022 |
Luo, Yuanke <yuanke.luo@intel.com> |
[X86][AMX] Replace LDTILECFG with PLDTILECFGV on auto-config.
There is an intrinsic, `@llvm.x86.ldtilecfg`, which is lowered to LDTILECFG. This intrinsic lets users configure tile registers themselves. `@llvm.x86.ldtilecfg` may end up mixed with the new AMX intrinsics, which depend on the compiler to configure tile registers. A separate pseudo instruction, PLDTILECFGV, avoids unexpected behavior when `@llvm.x86.ldtilecfg` is mixed with the new AMX intrinsics. Although users should not mix the two programming models, the compiler should avoid crashes or UB when they are mixed.
Differential Revision: https://reviews.llvm.org/D126519
|
Revision tags: llvmorg-14.0.4 |
|
#
373ce147 |
| 04-May-2022 |
Luo, Yuanke <yuanke.luo@intel.com> |
[X86][AMX] Replace PXOR instruction with SET0 in AMX pre config.
To generate a zero value, the PXOR instruction needs 3 operands that are tied to the same vreg. This is not good in SSA form, and with undef values the two-address instruction pass may convert `%0:vr128 = PXORrr undef %0, undef %0` to `%1:vr128 = PXORrr undef %1:vr128(tied-def 0), undef %0:vr128`, which is not expected. It can be simplified to a SET0 instruction, which takes only 1 destination operand: `%0:vr128 = V_SET0`. This should be friendlier to the two-address instruction pass and the register allocation pass. Also add an AVX1 code path so that it is consistent with the other code.
Differential Revision: https://reviews.llvm.org/D124903
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2 |
|
#
f3ad7ea0 |
| 24-Apr-2022 |
Luo, Yuanke <yuanke.luo@intel.com> |
[X86][AMX] Report error when shapes are not pre-defined.
Instead of reporting a fatal error, this patch emits an error message and exits when shapes are not pre-defined. Compilation then fails but does not crash.
Differential Revision: https://reviews.llvm.org/D124342
|
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
c4dba471 |
| 17-Nov-2021 |
Luo, Yuanke <yuanke.luo@intel.com> |
[X86][AMX] Don't emit tilerelease for old AMX instrisic.
We should avoid mixing the old AMX intrinsics with the new AMX intrinsics. For the old AMX intrinsics, the user is responsible for invoking tile release. This patch checks whether any tile config is generated by the compiler. If so, it emits the tilerelease instruction; otherwise it does not.
Differential Revision: https://reviews.llvm.org/D114066
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
#
db23f277 |
| 17-Sep-2021 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[X86] X86PreTileConfig - Use const-ref iterator in for-range loop. NFCI.
Avoid unnecessary copies, reported by MSVC static analyzer.
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
#
4ed2b6cc |
| 26-May-2021 |
Luo, Yuanke <yuanke.luo@intel.com> |
[X86][AMX] Fix a bug on tile config.
The previous code detected whether an MBB is a bottom block to determine whether it is a backedge of a loop. We should check the latch block instead of the bottom block, and we should check that the header and the bottom block are in the same loop.
Differential Revision: https://reviews.llvm.org/D103145
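The difference between a bottom block and a latch block can be illustrated with a toy model. This is only a sketch and not the pass's code: `ToyBlock`, `ToyLoop`, `bottomBlock`, and `isLatch` are hypothetical names, blocks are identified by plain integers, and "bottom" is assumed to mean the last loop block in layout order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy block: a number plus the numbers of its successors.
struct ToyBlock {
  int Number;
  std::vector<int> Succs;
};

// Hypothetical toy loop: the header's number and the member blocks in
// layout order. Blocks is assumed to be non-empty.
struct ToyLoop {
  int Header;
  std::vector<int> Blocks;
};

// True if the block number belongs to the loop.
inline bool contains(const ToyLoop &L, int Number) {
  return std::find(L.Blocks.begin(), L.Blocks.end(), Number) != L.Blocks.end();
}

// The bottom block is modeled as the last loop block in layout order.
inline int bottomBlock(const ToyLoop &L) { return L.Blocks.back(); }

// A latch is a block inside the loop that branches back to the header.
inline bool isLatch(const ToyLoop &L, const ToyBlock &B) {
  if (!contains(L, B.Number))
    return false; // a block outside the loop never forms its backedge
  return std::find(B.Succs.begin(), B.Succs.end(), L.Header) != B.Succs.end();
}
```

In a loop with header 1 and layout order 1, 2, 3, where block 2 branches back to 1 and block 3 only branches to 2, the bottom block is 3 but the latch is 2, so testing the bottom block alone would miss the backedge.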
|
Revision tags: llvmorg-12.0.1-rc1 |
|
#
f69adfb8 |
| 28-Apr-2021 |
Wang, Pengfei <pengfei.wang@intel.com> |
[X86][AMX][NFC] Add more comments and remove unnecessary check found by Clocwork
|
#
016092d7 |
| 27-Apr-2021 |
Wang, Pengfei <pengfei.wang@intel.com> |
Reapply "[X86][AMX] Try to hoist AMX shapes' def"
We require no intersections between AMX instructions and their shapes' defs when we insert ldtilecfg. However, this is not always true, both because users don't follow the AMX API model and because of optimizations.
This patch adds a mechanism that tries to hoist AMX shapes' defs as well. It only hoists shapes inside a BB; we can improve it for cases across BBs in the future. Currently, it only hoists shapes whose sources are all defined above the first AMX instruction. We can improve it for the case where the only source below the AMX instruction is one that moves an immediate value to a register.
Reviewed By: xiangzhangllvm
Differential Revision: https://reviews.llvm.org/D101067
|
#
caea37b3 |
| 23-Apr-2021 |
Mitch Phillips <31459023+hctim@users.noreply.github.com> |
Revert "[X86][AMX] Try to hoist AMX shapes' def"
This reverts commit 90118563ad0f133c696e070ad72761fa0daa4517.
Reason: Broke the MSan buildbots. https://lab.llvm.org/buildbot/#/builders/5/builds/6967/steps/9/logs/stdio
More details can be found in the original phabricator review: https://reviews.llvm.org/D101067
|
#
151e244f |
| 23-Apr-2021 |
Wang, Pengfei <pengfei.wang@intel.com> |
[X86][AMX][NFC] Make comparison operators to be complete
The previous D101039 didn't fix the SmallSet insertion issue, because we always return false for the comparison between 2 different non-null BBs. This patch makes the comparison complete by comparing `MBB` first, so that we always get an invariant order from a single operator.
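Why an incomplete comparison breaks ordered insertion can be shown with a small stand-alone sketch. It does not reproduce the pass's `MIRef`: `Block`, `Ref`, `incompleteLess`, `CompleteLess`, and `countDistinct` are hypothetical names, `std::set` stands in for an ordered container, and blocks are assumed to carry unique numbers that the sketch uses for ordering.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for a basic block; Number is assumed to be unique.
struct Block {
  int Number;
};

// Hypothetical stand-in for a reference to an instruction: a block plus a
// position inside it. A null BB models an uninitialized reference.
struct Ref {
  const Block *BB;
  std::size_t Pos;
};

// Incomplete comparison: answers false whenever the blocks differ, so two
// references in different blocks look equivalent to an ordered container.
inline bool incompleteLess(const Ref &L, const Ref &R) {
  return L.BB == R.BB && L.Pos < R.Pos;
}

// Complete comparison: order by block first, then by position. A null block
// sorts before every real block.
struct CompleteLess {
  bool operator()(const Ref &L, const Ref &R) const {
    if (L.BB != R.BB) {
      if (!L.BB)
        return true;
      if (!R.BB)
        return false;
      return L.BB->Number < R.BB->Number;
    }
    return L.Pos < R.Pos;
  }
};

// Inserts one reference per block and returns how many the set retained.
inline std::size_t countDistinct(const Block &A, const Block &B) {
  std::set<Ref, CompleteLess> Refs;
  Refs.insert(Ref{&A, 3});
  Refs.insert(Ref{&B, 1});
  return Refs.size();
}
```

With `incompleteLess`, neither of two references in different blocks compares less than the other, so an ordered container would treat them as the same element. With `CompleteLess`, `countDistinct` keeps both references.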
|
#
90118563 |
| 22-Apr-2021 |
Wang, Pengfei <pengfei.wang@intel.com> |
[X86][AMX] Try to hoist AMX shapes' def
We require no intersections between AMX instructions and their shapes' defs when we insert ldtilecfg. However, this is not always true, both because users don't follow the AMX API model and because of optimizations.
This patch adds a mechanism that tries to hoist AMX shapes' defs as well. It only hoists shapes inside a BB; we can improve it for cases across BBs in the future. Currently, it only hoists shapes whose sources are all defined above the first AMX instruction. We can improve it for the case where the only source below the AMX instruction is one that moves an immediate value to a register.
Differential Revision: https://reviews.llvm.org/D101067
|
#
aafb6d81 |
| 22-Apr-2021 |
Wang, Pengfei <pengfei.wang@intel.com> |
[X86][AMX][NFC] Remove assert for comparison between different BBs.
SmallSet may use operator `<` when we insert MIRef elements, so we cannot forbid the comparison between different BBs.
We allow MIRef() to be less than any initialized MIRef object; otherwise, we always return false when comparing between different BBs.
Differential Revision: https://reviews.llvm.org/D101039
|