Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6 |
|
#
009368f1 |
| 09-Dec-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Mark grid size loads with range metadata (#113019)
Only handles the v5 case.
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
be187369 |
| 14-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[AMDGPU] Remove unused includes (NFC) (#116154)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3 |
|
#
6924fc03 |
| 16-Oct-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[LLVM] Add `Intrinsic::getDeclarationIfExists` (#112428)
Add `Intrinsic::getDeclarationIfExists` to lookup an existing
declaration of an intrinsic in a `Module`.
|
Revision tags: llvmorg-19.1.2 |
|
#
8d13e7b8 |
| 03-Oct-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Qualify auto. NFC. (#110878)
Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
62e9f409 |
| 29-Jul-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[PatternMatch] Use `m_SpecificCmp` matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7b
[PatternMatch] Use `m_SpecificCmp` matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
stockfish/movegen.ll 2541620819 2538599412 -0.12%
minetest/profiler.cpp.ll 431724935 431246500 -0.11%
abc/luckySwap.c.ll 581173720 580581935 -0.10%
abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
show more ...
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
9df71d76 |
| 28-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, re
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
bc82cfb3 |
| 21-Jan-2024 |
Emma Pilkington <emma.pilkington95@gmail.com> |
[AMDGPU] Add an asm directive to track code_object_version (#76267)
Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the ass
[AMDGPU] Add an asm directive to track code_object_version (#76267)
Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the assumed
COV for the rest of the asm file.
This commit also weakens the --amdhsa-code-object-version CL flag.
Previously, the CL flag took precedence over the IR flag. Now the IR
flag/asm directive take precedence over the CL flag. This is implemented
by merging a few COV-checking functions in AMDGPUBaseInfo.h.
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5 |
|
#
d08496b5 |
| 01-Nov-2023 |
Nikita Popov <npopov@redhat.com> |
[AMDGPULowerKernelAttribute] Avoid use of ConstantExpr::getIntegerCast() (NFC)
We are working on ConstantInt here, so folding will always succeed. This just avoids use of the ConstantExpr API.
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3 |
|
#
7ca3444f |
| 10-Feb-2023 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Use module flag to get code object version at IR level folow-up
Summary: This is part of the leftover work for https://reviews.llvm.org/D143138. In this work, we pass code object version a
AMDGPU: Use module flag to get code object version at IR level folow-up
Summary: This is part of the leftover work for https://reviews.llvm.org/D143138. In this work, we pass code object version as an argument to initialize target ID and use it for targetID dump.
Reviewers: arsenm
Differential Revision https://reviews.llvm.org/D143293
show more ...
|
Revision tags: llvmorg-16.0.0-rc2 |
|
#
54cf69c9 |
| 03-Feb-2023 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Use module flag to get code object version at IR level
Summary: This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command l
AMDGPU: Use module flag to get code object version at IR level
Summary: This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command line. In case the module flag is missing, we use the current default code object version supported in the compiler.
For tools whose inputs are not IR, we may need other approach (directive, for example) to check the code object version, That will be in a separate patch later.
For LIT tests update, we directly add module flag if there is only a single code object version associated with all checks in one file. In cause of multiple code object version in one file, we use the "sed" method to "clone" the checks to achieve the goal.
Reviewer: arsenm
Differential Revision: https://reviews.llvm.org/D14313
show more ...
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working |
|
#
fa2b1cb8 |
| 05-Oct-2022 |
Juan Manuel MARTINEZ CAAMAÑO <juamarti@amd.com> |
[NFC][AMDGPULowerKernelAttributes] Factorize repeated code into function
Differential Revision: https://reviews.llvm.org/D135266
|
Revision tags: llvmorg-15.0.2 |
|
#
dee4bc4a |
| 26-Sep-2022 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by opaque pointers
Summary: With opaque pointer support, the "ptr" type is introduced and thus BitCast is not necessary in so
AMDGPU: Handle new address pattern in LowerKernelAttributes introduced by opaque pointers
Summary: With opaque pointer support, the "ptr" type is introduced and thus BitCast is not necessary in some cases. This work takes care of this change, and recognizes the new address patterns to do appropriate optimizations.
Reviewers: arsenm
Differential Revision: https://reviews.llvm.org/D134596
show more ...
|
#
3ae4c358 |
| 21-Sep-2022 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true
Summary: Under code object version 5, ockl_get_local_size returns the value computed by the expression: wo
AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true
Summary: Under code object version 5, ockl_get_local_size returns the value computed by the expression: workgroup_id < hidden_block_count ? hidden_group_size : hidden_remainder For functions with the attribute uniform-work-group-size=true. we can evaluate workgroup_id < hidden_block_count as true, and thus hidden_group_size is returned for ockl_get_local_size. With uniform-workgroup-size=true, this work also set all remainders to zero, and if there is reqd_work_group_size, we also set work-group-size to the required value from the metadata.
Reviewers: arsenm and bcahoon
Differential Revision: https://reviews.llvm.org/D131276
show more ...
|
Revision tags: llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
c986d476 |
| 07-Apr-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Update reqd-work-group-size optimization for umin intrinsic
This code was pattern matching the ID computation expression as it appears in the library. This was a compare and select, but now
AMDGPU: Update reqd-work-group-size optimization for umin intrinsic
This code was pattern matching the ID computation expression as it appears in the library. This was a compare and select, but now that umin is canonical, we were no longer matching. Update to match the intrinsic instead.
show more ...
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
#
aff66b7e |
| 06-Jul-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Fix pass name of AMDGPULowerKernelAttributes. NFC.
This was obviously copy-pasted.
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
#
d6de1e1a |
| 24-Mar-2021 |
Serge Guelton <sguelton@redhat.com> |
Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from
Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from
if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
to
if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
Introduce a getValueAsBool that normalize the check, with the following behavior:
no attributes or attribute set to "false" => return false attribute set to "true" => return true
Differential Revision: https://reviews.llvm.org/D99299
show more ...
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
Revision tags: llvmorg-11.1.0-rc1 |
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
#
7ecbe0c7 |
| 28-Dec-2020 |
Arthur Eubanks <aeubanks@google.com> |
[NewPM][AMDGPU] Port amdgpu-lower-kernel-attributes
And add it to the AMDGPU opt pipeline.
This is a function pass instead of a module pass (like the legacy pass) because it's getting added to a CG
[NewPM][AMDGPU] Port amdgpu-lower-kernel-attributes
And add it to the AMDGPU opt pipeline.
This is a function pass instead of a module pass (like the legacy pass) because it's getting added to a CGSCCPassManager, and you can't put a module pass in a CGSCCPassManager.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D93885
show more ...
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
#
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
#
372d796a |
| 18-May-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add pass to optimize reqd_work_group_size
Eliminate loads from the dispatch packet when they will have a known value.
Also pattern match the code used by the library to handle partial workg
AMDGPU: Add pass to optimize reqd_work_group_size
Eliminate loads from the dispatch packet when they will have a known value.
Also pattern match the code used by the library to handle partial workgroup dispatches, which isn't necessary if reqd_work_group_size is used.
llvm-svn: 332771
show more ...
|