History log of /llvm-project/llvm/test/CodeGen/AMDGPU/flat-scratch-reg.ll (Results 1 – 25 of 29)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6
# 41ed16c3 10-Dec-2024 Jun Wang <jwang86@yahoo.com>

Reapply "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)" (#118907)

This reverts commit 1ef9410a96c1d9669a6feaf03fcab8d0a4a13bd5.

This fixes the test file attrib

Reapply "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)" (#118907)

This reverts commit 1ef9410a96c1d9669a6feaf03fcab8d0a4a13bd5.

This fixes the test file attributor-flatscratchinit-globalisel.ll.

show more ...


# 1ef9410a 04-Dec-2024 Philip Reames <preames@rivosinc.com>

Revert "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)"

This reverts commit e6aec2c12095cc7debd1a8004c8535eef41f4c36. Commit breaks "ninja check-llvm" on x86 host.


# e6aec2c1 04-Dec-2024 Jun Wang <jwang86@yahoo.com>

[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)

The AMDGPUAnnotateKernelFeatures pass infers the "amdgpu-calls" and
"amdgpu-stack-objects" attributes, which are us

[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)

The AMDGPUAnnotateKernelFeatures pass infers the "amdgpu-calls" and
"amdgpu-stack-objects" attributes, which are used to infer whether we
need to initialize flat scratch. This is, however, not precise. Instead,
we should use AMDGPUAttributor and infer amdgpu-no-flat-scratch-init on
kernels. Refer to https://github.com/llvm/llvm-project/issues/63586 .

show more ...


Revision tags: llvmorg-19.1.5
# 5a3299a6 26-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove some -verify-machineinstrs from tests (#117736)

We should leave these for EXPENSIVE_CHECKS builds. Some of these
were near the top of slowest tests.


Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1
# c897c13d 30-Sep-2024 Janek van Oirschot <janek.vanoirschot@amd.com>

[AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (#102913)

Converts AMDGPUResourceUsageAnalysis pass from Module to MachineFunction
pass. Moves function resource info propag

[AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (#102913)

Converts AMDGPUResourceUsageAnalysis pass from Module to MachineFunction
pass. Moves function resource info propagation to to MC layer (through
helpers in AMDGPUMCResourceInfo) by generating MCExprs for every
function resource which the emitters have been prepped for.

Fixes https://github.com/llvm/llvm-project/issues/64863

show more ...


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6
# 2cbfe4a8 09-May-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove duplicate -mtriple options in tests (#91576)


Revision tags: llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# 4490003a 06-Mar-2024 Emma Pilkington <emma.pilkington95@gmail.com>

[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)

The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling

[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)

The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling also matches
the asm directive I added in bc82cfb.

show more ...


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2
# fe2f67e4 21-Sep-2023 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Remove Code Object V2 (#65715)

Code Object V2 has been deprecated for more than a year now. We can
safely remove it from LLVM.

- [clang] Remove support for the `-mcode-object-version=2`

[AMDGPU] Remove Code Object V2 (#65715)

Code Object V2 has been deprecated for more than a year now. We can
safely remove it from LLVM.

- [clang] Remove support for the `-mcode-object-version=2` option.
- [lld] Remove/refactor tests that were still using COV2
- [llvm] Update AMDGPUUsage.rst
- Code Object V2 docs are left for informational purposes because those
code objects may still be supported by the runtime/loaders for a while.
- [AMDGPU] Remove COV2 emission capabilities.
- [AMDGPU] Remove `MetadataStreamerYamlV2` which was only used by COV2
- [AMDGPU] Update all tests that were still using COV2 - They are either
deleted or ported directly to code object v4 (as v3 is also planned to
be removed soon).

show more ...


Revision tags: llvmorg-17.0.1, llvmorg-17.0.0
# ceb68eea 15-Sep-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove repeated -mtriple options from RUN lines (#66486)


# 806761a7 11-Sep-2023 Fangrui Song <i@maskray.me>

[test] Change llc -march= to -mtriple=

The issue is uncovered by #47698: for IR files without a target triple,
-mtriple= specifies the full target triple while -march= merely sets the
architecture p

[test] Change llc -march= to -mtriple=

The issue is uncovered by #47698: for IR files without a target triple,
-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the default target triple, leaving a target triple which
may not make sense, e.g. riscv64-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without a target
triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead
of rejecting it outrightly.

show more ...


Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# 54cf69c9 03-Feb-2023 Changpeng Fang <changpeng.fang@amd.com>

AMDGPU: Use module flag to get code object version at IR level

Summary:
This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command l

AMDGPU: Use module flag to get code object version at IR level

Summary:
This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command line.
In case the module flag is missing, we use the current default code object version supported in the compiler.

For tools whose inputs are not IR, we may need other approach (directive, for example) to check the code
object version, That will be in a separate patch later.

For LIT tests update, we directly add module flag if there is only a single code object version associated with all checks in one file.
In cause of multiple code object version in one file, we use the "sed" method to "clone" the checks to achieve the goal.

Reviewer: arsenm

Differential Revision:
https://reviews.llvm.org/D14313

show more ...


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 082e22f3 24-Sep-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Always reserve flat scratch SGPR for architected flat scratch

With architected flat scratch it becomes readonly. We must always
reserve SGPR pair for it even if we do not use scratch at all

[AMDGPU] Always reserve flat scratch SGPR for architected flat scratch

With architected flat scratch it becomes readonly. We must always
reserve SGPR pair for it even if we do not use scratch at all since
an attempt to write to SGPRs mapped to FLAT_SCRATCH results in
memory violation.

This is not needed since GFX10 with architected flat scratch though
since special SGPRs are not carving space from normal SGPRs.

Differential Revision: https://reviews.llvm.org/D110376

show more ...


Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# f4ace637 24-Mar-2021 Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

AMDGPU: Add target id and code object v4 support

- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id)
- Add code object v4 support (https://llvm.org/docs/AMDG

AMDGPU: Add target id and code object v4 support

- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id)
- Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object)
- Add kernarg_size to kernel descriptor
- Change trap handler ABI to no longer move queue pointer into s[0:1]
- Cleanup ELF definitions
- Add V2, V3, V4 suffixes to make a clear distinction for code object version
- Consolidate note names

Differential Revision: https://reviews.llvm.org/D95638

show more ...


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 2291bd13 30-Nov-2020 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Update subtarget features for new target ID support

Support for XNACK and SRAMECC is not static on some GPUs. We must be able
to differentiate between different scenarios for these dynamic

[AMDGPU] Update subtarget features for new target ID support

Support for XNACK and SRAMECC is not static on some GPUs. We must be able
to differentiate between different scenarios for these dynamic subtarget
features.

The possible settings are:

- Unsupported: The GPU has no support for XNACK/SRAMECC.
- Any: Preference is unspecified. Use conservative settings that can run anywhere.
- Off: Request support for XNACK/SRAMECC Off
- On: Request support for XNACK/SRAMECC On

GCNSubtarget will track the four options based on the following criteria. If
the subtarget does not support XNACK/SRAMECC we say the setting is
"Unsupported". If no subtarget features for XNACK/SRAMECC are requested we
must support "Any" mode. If the subtarget features XNACK/SRAMECC exist in the
feature string when initializing the subtarget, the settings are "On/Off".

The defaults are updated to be conservatively correct, meaning if no setting
for XNACK or SRAMECC is explicitly requested, defaults will be used which
generate code that can be run anywhere. This corresponds to the "Any" setting.

Differential Revision: https://reviews.llvm.org/D85882

show more ...


Revision tags: llvmorg-11.0.1-rc1
# 3fdf3b15 14-Oct-2020 Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

AMDGPU: Update AMDHSA code object version handling

Differential Revision: https://reviews.llvm.org/D89076


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3
# a25e0524 15-Nov-2018 Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

AMDGPU: Enable code object v3 for AMDHSA only

Differential Revision: https://reviews.llvm.org/D54186

llvm-svn: 346923


Revision tags: llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# 2d22d24a 30-Oct-2018 Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Revert r345542: AMDGPU: Enable code object v3 by default

It breaks mesa.

llvm-svn: 345662


# 5cb95020 29-Oct-2018 Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

AMDGPU: Enable code object v3 by default

Differential Revision: https://reviews.llvm.org/D53525

llvm-svn: 345542


Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3
# 3c7581bb 08-Jun-2017 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use correct register names in inline assembly

Fixes using physical registers in inline asm from clang.

llvm-svn: 305004


Revision tags: llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1
# a3566f21 17-Apr-2017 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use MachineRegisterInfo to find max used register

Avoid looping through program to determine register counts.
This avoids needing to look at regmask operands.

Also fixes some counting error

AMDGPU: Use MachineRegisterInfo to find max used register

Avoid looping through program to determine register counts.
This avoids needing to look at regmask operands.

Also fixes some counting errors with flat_scr when there
are no stack objects.

llvm-svn: 300482

show more ...


# 3dbeefa9 21-Mar-2017 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel

Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
ca

AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel

Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444

show more ...


Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2
# 7aad8fd8 24-Jan-2017 Matt Arsenault <Matthew.Arsenault@amd.com>

Enable FeatureFlatForGlobal on Volcanic Islands

This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@mi

Enable FeatureFlatForGlobal on Volcanic Islands

This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>

llvm-svn: 292982

show more ...


Revision tags: llvmorg-4.0.0-rc1
# 0f55fbae 09-Dec-2016 Marek Olsak <marek.olsak@amd.com>

AMDGPU/SI: Don't reserve XNACK when it's disabled

Summary:
This frees 2 additional scalar registers.

These are results from all of my 3 patches combined:

Polaris:
Spilled SGPRs: 2231 -> 1517

AMDGPU/SI: Don't reserve XNACK when it's disabled

Summary:
This frees 2 additional scalar registers.

These are results from all of my 3 patches combined:

Polaris:
Spilled SGPRs: 2231 -> 1517 (-32.00 %)

Tonga:
Spilled SGPRs: 3829 -> 2608 (-31.89 %)
Spilled VGPRs: 100 -> 84 (-16.00 %)

Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader
limited to 64 VGPRs.

Reviewers: tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27151

llvm-svn: 289262

show more ...


# 693e9be9 09-Dec-2016 Marek Olsak <marek.olsak@amd.com>

AMDGPU/SI: Don't reserve FLAT_SCR on non-HSA targets & without stack objects

Summary: This frees 2 scalar registers.

Reviewers: tstellarAMD

Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle

AMDGPU/SI: Don't reserve FLAT_SCR on non-HSA targets & without stack objects

Summary: This frees 2 scalar registers.

Reviewers: tstellarAMD

Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27150

llvm-svn: 289261

show more ...


Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1
# 5e832e86 06-Sep-2016 Wei Ding <wei.ding2@amd.com>

AMDGPU : Add XNACK feature to GPUs that support it.

Differential Revision: http://reviews.llvm.org/D24276

llvm-svn: 280742


12