History log of /llvm-project/llvm/test/CodeGen/AMDGPU/enqueue-kernel.ll (Results 1 – 20 of 20)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 73b0e8a1 13-Jan-2025 Akshat Oke <Akshat.Oke@amd.com>

[AMDGPU][NewPM] Port AMDGPUOpenCLEnqueuedBlockLowering to NPM (#122434)


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# bbc7b30f 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove invalid testcase for enqueue kernel

The call didn't have the right calling convention, but calls to
kernels are supposed to be illegal anyway.


# 99d4c722 08-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Really invert handling of enqueued block detection

Remove the broken call graph analysis in the block enqueue lowering
pass. The previous iteration was reverted due to a runtime bug when
the

AMDGPU: Really invert handling of enqueued block detection

Remove the broken call graph analysis in the block enqueue lowering
pass. The previous iteration was reverted due to a runtime bug when
the completion action was unconditionally enabled.

show more ...


# 4fc07e18 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use constant and externally_initialized for block handle

The runtime initializes this.


# 0cd3a39e 09-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix opaque pointer handling for enqueued blocks, again


# 270e96f4 08-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

Revert "AMDGPU: Invert handling of enqueued block detection"

This reverts commit 47288cc977fa31c44cc92b4e65044a5b75c2597e.

The runtime is having trouble with this at -O0 when the inputs are
always

Revert "AMDGPU: Invert handling of enqueued block detection"

This reverts commit 47288cc977fa31c44cc92b4e65044a5b75c2597e.

The runtime is having trouble with this at -O0 when the inputs are
always enabled.

show more ...


# 47554a0c 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use more accurate IR type for block handle

The device library uses this as a struct with a pointer sized integer
and 2 ints.


# 47288cc9 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Invert handling of enqueued block detection

Invert the sense of the attribute and let the attributor figure this
out like everything else. If needed we can have the not-OpenCL
languages set

AMDGPU: Invert handling of enqueued block detection

Invert the sense of the attribute and let the attributor figure this
out like everything else. If needed we can have the not-OpenCL
languages set amdgpu-no-default-queue and amdgpu-no-completion-action
up front so they never have to pay the cost.

There are also so many of these now, the offset use API should
probably consider all of them at once. Maybe they should merge into
one attribute with used fields. Having separate functions for each
field in AMDGPUBaseInfo is also not the greatest API (might as well
fix this when the patch to get the object version from the module
lands).

show more ...


# 0416883d 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix enqueue block lowering for opaque pointers

This was looking for a specific constant cast of the function, when
the type doesn't matter. Doesn't bother trying to handle typed
pointers, it

AMDGPU: Fix enqueue block lowering for opaque pointers

This was looking for a specific constant cast of the function, when
the type doesn't matter. Doesn't bother trying to handle typed
pointers, it will just assert.

Things probably don't work completely correctly if the block kernel
address is captured somewhere else, but that wouldn't work before
either. The uses should really be loads out of the handle, and the
handle initializer should contain the kernel address.

show more ...


# 4ce5400a 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert enqueue-kernel.ll to opaque pointers

This demonstrates the pass is broken with them, the follow up change
will fix it.


# 1f93517b 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Switch enqueue kernel test to generated checks


# a74ea40c 29-Nov-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove unnecessary metadata from test

The pass isn't doing anything with it, and the line wrapping is
confusing update_test_checks.


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 06c192d4 20-Nov-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

OpaquePtr: Bulk update tests to use typed byval

Upgrade of the IR text tests should be the only thing blocking making
typed byval mandatory. Partially done through regex and partially
manual.


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3
# fb17bf60 13-Jun-2018 Yaxun Liu <Yaxun.Liu@amd.com>

[AMDGPU] Change enqueue kernel handle type

Currently the handle type is a global pointer which holds 8 bytes.
We need a larger type which hold 16 bytes, therefore change it
to [i64 x 2].

Differenti

[AMDGPU] Change enqueue kernel handle type

Currently the handle type is a global pointer which holds 8 bytes.
We need a larger type which hold 16 bytes, therefore change it
to [i64 x 2].

Differential Revision: https://reviews.llvm.org/D48094

llvm-svn: 334625

show more ...


Revision tags: llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1
# 9381ae97 11-Apr-2018 Yaxun Liu <Yaxun.Liu@amd.com>

[AMDGPU] Fix lowering enqueue_kernel

Two issues were fixed:

runtime has difficulty to allocate memory for an external symbol of a
kernel and set the address of the external symbol, therefore make t

[AMDGPU] Fix lowering enqueue_kernel

Two issues were fixed:

runtime has difficulty to allocate memory for an external symbol of a
kernel and set the address of the external symbol, therefore make the runtime
handle of an enqueued kernel an ordinary global variable. Runtime only needs
to store the address of the loaded kernel to the handle and has verified
that this approach works.

handle the situation where __enqueue_kernel* gets inlined therefore
the enqueued kernel may be used through a constant expr instead
of an instruction.

Differential Revision: https://reviews.llvm.org/D45187

llvm-svn: 329815

show more ...


Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1
# a99e7d8e 12-Mar-2018 Yaxun Liu <Yaxun.Liu@amd.com>

[AMDGPU] Fix lowering enqueue kernel when kernel has no name

Since the enqueued kernels have internal linkage, their names may be dropped.
In this case, give them unique names __amdgpu_enqueued_kern

[AMDGPU] Fix lowering enqueue kernel when kernel has no name

Since the enqueued kernels have internal linkage, their names may be dropped.
In this case, give them unique names __amdgpu_enqueued_kernel or
__amdgpu_enqueued_kernel.n where n is a sequential number starting from 1.

Differential Revision: https://reviews.llvm.org/D44322

llvm-svn: 327291

show more ...


# 46439e8d 06-Mar-2018 Yaxun Liu <Yaxun.Liu@amd.com>

[AMDGPU] Fix lowering OpenCL enqueue_kernel

One addrspacecast disappeared in clang emitted IR for
block invoke function due to adoption of the new
addr space mapping.

Differential Revision: https:/

[AMDGPU] Fix lowering OpenCL enqueue_kernel

One addrspacecast disappeared in clang emitted IR for
block invoke function due to adoption of the new
addr space mapping.

Differential Revision: https://reviews.llvm.org/D43785

llvm-svn: 326806

show more ...


Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1
# e4b0231c 11-Jan-2018 Rafael Espindola <rafael.espindola@gmail.com>

Make internal/private GVs implicitly dso_local.

While updating clang tests for having clang set dso_local I noticed
that:

- There are *a lot* of tests to update.
- Many of the updates are redundant

Make internal/private GVs implicitly dso_local.

While updating clang tests for having clang set dso_local I noticed
that:

- There are *a lot* of tests to update.
- Many of the updates are redundant.

They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.

llvm-svn: 322317

show more ...


Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1
# c928f2a6 30-Oct-2017 Yaxun Liu <Yaxun.Liu@amd.com>

[AMDGPU] Emit metadata for hidden arguments for kernel enqueue

Identifies kernels which performs device side kernel enqueues and emit
metadata for the associated hidden kernel arguments. Such kernel

[AMDGPU] Emit metadata for hidden arguments for kernel enqueue

Identifies kernels which performs device side kernel enqueues and emit
metadata for the associated hidden kernel arguments. Such kernels are
marked with calls-enqueue-kernel function attribute by
AMDGPUOpenCLEnqueueKernelLowering pass and later on
hidden kernel arguments metadata HiddenDefaultQueue and
HiddenCompletionAction are emitted for them.

Differential Revision: https://reviews.llvm.org/D39255

llvm-svn: 316907

show more ...


# de4b88d9 10-Oct-2017 Yaxun Liu <Yaxun.Liu@amd.com>

[AMDGPU] Lower enqueued blocks and generate runtime metadata

This patch adds a post-linking pass which replaces the function pointer of enqueued
block kernel with a global variable (runtime handle)

[AMDGPU] Lower enqueued blocks and generate runtime metadata

This patch adds a post-linking pass which replaces the function pointer of enqueued
block kernel with a global variable (runtime handle) and adds
runtime-handle attribute to the enqueued block kernel.

In LLVM CodeGen the runtime-handle metadata will be translated to
RuntimeHandle metadata in code object. Runtime allocates a global buffer
for each kernel with RuntimeHandel metadata and saves the kernel address
required for the AQL packet into the buffer. __enqueue_kernel function
in device library knows that the invoke function pointer in the block
literal is actually runtime handle and loads the kernel address from it
and puts it into AQL packet for dispatching.

This cannot be done in FE since FE cannot create a unique global variable
with external linkage across LLVM modules. The global variable with internal
linkage does not work since optimization passes will try to replace loads
of the global variable with its initialization value.

Differential Revision: https://reviews.llvm.org/D38610

llvm-svn: 315352

show more ...