History log of /llvm-project/llvm/unittests/Transforms/Utils/CodeExtractorTest.cpp (Results 1 – 18 of 18)
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# f6795e6b 12-Nov-2024 Michael Kruse <llvm-project@meinersbur.de>

[CodeExtractor] Refactor extractCodeRegion, fix alloca emission. (#114419)

Reorganize the code into phases:

* Analyze/normalize
* Create extracted function prototype
* Generate the new function's implementation
* Generate call to new function
* Connect call to original function's CFG

The motivation is #114669 to optionally clone the selected code region
into the new function instead of moving it. The current structure made
it difficult to add such functionality since there was no obvious place
to do so, not made easier by some functions doing more than their name
suggests. For instance, constructFunction modifies code outside the
constructed function, but also function properties such as
setPersonalityFn are derived somewhere else. Another example is
emitCallAndSwitchStatement, which despite its name also inserts stores
for output parameters.

Many operations also implicitly depend on the order in which they are
applied, a dependency this patch tries to reduce. For instance,
ExtractedFuncRetVals becomes the list of exit blocks, which also defines
the return value when leaving via that block. It is computed early such that the new
function's return instructions and the switch can be generated
independently. Also, ExtractedFuncRetVals is combining the lists
ExitBlocks and OldTargets which were not always kept consistent with
each other or NumExitBlocks. The method recomputeExitBlocks() will
update it when necessary.
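
As a rough illustration of the idea that one ordered list of exit blocks fixes both the return value inside the extracted function and the dispatch in the caller, here is a minimal self-contained sketch. It is a toy model, not LLVM code: block names stand in for basic blocks, `ExitTable`, `retValFor`, and `targetFor` are hypothetical names, and the actual encoding of return values in CodeExtractor may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: block names stand in for llvm::BasicBlock pointers.
// The position of an exit block in the list is the value the extracted
// function returns when control leaves through that block.
struct ExitTable {
  // Hypothetical stand-in for the ExtractedFuncRetVals list.
  std::vector<std::string> ExitBlocks;

  // Return code used inside the extracted function for a given exit block,
  // or -1 if the block is not an exit of the region.
  int retValFor(const std::string &Block) const {
    for (std::size_t I = 0; I < ExitBlocks.size(); ++I)
      if (ExitBlocks[I] == Block)
        return static_cast<int>(I);
    return -1;
  }

  // Caller side: map the returned code back to the block to branch to.
  // Throws std::out_of_range for a code that has no exit block.
  const std::string &targetFor(int RetVal) const {
    return ExitBlocks.at(static_cast<std::size_t>(RetVal));
  }
};
```

Because both directions read the same list, the return instructions and the caller's switch can be generated independently without drifting apart.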

The coding style partially contradicts the current coding standard. For
instance, some local variables start with lower case letters. I updated
some, but not all, occurrences to make the diff match at least some lines
as unchanged.

The patch [D96854](https://reviews.llvm.org/D96854) introduced some
confusion of function argument indexes; this is fixed here as well, hence
the patch is not NFC anymore. Tested in modified CodeExtractorTest.cpp.
Patch [D121061](https://reviews.llvm.org/D121061) introduced
AllocationBlock, but not all allocas were inserted there.

Effectively includes the following fixes:
1. https://github.com/llvm/llvm-project/commit/ce73b1672a6053d5974dc2342881aac02efe2dbb
2. https://github.com/llvm/llvm-project/commit/4aaa92578686176243a294eeb2ca5697a99edcaa
3. Missing allocas, still unfixed

Originally submitted as https://reviews.llvm.org/D115218


# 4aaa9257 04-Nov-2024 Tom Eccles <tom.eccles@arm.com>

[llvm][CodeExtractor] fix bug in parameter naming (#114237)

The code extractor tries to apply the names of source input and output
values to function arguments. Not all input and output values get added
as arguments: some are instead placed inside of a struct passed to the
function. The existing renaming code skipped trying to set these
struct-packed arguments' names (as there is no corresponding function
argument to rename), but it still incremented the iterator over the
function arguments. This could result in dereferencing an end iterator
if struct-packed inputs/outputs preceded non-struct-packed
inputs/outputs.

This patch rewrites this loop to avoid the end iterator dereference.
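
The iterator problem described above can be reproduced in a small self-contained model. This is only a sketch under stated assumptions: `SourceValue` and `nameArguments` are hypothetical names, strings stand in for LLVM values and arguments, and the loop is not the one from the actual patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: each source value is either passed as its own function argument
// or packed into the aggregate struct. All names here are hypothetical.
struct SourceValue {
  std::string Name;
  bool InStruct; // true: no dedicated function argument exists
};

// Copies source names onto the argument names. The argument iterator advances
// only when a value really owns an argument, so it cannot run past end().
std::vector<std::string> nameArguments(const std::vector<SourceValue> &Values,
                                       std::size_t NumArgs) {
  std::vector<std::string> ArgNames(NumArgs);
  auto ArgIt = ArgNames.begin();
  for (const SourceValue &V : Values) {
    if (V.InStruct)
      continue; // a loop that also advanced ArgIt here would overrun
    if (ArgIt == ArgNames.end())
      break; // defensive: more scalar values than arguments
    *ArgIt++ = V.Name;
  }
  return ArgNames;
}
```

With two struct-packed values followed by two scalar values and only two real arguments, advancing the iterator on every value would reach the end before the first scalar is named; the version above names both arguments.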


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 32f9983c 15-Dec-2023 Jessica Del <50999226+OutOfCache@users.noreply.github.com>

[AMDGPU] - Add address space for strided buffers (#74471)

This is an experimental address space for strided buffers. These buffers
can have structs as elements and a stride > 1.
These pointers allow indexed access in units of stride, i.e., they
point at `buffer[index * stride]`.
Thus, we can use the `idxen` modifier for buffer loads.

We assign address space 9 to 192-bit buffer pointers which contain a
128-bit descriptor, a 32-bit offset and a 32-bit index. Essentially,
they are fat buffer pointers with an additional 32-bit index.
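
To make the 192-bit arithmetic concrete, the following sketch lays out the three components as a plain struct and computes a byte position from index and stride. The field names and the `byteAddress` helper are hypothetical, and adding the offset on top of `index * stride` is an assumption that the commit message does not state; this is not how LLVM or the hardware represents the pointer.

```cpp
#include <cassert>
#include <cstdint>

// Toy layout mirroring the sizes quoted above; field names are hypothetical
// and this is not how LLVM represents the pointer internally.
struct StridedBufferPtr {
  std::uint32_t Descriptor[4]; // 128-bit buffer descriptor
  std::uint32_t Offset;        // 32-bit byte offset
  std::uint32_t Index;         // 32-bit element index
};
static_assert(sizeof(StridedBufferPtr) * 8 == 192, "expected 192 bits");

// Byte position addressed by the pointer, assuming (not stated in the commit)
// that the offset is added within the element selected by Index.
std::uint64_t byteAddress(const StridedBufferPtr &P, std::uint32_t Stride) {
  return static_cast<std::uint64_t>(P.Index) * Stride + P.Offset;
}
```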


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4
# eee8dd90 18-Oct-2023 Dominik Adamski <dominik.adamski@amd.com>

[CodeExtractor] Allow to use 0 addr space for aggregate arg (#66998)

The user of CodeExtractor should be able to specify that
the aggregate argument should be passed as a pointer in zero address
space.

CodeExtractor is used to generate outlined functions required by the OpenMP
runtime. The arguments of the outlined functions for OpenMP GPU code
are in address space 0, which does not need to be the default
address space for a GPU device. The user of CodeExtractor therefore needs
a way to specify that the allocated aggregate parameter
is passed as a pointer in address space 0.


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# c1b96725 23-Feb-2022 Bill Wendling <isanbard@gmail.com>

[NFC] Add #include for constants


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# 95b981ca 26-Jan-2022 Giorgis Georgakoudis <georgakoudis1@llnl.gov>

[CodeExtractor] Enable partial aggregate arguments

Summary:
Enable CodeExtractor to construct output functions that partially
aggregate inputs/outputs in their argument list. A use case is the
OMPIRBuilder to create outlined functions for parallel regions that
aggregate in a struct the payload variables for the region while passing
as scalars thread and bound identifiers.

Differential Revision: https://reviews.llvm.org/D96854


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 144cd22b 23-Aug-2021 Andrew Litteken <andrew_litteken@apple.com>

[CodeExtractor] Creating exit stubs based off original order branch instructions.

Previously the CodeExtractor created exit stubs, and the subsequent return value of the outlined function based on the order of out-of-region blocks after splitting any phi nodes, and collecting the blocks to be outlined. This could cause differences in order if there was a difference of exit block phi nodes between the two regions. This patch moves the collection of the output target blocks to be before this occurs, so that the assignment of target block to output value will be the same, regardless of the contents of the output block.
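
A minimal sketch of ordering exit targets by the branch instructions of the region, independent of what the exit blocks contain. This is a toy CFG with hypothetical names (`Block`, `collectExitTargets`), not the CodeExtractor implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy CFG: every block lists its successors in terminator order.
struct Block {
  std::string Name;
  std::vector<std::string> Succs;
};

// Walks the region's blocks in order and records each out-of-region successor
// the first time it is seen. The result depends only on the branch
// instructions, not on the contents of the exit blocks.
std::vector<std::string> collectExitTargets(const std::vector<Block> &Region) {
  std::set<std::string> InRegion;
  for (const Block &B : Region)
    InRegion.insert(B.Name);

  std::vector<std::string> Targets;
  for (const Block &B : Region)
    for (const std::string &S : B.Succs)
      if (!InRegion.count(S) &&
          std::find(Targets.begin(), Targets.end(), S) == Targets.end())
        Targets.push_back(S);
  return Targets;
}
```

Two regions with the same branch structure then receive the same mapping from target block to return value, even if their exit blocks differ in PHI nodes.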

Reviewers: paquette, roelofs

Differential Revision: https://reviews.llvm.org/D108657


# 9d2c859e 26-Aug-2021 Andrew Litteken <andrew.litteken@gmail.com>

[CodeExtractor] Making the arguments outlined easier to access from the outside

The Code Extractor does not provide an easy mechanism for determining the
inputs and outputs after extraction has occurred. This patch adds the
ability to pass in empty SetVectors to be filled with the inputs and
outputs if they need to be analyzed.

Added Tests:
- InputOutputMonitoring in unittests/Transforms/Utils/CodeExtractorTests.cpp

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D106991


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 700d2417 03-Nov-2020 Giorgis Georgakoudis <georgakoudis1@llnl.gov>

[CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers

CodeExtractor treats bitcasts in the extracted region that have
lifetime marker users in the outer region as outputs. That
creates unnecessary alloca/reload instructions and extra lifetime
markers. The patch identifies those cases, and replaces uses in
out-of-region lifetime markers with new bitcasts in the outer region.

**Example**
```
define void @foo() {
entry:
%0 = alloca i32
br label %extract

extract:
%1 = bitcast i32* %0 to i8*
call void @llvm.lifetime.start.p0i8(i64 4, i8* %1)
call void @use(i32* %0)
br label %exit

exit:
call void @use(i32* %0)
call void @llvm.lifetime.end.p0i8(i64 4, i8* %1)
ret void
}
```

**Current extraction**
```
define void @foo() {
entry:
%.loc = alloca i8*, align 8
%0 = alloca i32, align 4
br label %codeRepl

codeRepl: ; preds = %entry
%lt.cast = bitcast i8** %.loc to i8*
call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast)
%lt.cast1 = bitcast i32* %0 to i8*
call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
call void @foo.extract(i32* %0, i8** %.loc)
%.reload = load i8*, i8** %.loc, align 8
call void @llvm.lifetime.end.p0i8(i64 -1, i8* %lt.cast)
br label %exit

exit: ; preds = %codeRepl
call void @use(i32* %0)
call void @llvm.lifetime.end.p0i8(i64 4, i8* %.reload)
ret void
}

define internal void @foo.extract(i32* %0, i8** %.out) {
newFuncRoot:
br label %extract

exit.exitStub: ; preds = %extract
ret void

extract: ; preds = %newFuncRoot
%1 = bitcast i32* %0 to i8*
store i8* %1, i8** %.out, align 8
call void @use(i32* %0)
br label %exit.exitStub
}
```

**Extraction with patch**
```
define void @foo() {
entry:
%0 = alloca i32, align 4
br label %codeRepl

codeRepl: ; preds = %entry
%lt.cast1 = bitcast i32* %0 to i8*
call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
call void @foo.extract(i32* %0)
br label %exit

exit: ; preds = %codeRepl
call void @use(i32* %0)
%lt.cast = bitcast i32* %0 to i8*
call void @llvm.lifetime.end.p0i8(i64 4, i8* %lt.cast)
ret void
}

define internal void @foo.extract(i32* %0) {
newFuncRoot:
br label %extract

exit.exitStub: ; preds = %extract
ret void

extract: ; preds = %newFuncRoot
%1 = bitcast i32* %0 to i8*
call void @use(i32* %0)
br label %exit.exitStub
}
```

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D90689


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1
# 8359511c 29-Jan-2020 Vedant Kumar <vsk@apple.com>

[CodeExtractor] Remove stale llvm.assume calls from extracted region

During extraction, stale llvm.assume handles may be retained in the
original function. The setup is:

1) CodeExtractor unregisters assumptions in the blocks that are to be
extracted.

2) Extraction happens. There are now two functions: f1 and f1.extracted.

3) Leftover assumptions in f1 (/not/ removed as they were not in the set of
blocks to be extracted) now have affected-value llvm.assume handles in
f1.extracted.

When assumptions for a value used in f1 are looked up, ValueTracking can assert
as some of the handles are in the wrong function. To fix this, simply erase the
llvm.assume calls in the extracted function.

Alternatives include flushing the assumption cache in the original function, or
walking all values used in the original function to prune stale affected-value
handles. Both seem more expensive.

Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled

rdar://58460728


Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 9852699d 08-Oct-2019 Vedant Kumar <vsk@apple.com>

[CodeExtractor] Factor out and reuse shrinkwrap analysis

Factor out CodeExtractor's analysis of allocas (for shrinkwrapping
purposes), and allow the analysis to be reused.

This resolves a quadratic compile-time bug observed when compiling
AMDGPUDisassembler.cpp.o.

Pre-patch (Release + LTO clang):

```
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
176.5278 ( 57.8%) 0.4915 ( 18.5%) 177.0192 ( 57.4%) 177.4112 ( 57.3%) Hot Cold Splitting
```

Post-patch (ReleaseAsserts clang):

```
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
1.4051 ( 3.3%) 0.0079 ( 0.3%) 1.4129 ( 3.2%) 1.4129 ( 3.2%) Hot Cold Splitting
```

Testing: check-llvm, and comparing the AMDGPUDisassembler.cpp.o binary
pre- vs. post-patch.

An alternate approach is to hide CodeExtractorAnalysisCache from clients
of CodeExtractor, and to recompute the analysis from scratch inside of
CodeExtractor::extractCodeRegion(). This eliminates some redundant work
in the shrinkwrapping legality check. However, some clients continue to
exhibit O(n^2) compile time behavior as computing the analysis is O(n).

rdar://55912966

Differential Revision: https://reviews.llvm.org/D68616

llvm-svn: 374089


# 50afaa9d 04-Oct-2019 Aditya Kumar <hiraditya@msn.com>

Add a unittest to verify for assumption cache

Reviewers: vsk, tejohnson

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D68095

llvm-svn: 373811


Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1
# 0e5dd512 08-Feb-2019 Vedant Kumar <vsk@apple.com>

[CodeExtractor] Restore outputs after creating exit stubs

When CodeExtractor saves the result of an InvokeInst at the first insertion
point of the 'normal destination' basic block, this block can be omitted
from the outlined region, so the store is placed outside of the function. The
suggested solution is to save outputs after creating the exit
stubs for the new function; in this case the stores are placed in those
blocks, before the return.

Patch by Sergei Kachkov!

Fixes llvm.org/PR40455.

Differential Revision: https://reviews.llvm.org/D57919

llvm-svn: 353562


Revision tags: llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 2946cd70 19-Jan-2019 Chandler Carruth <chandlerc@gmail.com>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3
# b2a6f8e5 07-Dec-2018 Vedant Kumar <vsk@apple.com>

[CodeExtractor] Store outputs at the first valid insertion point

When CodeExtractor outlines values which are used by the original
function, it must store those values in some in-out parameter. This
store instruction must not be inserted in between a PHI and an EH pad
instruction, as that results in invalid IR.

This fixes the following verifier failure seen while outlining within
ObjC methods with live exit values:

The unwind destination does not have an exception handling instruction!
%call35 = invoke i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %exn.adjusted, i8* %1)
to label %invoke.cont34 unwind label %lpad33, !dbg !4183
The unwind destination does not have an exception handling instruction!
invoke void @objc_exception_throw(i8* %call35) #12
to label %invoke.cont36 unwind label %lpad33, !dbg !4184
LandingPadInst not the first non-PHI instruction in the block.
%3 = landingpad { i8*, i32 }
catch i8* null, !dbg !1411

rdar://46540815

llvm-svn: 348562


# d129569e 03-Dec-2018 Vedant Kumar <vsk@apple.com>

[CodeExtractor] Split PHI nodes with incoming values from outlined region (PR39433)

If a PHI node outside the extracted region has multiple incoming values from it,
split this PHI into two parts. The first PHI has incoming values only from the
region and is extracted with it (it is placed in a separate basic block that is
added to the list of outlined blocks), and the incoming values in the original
PHI are replaced by the first PHI. A similar solution is already used in
CodeExtractor for PHIs in the entry block (severSplitPHINodes method). This fixes PR39433.
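
The PHI split can be sketched on a toy data structure as follows. Strings stand in for blocks and values, and `Phi`, `splitExitPhi`, the `.ce` suffix, and the `NewBlock` parameter are hypothetical choices for illustration rather than the names used by the patch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy PHI: a list of (predecessor block, incoming value) pairs.
using Incoming = std::pair<std::string, std::string>;
struct Phi {
  std::string Name;
  std::vector<Incoming> In;
};

// If more than one incoming edge comes from the region, moves those edges to a
// new PHI (which would be extracted with the region) and feeds the new PHI into
// the original through a single edge from NewBlock. Returns true on a split.
bool splitExitPhi(Phi &Orig, const std::set<std::string> &Region,
                  const std::string &NewBlock, Phi &RegionPhi) {
  std::vector<Incoming> FromRegion, FromOutside;
  for (const Incoming &I : Orig.In)
    (Region.count(I.first) ? FromRegion : FromOutside).push_back(I);
  if (FromRegion.size() < 2)
    return false; // nothing to split

  RegionPhi = Phi{Orig.Name + ".ce", FromRegion};
  FromOutside.push_back({NewBlock, RegionPhi.Name});
  Orig.In = FromOutside;
  return true;
}
```

After the split, the original PHI sees exactly one incoming edge from the outlined code, so the extracted function only has to produce a single output value for it.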

Patch by Sergei Kachkov!

Differential Revision: https://reviews.llvm.org/D55018

llvm-svn: 348205


Revision tags: llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# c2990068 24-Oct-2018 Vedant Kumar <vsk@apple.com>

[HotColdSplitting] Identify larger cold regions using domtree queries

The current splitting algorithm works in three stages:

1) Identify cold blocks, then
2) Use forward/backward propagation to mark hot blocks, then
3) Grow a SESE region of blocks *outside* of the set of hot blocks and
start outlining.

While testing this pass on Apple internal frameworks I noticed that some
kinds of control flow (e.g. loops) are never outlined, even though they
unconditionally lead to / follow cold blocks. I noticed two other issues
related to how cold regions are identified:

- An inconsistency can arise in the internal state of the hotness
propagation stage, as a block may end up in both the ColdBlocks set
and the HotBlocks set. Further inconsistencies can arise as these sets
do not match what's in ProfileSummaryInfo.

- It isn't necessary to limit outlining to single-exit regions.

This patch teaches the splitting algorithm to identify maximal cold
regions and outline them. A maximal cold region is defined as the set of
blocks post-dominated by a cold sink block, or dominated by that sink
block. This approach can successfully outline loops in the cold path. As
a side benefit, it maintains less internal state than the current
approach.
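
The definition of a maximal cold region can be illustrated on a small integer-indexed CFG. The sketch below is not the pass's implementation, which uses dominator and post-dominator trees; it is a toy that derives both relations from plain reachability, assuming that exits are blocks without successors. `Graph`, `reach`, and `maximalColdRegion` are hypothetical names.

```cpp
#include <cassert>
#include <set>
#include <vector>

using Graph = std::vector<std::vector<int>>; // Graph[B] = successors of block B

// Blocks reachable from Start without ever entering Removed.
static std::set<int> reach(const Graph &G, int Start, int Removed) {
  std::set<int> Seen;
  std::vector<int> Work;
  if (Start != Removed)
    Work.push_back(Start);
  while (!Work.empty()) {
    int B = Work.back();
    Work.pop_back();
    if (!Seen.insert(B).second)
      continue; // already visited
    for (int S : G[B])
      if (S != Removed)
        Work.push_back(S);
  }
  return Seen;
}

// Sink plus every block that is dominated or post-dominated by Sink.
std::set<int> maximalColdRegion(const Graph &G, int Entry, int Sink) {
  std::set<int> Region = {Sink};
  std::set<int> All = reach(G, Entry, -1);           // -1 removes nothing
  std::set<int> AvoidingSink = reach(G, Entry, Sink);
  for (int B : All) {
    // Dominated by Sink: every path from Entry to B passes through Sink.
    if (!AvoidingSink.count(B))
      Region.insert(B);
    // Post-dominated by Sink: B cannot reach an exit without passing Sink.
    bool ReachesExit = false;
    for (int R : reach(G, B, Sink))
      if (G[R].empty())
        ReachesExit = true;
    if (!ReachesExit)
      Region.insert(B);
  }
  return Region;
}
```

For a graph where block 0 branches to 1 and 2, block 1 leads to the cold sink 3, block 3 leads to exit 5, and block 2 leads to exit 4, the region contains block 1 (post-dominated by the sink), the sink itself, and block 5 (dominated by the sink).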

Due to a limitation in CodeExtractor, blocks within the maximal cold
region which aren't dominated by a single entry point (a so-called "max
ancestor") are filtered out.

Results:
- X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to
47KB pre-patch, or a ~3x improvement. Did not see a performance impact
across two runs.
- AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB
of TEXT were outlined. Ditto re: performance impact.
- Outlining results improve marginally in the internal frameworks I
tested.

Follow-ups:
- Outline more than once per function, outline large single basic
blocks, & try to remove unconditional branches in outlined functions.

Differential Revision: https://reviews.llvm.org/D53627

llvm-svn: 345209


Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3
# 8267b333 03-Sep-2018 Nico Weber <nicolasweber@gmx.de>

Rename a few unittests/.../Foo.cpp files to FooTest.cpp

The convention for unit test sources is that they're called FooTest.cpp.

No behavior change.
https://reviews.llvm.org/D51579

llvm-svn: 341313