Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
#
a1e1dcab |
| 16-Dec-2020 |
diggerlin <digger.llvm@gmail.com> |
[XCOFF][AIX] Emit EH information in traceback table
SUMMARY:
In order for the runtime on AIX to find the compact unwind section (EHInfo table), we need to set the following in the traceback table:
* Set the 6th byte's longtbtable field to true to signal that there is an Extended TB Table Flag.
* Set the Extended TB Table Flag to 0x08 to signal that exception handling info is present.
* Emit the offset between the ehinfo TC entry and the TOC base after all other optional portions of the traceback table.
The patch is authored by Jason Liu.
Reviewers: David Tenty, Digger Lin
Differential Revision: https://reviews.llvm.org/D92766
|
#
b6b522c4 |
| 12-Dec-2020 |
Zequan Wu <zequanwu@google.com> |
[NFC] cleanup cg-profile emission on TargetLowerinng
Differential Revision: https://reviews.llvm.org/D93150
|
#
705a4c14 |
| 08-Dec-2020 |
Hongtao Yu <hoy@fb.com> |
[CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s
Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead.
**ELF object emission**
The binary data to emit are organized as two ELF sections, i.e., the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each of which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as module-level metadata during compilation and is serialized into the object file during object emission.
Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.
The format of `.pseudo_probe_desc` section looks like:
```
    .section .pseudo_probe_desc,"",@progbits
    .quad 6309742469962978389   // Func GUID
    .quad 4294967295            // Func Hash
    .byte 9                     // Length of func name
    .ascii "_Z5funcAi"          // Func name
    .quad 7102633082150537521
    .quad 138828622701
    .byte 12
    .ascii "_Z8funcLeafi"
    .quad 446061515086924981
    .quad 4294967295
    .byte 9
    .ascii "_Z5funcBi"
    .quad -2016976694713209516
    .quad 72617220756
    .byte 7
    .ascii "_Z3fibi"
```
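The descriptor layout above (GUID, hash, one-byte name length, name) can be sketched as a small serializer. This is only an illustration, not the code from the patch: the `ProbeDesc` struct and both function names are hypothetical, and little-endian byte order is assumed because the example targets x86-64.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory form of one `.pseudo_probe_desc` entry.
struct ProbeDesc {
  uint64_t Guid;    // Func GUID
  uint64_t Hash;    // Func Hash
  std::string Name; // Func name (mangled)
};

// Append a 64-bit value in little-endian byte order (an assumption).
static void appendU64LE(std::vector<uint8_t> &Out, uint64_t V) {
  for (int I = 0; I < 8; ++I)
    Out.push_back(static_cast<uint8_t>(V >> (8 * I)));
}

// Serialize one descriptor as: .quad GUID, .quad hash, .byte length, .ascii name.
std::vector<uint8_t> serializeDesc(const ProbeDesc &D) {
  assert(D.Name.size() <= 255 && "name length must fit in one byte");
  std::vector<uint8_t> Out;
  appendU64LE(Out, D.Guid);
  appendU64LE(Out, D.Hash);
  Out.push_back(static_cast<uint8_t>(D.Name.size()));
  Out.insert(Out.end(), D.Name.begin(), D.Name.end());
  return Out;
}
```

Under these assumptions, a descriptor for `foo` occupies 8 + 8 + 1 + 3 = 20 bytes.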
For each `.pseudo_probe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e., a function with a code entry in the `.text` section). A function record has the following format:
```
FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
            reserved
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees. Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
```
To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree, in the form of a trie, for each outlined function based on the debug metadata. The tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node, and the tree path from the root down to that node is the inline context of the probes. Emission walks the whole tree top-down recursively. The probes of a tree node are emitted together with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings the address is encoded as a delta from the previous probe, except for the first probe. Variable-sized integer encoding, aka LEB128, is used for the address delta and the probe index.
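The probe-record encoding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the emitter from the patch: the `Probe` struct and function names are hypothetical, the packing of TYPE, ATTRIBUTE and ADDRESS_TYPE into one byte (type in the low 4 bits, attribute in the next 3, address type in the top bit) is an assumed bit order, full addresses are assumed little-endian, and probes are assumed to arrive in non-decreasing address order so that an unsigned delta suffices.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Standard unsigned LEB128: 7 payload bits per byte, with the high bit set
// on every byte except the last.
void appendULEB128(std::vector<uint8_t> &Out, uint64_t V) {
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (V != 0);
}

// Hypothetical in-memory probe.
struct Probe {
  uint64_t Index;   // probe index within its function
  uint8_t Type;     // 0 - block probe, 1 - indirect call, 2 - direct call
  uint8_t Attr;     // reserved
  uint64_t Address; // code address the probe is attached to
};

// Encode a list of probe records. The first probe carries a full 64-bit
// address; each later probe carries a ULEB128 delta from the previous one.
std::vector<uint8_t> encodeProbes(const std::vector<Probe> &Probes) {
  std::vector<uint8_t> Out;
  uint64_t Prev = 0;
  bool First = true;
  for (const Probe &P : Probes) {
    assert(P.Type < 16 && P.Attr < 8 && "fields must fit their bit widths");
    appendULEB128(Out, P.Index);
    // Assumed layout: bits 0-3 type, bits 4-6 attribute, bit 7 address type.
    Out.push_back(static_cast<uint8_t>(P.Type | (P.Attr << 4) |
                                       (First ? 0x00 : 0x80)));
    if (First) {
      for (int I = 0; I < 8; ++I)
        Out.push_back(static_cast<uint8_t>(P.Address >> (8 * I)));
    } else {
      assert(P.Address >= Prev && "sketch assumes non-decreasing addresses");
      appendULEB128(Out, P.Address - Prev);
    }
    Prev = P.Address;
    First = false;
  }
  return Out;
}
```

With two block probes at 0x2011a3 and 0x2011a5, the second record shrinks to three bytes (index, packed byte, delta of 2), compared with ten bytes for the first, which is where the size saving comes from.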
**Assembling**
Alternatively, pseudo probes can be printed as assembly directives. This keeps the assembly readable and also provides a view of how optimizations and pseudo probes affect each other, which is especially helpful for diff-time assembly analysis.
A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudo_probe` section in the object file.
An example assembly looks like:
```
foo2: # @foo2
# %bb.0: # %bb0
    pushq %rax
    testl %edi, %edi
    .pseudoprobe 837061429793323041 1 0 0
    je .LBB1_1
# %bb.2: # %bb2
    .pseudoprobe 837061429793323041 6 2 0
    callq foo
    .pseudoprobe 837061429793323041 3 0 0
    .pseudoprobe 837061429793323041 4 0 0
    popq %rax
    retq
.LBB1_1: # %bb1
    .pseudoprobe 837061429793323041 5 1 0
    callq *%rsi
    .pseudoprobe 837061429793323041 2 0 0
    .pseudoprobe 837061429793323041 4 0 0
    popq %rax
    retq
# -- End function
    .section .pseudo_probe_desc,"",@progbits
    .quad 6699318081062747564
    .quad 72617220756
    .byte 3
    .ascii "foo"
    .quad 837061429793323041
    .quad 281547593931412
    .byte 4
    .ascii "foo2"
```
With inlining turned on, the assembly may look different around %bb2 with an inlined probe:
```
# %bb.2: # %bb2
    .pseudoprobe 837061429793323041 3 0
    .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6
    .pseudoprobe 837061429793323041 4 0
    popq %rax
    retq
```
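To make the operand order concrete, here is a small sketch that formats a directive from its parts. It follows the operand order stated in the description (GUID, index, type, attributes, inline context); the function name and the `InlineSite` alias are hypothetical, and repeating ` @ guid:index` for contexts deeper than one level is an assumption, since the examples only show a single level.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical inline-site type: (caller GUID, callsite probe index).
using InlineSite = std::pair<uint64_t, uint64_t>;

// Format one `.pseudoprobe` directive: GUID, index, type, attributes, and an
// optional inline context rendered as " @ guid:index" per site.
std::string formatPseudoProbe(uint64_t Guid, uint64_t Index, unsigned Type,
                              unsigned Attr,
                              const std::vector<InlineSite> &Context = {}) {
  assert(Type <= 2 && "0 - block, 1 - indirect call, 2 - direct call");
  std::string S = ".pseudoprobe " + std::to_string(Guid) + " " +
                  std::to_string(Index) + " " + std::to_string(Type) + " " +
                  std::to_string(Attr);
  for (const InlineSite &Site : Context)
    S += " @ " + std::to_string(Site.first) + ":" +
         std::to_string(Site.second);
  return S;
}
```

For instance, the probe of `foo` inlined at callsite probe 6 of `foo2` would be formatted by passing `{{837061429793323041ULL, 6}}` as the context.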
**Disassembling**
We have a disassembling tool (llvm-profgen) that can display disassembly alongside pseudo probes. So far it only supports ELF executables.
An example disassembly looks like:
```
00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret
```
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D91878
|
#
7ead5f5a |
| 10-Dec-2020 |
Mitch Phillips <31459023+hctim@users.noreply.github.com> |
Revert "[CSSPGO] Pseudo probe encoding and emission."
This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1.
Reason: Broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/2269
|
#
b035513c |
| 08-Dec-2020 |
Hongtao Yu <hoy@fb.com> |
[CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s
Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead.
**ELF object emission**
The binary data to emit are organized as two ELF sections, i.e., the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each of which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as module-level metadata during compilation and is serialized into the object file during object emission.
Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.
The format of `.pseudo_probe_desc` section looks like:
```
    .section .pseudo_probe_desc,"",@progbits
    .quad 6309742469962978389   // Func GUID
    .quad 4294967295            // Func Hash
    .byte 9                     // Length of func name
    .ascii "_Z5funcAi"          // Func name
    .quad 7102633082150537521
    .quad 138828622701
    .byte 12
    .ascii "_Z8funcLeafi"
    .quad 446061515086924981
    .quad 4294967295
    .byte 9
    .ascii "_Z5funcBi"
    .quad -2016976694713209516
    .quad 72617220756
    .byte 7
    .ascii "_Z3fibi"
```
For each `.pseudo_probe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e., a function with a code entry in the `.text` section). A function record has the following format:
```
FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
            reserved
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees. Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
```
To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree, in the form of a trie, for each outlined function based on the debug metadata. The tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node, and the tree path from the root down to that node is the inline context of the probes. Emission walks the whole tree top-down recursively. The probes of a tree node are emitted together with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings the address is encoded as a delta from the previous probe, except for the first probe. Variable-sized integer encoding, aka LEB128, is used for the address delta and the probe index.
**Assembling**
Alternatively, pseudo probes can be printed as assembly directives. This keeps the assembly readable and also provides a view of how optimizations and pseudo probes affect each other, which is especially helpful for diff-time assembly analysis.
A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudo_probe` section in the object file.
An example assembly looks like:
```
foo2: # @foo2
# %bb.0: # %bb0
    pushq %rax
    testl %edi, %edi
    .pseudoprobe 837061429793323041 1 0 0
    je .LBB1_1
# %bb.2: # %bb2
    .pseudoprobe 837061429793323041 6 2 0
    callq foo
    .pseudoprobe 837061429793323041 3 0 0
    .pseudoprobe 837061429793323041 4 0 0
    popq %rax
    retq
.LBB1_1: # %bb1
    .pseudoprobe 837061429793323041 5 1 0
    callq *%rsi
    .pseudoprobe 837061429793323041 2 0 0
    .pseudoprobe 837061429793323041 4 0 0
    popq %rax
    retq
# -- End function
    .section .pseudo_probe_desc,"",@progbits
    .quad 6699318081062747564
    .quad 72617220756
    .byte 3
    .ascii "foo"
    .quad 837061429793323041
    .quad 281547593931412
    .byte 4
    .ascii "foo2"
```
With inlining turned on, the assembly may look different around %bb2 with an inlined probe:
```
# %bb.2: # %bb2
    .pseudoprobe 837061429793323041 3 0
    .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6
    .pseudoprobe 837061429793323041 4 0
    popq %rax
    retq
```
**Disassembling**
We have a disassembling tool (llvm-profgen) that can display disassembly alongside pseudo probes. So far it only supports ELF executables.
An example disassembly looks like:
```
00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret
```
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D91878
|
#
7af80299 |
| 08-Dec-2020 |
Pan, Tao <tao.pan@intel.com> |
[CodeGen] Add text section prefix for COFF object file
The text section prefix is created in CodeGenPrepare, which is independent of the object file format. The text section name is written into the object file in TargetLoweringObjectFile, which is file-format dependent. This patch ports the code that adds the text section prefix to the text section name from ELF to COFF. Unlike ELF, which uses '.' as the concatenation character, COFF uses '$'. Because the concatenation character varies, it is split out of the text section prefix. Text section prefixes are an existing ELF feature: they can help reduce icache and iTLB misses, and they make it possible to aggregate sections with the same prefix created by other compilers, e.g. V8. Furthermore, the recent Machine Function Splitter feature (basic-block-level text prefix sections) is based on text section prefixes.
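The idea of keeping the separator out of the prefix can be sketched as below. This is an illustration only, not the TargetLoweringObjectFile code: the `ObjectFormat` enum and `textSectionName` function are hypothetical names, and the base name `.text` and the prefix `hot` are assumed example values.

```cpp
#include <cassert>
#include <string>

// Hypothetical object-format tag for this sketch.
enum class ObjectFormat { ELF, COFF };

// Join the base text section name and a separator-free prefix, choosing the
// concatenation character by object format: '.' for ELF, '$' for COFF.
std::string textSectionName(ObjectFormat Fmt, const std::string &Prefix) {
  std::string Name = ".text";
  if (Prefix.empty())
    return Name;
  assert(Prefix.find_first_of(".$") == std::string::npos &&
         "the prefix itself carries no separator");
  Name += (Fmt == ObjectFormat::ELF) ? '.' : '$';
  Name += Prefix;
  return Name;
}
```

Because the prefix no longer embeds the separator, the same prefix value can be shared by both formats and only the join step differs.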
Reviewed By: pengfei, rnk
Differential Revision: https://reviews.llvm.org/D92073
|
#
a65d8c5d |
| 02-Dec-2020 |
jasonliu <jasonliu.development@gmail.com> |
[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX
Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences are:
1. AIX does not have CFI instructions.
2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality routine, because the interoperability is not there yet on AIX.
3. AIX does not use eh_frame sections. Instead, it uses an eh_info section (compact unwind section) to store the information about the personality routine and the LSDA data address.
Reviewed By: daltenty, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D91455
|
Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
#
a97f6283 |
| 01-Apr-2020 |
Leonard Chan <leonardchan@google.com> |
[llvm][IR] Add dso_local_equivalent Constant
The `dso_local_equivalent` constant is a wrapper for functions that represents a value which is functionally equivalent to the global passed to this. That is, if this accepts a function, calling this constant should have the same effects as calling the function directly. This could be a direct reference to the function, the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the resolved function as a call target.
When lowered, the returned address must have a constant offset at link time from some other symbol defined within the same binary. The address of this value is also insignificant. The name is leveraged from `dso_local` where use of a function or variable is resolved to a symbol in the same linkage unit.
In this patch:
- Addition of `dso_local_equivalent` and handling it
- Update Constant::needsRelocation() to strip constant inbound GEPs and take advantage of `dso_local_equivalent` for relative references
This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959) which makes vtables readonly. This works by replacing the dynamic relocations for function pointers in them with static relocations that represent the offset between the vtable and virtual functions. If a function is externally defined, `dso_local_equivalent` can be used as a generic wrapper for the function to still allow for this static offset calculation to be done.
See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details.
Differential Revision: https://reviews.llvm.org/D77248
|
#
a2233541 |
| 13-Nov-2020 |
Yuanfang Chen <yuanfang.chen@sony.com> |
[CGProfile] allows bitcast in metadata node storing function pointers
For example, during RAUW in IRMover, the `Function` ValueAsMetadata in "CG Profile" could become bitcast.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D88433
|
#
42d21093 |
| 09-Nov-2020 |
jasonliu <jasonliu.development@gmail.com> |
[XCOFF] Enable explicit sections on AIX
Implement mechanism to allow explicit sections to be generated on AIX.
Reviewed By: DiggerLin
Differential Revision: https://reviews.llvm.org/D88615
|
#
f77c8a48 |
| 03-Nov-2020 |
Jessica Clarke <jrtc27@jrtc27.com> |
[CodeGen] Fix regression from D83655
Arm EHABI has a null LSDASection as it does its own thing, so we should continue to return null in that case rather than try and cast it.
|
#
ee5d1a04 |
| 02-Nov-2020 |
Fangrui Song <maskray@google.com> |
[AsmPrinter] Split up .gcc_except_table
MC currently produces monolithic .gcc_except_table section. GCC can split up .gcc_except_table:
* if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat`
* otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits`
This ensures that (a) non-prevailing copies are discarded and (b) .gcc_except_table associated to discarded text sections can be discarded by a .gcc_except_table-aware linker (GNU ld, but not gold or LLD)
This patch matches the GCC behavior. If -fno-unique-section-names is specified, we don't append the suffix. If -ffunction-sections is additionally specified, use `.section ...,unique`.
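The naming rule described above can be summarized in a short sketch. It models only the section name, not the flags or the `unique` ID, and it is not the code from the patch: `LSDAOptions` and `lsdaSectionName` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical bundle of the inputs that influence the LSDA section name.
struct LSDAOptions {
  bool InComdat;           // the function is in a COMDAT group
  bool FunctionSections;   // -ffunction-sections
  bool UniqueSectionNames; // false when -fno-unique-section-names is given
};

// Pick the .gcc_except_table section name for a function: the section is
// split per function for comdat functions or under -ffunction-sections, and
// the function-name suffix is appended only when unique names are enabled.
std::string lsdaSectionName(const std::string &Func, const LSDAOptions &O) {
  assert(!Func.empty() && "expects a (mangled) function name");
  std::string Name = ".gcc_except_table";
  bool Split = O.InComdat || O.FunctionSections;
  if (Split && O.UniqueSectionNames)
    Name += "." + Func;
  return Name;
}
```

In the no-suffix case the per-function sections share one name, which is why the description falls back to `.section ...,unique` to keep them distinct.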
Note, if clang driver communicates that the linker is LLD and we know it is new (11.0.0 or later) we can use SHF_LINK_ORDER to avoid string table costs, at least in the -fno-unique-section-names case. We cannot use it on GNU ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER components in an output section https://sourceware.org/bugzilla/show_bug.cgi?id=26256
For RISC-V -mrelax, this patch additionally fixes an assembler-linker interaction problem: because a section is shrinkable, the length of a call-site code range is not a constant. Relocations referencing the associated text section (STT_SECTION) are needed. However, a STB_LOCAL relocation referencing a discarded section group member from outside the group is disallowed by the ELF specification (PR46675):
```
// a.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int main() { return comdat(); }

// b.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int foo() { return comdat(); }

clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax
ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section:
```
-fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations.
Reviewed By: jrtc27, rahmanl
Differential Revision: https://reviews.llvm.org/D83655
|
#
77638a53 |
| 07-Oct-2020 |
Snehasish Kumar <snehasishk@google.com> |
[llvm] Set the default for -bbsections-cold-text-prefix to .text.split.
After using this for a while, we find that it is generally useful to have it set to .text.split. by default, removing the need for an additional -mllvm option.
Differential Revision: https://reviews.llvm.org/D88997
|
#
78a9e62a |
| 01-Oct-2020 |
jasonliu <jasonliu.development@gmail.com> |
[XCOFF] Enable -fdata-sections on AIX
Summary: Some design decisions worth noting:
I've noticed a recent mailing list discussion about why string literals are not affected by -fdata-sections for ELF targets: http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html
But on AIX, our linker cannot split mergeable strings the way other targets can. So I think it makes more sense for us to emit a separate csect for every mergeable string in -fdata-sections mode, as there might not be another way for the linker to garbage-collect unused mergeable strings.
Reviewed By: daltenty, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D88339
|
#
6c91e623 |
| 29-Sep-2020 |
Zequan Wu <zequanwu@google.com> |
[CodeGen] emit CG profile for COFF object file
Differential Revision: https://reviews.llvm.org/D87811
|
#
a2578e92 |
| 28-Sep-2020 |
Arthur Eubanks <aeubanks@google.com> |
Revert "Reland [CodeGen] emit CG profile for COFF object file"
This reverts commit 506b6170cb513f1cb6e93a3b690c758f9ded18ac.
This still causes link errors, see https://crbug.com/1130780.
|
#
d2696dec |
| 17-Sep-2020 |
Snehasish Kumar <snehasishk@google.com> |
[llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section.
This change adds an option to basic block sections to allow cold clusters to be assigned a custom text prefix. With a custom prefix such as ".text.split." (D87840), lld can place them in a separate output section. The benefits are -
* Empirically shown to improve icache and itlb metrics by 3-5% (absolute) compared to placing split parts in .text.unlikely.
* Mitigates against poor profiles, e.g. samplePGO profiles used with the machine function splitter. Optimizations such as hugepage remapping can make different decisions at the section granularity.
* Enables section granularity hotness monitoring (checking on the decisions made during compilation vs sample data from production).
Differential Revision: https://reviews.llvm.org/D87813
|
#
506b6170 |
| 23-Sep-2020 |
Zequan Wu <zequanwu@google.com> |
Reland [CodeGen] emit CG profile for COFF object file
This reverts commit 90242caca2074dab5a9b76e5bc36d9fafd2179a7.
Error fixed at f5435399e823746bbe1737b95c853d77a42e1ac3
Differential Revision: https://reviews.llvm.org/D87811
|
#
90242cac |
| 22-Sep-2020 |
Reid Kleckner <rnk@google.com> |
Revert "[CodeGen] emit CG profile for COFF object file"
This reverts commit 91aed9bf975f1e4346cc8f4bdefc98436386ced2, it is causing link errors.
|
#
9932561b |
| 18-Sep-2020 |
Reid Kleckner <rnk@google.com> |
[COFF] Move per-global .drective emission from AsmPrinter to TLOFCOFF
This changes the order of output sections and the output assembly, but is otherwise NFC.
It simplifies the TLOF interface by removing two COFF-only methods.
|
#
91aed9bf |
| 17-Sep-2020 |
Zequan Wu <zequanwu@google.com> |
[CodeGen] emit CG profile for COFF object file
I forgot to add emission of CG profile for COFF object file, when adding the support (https://reviews.llvm.org/D81775)
Differential Revision: https://reviews.llvm.org/D87811
|
#
82d07497 |
| 25-Aug-2020 |
Fangrui Song <maskray@google.com> |
[TargetLoweringObjectFileImpl] Make .llvmbc and .llvmcmd non-SHF_ALLOC
There are two ways .llvmbc can be produced:
* clang -c -fembed-bitcode=all (which also produces .llvmcmd)
* LTO backend: ld.lld -mllvm -lto-embed-bitcode or -plugin-opt=-lto-embed-bitcode
.llvmbc and .llvmcmd have the SHF_ALLOC flag, so they can be dropped by --gc-sections.
This patch sets SectionKind::Metadata to drop the SHF_ALLOC flag. This is conceptually correct: the two sections are not part of the process image, so SHF_ALLOC is not appropriate.
`test/LTO/X86/embed-bitcode.ll`: changed `llvm-objcopy -O binary --only-section` to `llvm-objcopy --dump-section`. `-O binary` does not dump non-SHF_ALLOC sections.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D86374
|
#
e9ac1495 |
| 11-Aug-2020 |
diggerlin <digger.llvm@gmail.com> |
[AIX][XCOFF] change the operand of branch instruction from symbol name to qualified symbol name for function declarations
SUMMARY:
1. In this patch, remove the setting of the storage class in the function getXCOFFSection and in the constructor of class MCSectionXCOFF. The MCSectionXCOFF class has the attributes
XCOFF::StorageMappingClass MappingClass; XCOFF::SymbolType Type; XCOFF::StorageClass StorageClass;
and these attributes are only used in the XCOFFObjectWriter (the asm path does not need the StorageClass).
We need to get the values of StorageClass, Type, and MappingClass every time before we invoke getXCOFFSection.
Actually, we can get the StorageClass of the MCSectionXCOFF from its delegated symbol.
2. We also change the operand of the branch instruction from the symbol name to the qualified symbol name. For example, change `bl .foo` and `extern .foo` to `bl .foo[PR]` and `extern .foo[PR]`.
3. If there is a reference that indirectly calls a function bar, we also add `extern .bar[PR]`.
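As a small illustration of item 2, a qualified symbol name is the plain name followed by its storage mapping class in brackets. The helper below is hypothetical and only shows the string shape; it is not the MC-layer code touched by the patch.

```cpp
#include <cassert>
#include <string>

// Build a qualified XCOFF symbol name such as ".foo[PR]" from a symbol name
// and a storage mapping class mnemonic (hypothetical helper).
std::string qualifiedSymbolName(const std::string &Sym,
                                const std::string &Smc) {
  assert(!Sym.empty() && !Smc.empty() && "both parts are required");
  return Sym + "[" + Smc + "]";
}
```

Here `qualifiedSymbolName(".foo", "PR")` yields `.foo[PR]`, the form used for both the branch operand and the extern declaration.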
Reviewers: Jason liu, Xiangling Liao
Differential Revision: https://reviews.llvm.org/D84765
|
#
20abff04 |
| 10-Aug-2020 |
jasonliu <jasonliu.development@gmail.com> |
[XCOFF][AIX] Use TE storage mapping class when large code model is enabled
Summary: Use TE SMC instead of TC SMC in large code model mode, so that large code model TOC entries could get placed after all the small code model TOC entries, which reduces the chance of TOC overflow.
Reviewed By: Xiangling_L
Differential Revision: https://reviews.llvm.org/D85455
|
#
6ef801aa |
| 15-Jul-2020 |
Xiangling Liao <Xiangling.Liao@ibm.com> |
[AIX] Static init frontend recovery and backend support
On the frontend side, this patch recovers AIX static init implementation to use the linkage type and function names Clang chooses for sinit related function.
On the backend side, this patch sets correct linkage and function names on aliases created for sinit/sterm functions.
Differential Revision: https://reviews.llvm.org/D84534
|