#
61ea63ba |
| 29-Jan-2025 |
quic-areg <aregmi@quicinc.com> |
[Hexagon] Add support for decoding PLT symbols (#123425)
Describes PLT entries for hexagon.
|
Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
8b37c1c7 |
| 20-Dec-2024 |
Ikhlas Ajbar <iajbar@quicinc.com> |
[Hexagon] Add V75 support to compiler and assembler (#120773)
This patch introduces support for the Hexagon V75 architecture. It
includes instruction formats, definitions, encodings, scheduling
classes, and builtins/intrinsics.
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
e9c8106a |
| 20-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[Object] Remove unused includes (NFC) (#116750)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.4 |
|
#
a6fc489b |
| 18-Nov-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add gfx950 subtarget definitions (#116307)
Mostly a stub, but adds some baseline tests and tests for removed instructions.
|
#
de0fd64b |
| 13-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Introduce a new generic target `gfx9-4-generic` (#115190)
This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.
|
Revision tags: llvmorg-19.1.3 |
|
#
076aac59 |
| 23-Oct-2024 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Add a new target for gfx1153 (#113138)
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3 |
|
#
03958680 |
| 07-Aug-2024 |
yonghong-song <yhs@fb.com> |
[BPF] Make llvm-objdump disasm default cpu v4 (#102166)
Currently, with the following example,
    $ cat t.c
    void foo(int a, _Atomic int *b)
    {
      *b &= a;
    }
    $ clang --target=bpf -O2 -c -mcpu=v3 t.c
    $ llvm-objdump -d t.o
    t.o: file format elf64-bpf
    Disassembly of section .text:
    0000000000000000 <foo>:
           0: c3 12 00 00 51 00 00 00 <unknown>
           1: 95 00 00 00 00 00 00 00 exit
Basically, the default cpu for llvm-objdump is v1, so it cannot decode
the insn properly.
If we add --mcpu=v3 to the llvm-objdump command line, we get
    $ llvm-objdump -d --mcpu=v3 t.o
    t.o: file format elf64-bpf
    Disassembly of section .text:
    0000000000000000 <foo>:
           0: c3 12 00 00 51 00 00 00 w1 = atomic_fetch_and((u32 *)(r2 + 0x0), w1)
           1: 95 00 00 00 00 00 00 00 exit
The atomic_fetch_and insn is now decoded properly. The latest cpu
version, --mcpu=v4, decodes it just as --mcpu=v3 does above.
To avoid the '<unknown>' decoding with a plain 'llvm-objdump -d t.o',
this patch sets the default cpu for llvm-objdump to the current highest
cpu number, v4, in ELFObjectFileBase::tryGetCPUName(). The cpu number in
ELFObjectFileBase::tryGetCPUName() will be adjusted in the future if the
cpu number is increased (e.g. to v5). Such an approach also aligns with
gcc-bpf as discussed in [1].
Six bpf unit tests are affected by this change. I changed the test output
for three unit tests and added --mcpu=v1 to the other three unit tests,
to demonstrate the default (cpu v4) behavior and the explicit --mcpu=v1
behavior.
[1]
https://lore.kernel.org/bpf/6f32c0a1-9de2-4145-92ea-be025362182f@linux.dev/T/#m0f7e63c390bc8f5a5523e7f2f0537becd4205200
Co-authored-by: Yonghong Song <yonghong.song@linux.dev>
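The fallback rule described in this commit can be sketched as follows. This is a minimal illustration only, not the code in ELFObjectFileBase::tryGetCPUName(); the helper name `pickBpfCpu` and its `std::optional` parameter are assumptions made for the example.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not LLVM's API): choose the CPU name a BPF
// disassembler should use for decoding.
std::string pickBpfCpu(const std::optional<std::string> &UserMcpu) {
  // Taken from the commit text: v4 is the current highest cpu number.
  static const char *const HighestKnownCpu = "v4";
  if (UserMcpu && !UserMcpu->empty())
    return *UserMcpu;     // an explicit --mcpu always wins
  return HighestKnownCpu; // otherwise default to the newest cpu
}
```

With no explicit value the sketch returns "v4"; passing "v1" reproduces the old, explicit behavior that three of the tests now request.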
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
2f37a22f |
| 08-Jul-2024 |
Fangrui Song <i@maskray.me> |
[llvm-objdump] -r: support CREL
Extract the llvm-readelf decoder to `decodeCrel` (#91280) and reuse it for llvm-objdump.
Because the section representation of LLVMObject (`SectionRef`) is 64-bit, insufficient to hold all decoder states, `section_rel_begin` is modified to decode CREL eagerly and hold the decoded relocations inside ELFObjectFile<ELFT>.
The test is adapted from llvm/test/tools/llvm-readobj/ELF/crel.test.
Pull Request: https://github.com/llvm/llvm-project/pull/97382
|
Revision tags: llvmorg-18.1.8 |
|
#
1ca0055f |
| 06-Jun-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Add a new target gfx1152 (#94534)
|
Revision tags: llvmorg-18.1.7 |
|
#
775f1cd3 |
| 31-May-2024 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Add gfx12-generic target (#93875)
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
733a8778 |
| 23-Apr-2024 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Split code that tablegen needs out of RISCVISAInfo. (#89684)
This introduces a new file, RISCVISAUtils.cpp and moves the rest of
RISCVISAInfo to the TargetParser library.
This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
31f4b329 |
| 19-Mar-2024 |
quic-areg <aregmi@quicinc.com> |
[Hexagon] ELF attributes for Hexagon (#85359)
Defines a subset of attributes and emits them to a section called
.hexagon.attributes.
The current attributes recorded are the attributes needed by
llvm-objdump to automatically determine target features and eliminate
the need to manually pass features.
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
43c7eb5d |
| 14-Feb-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Replace '.' with '-' in generic target names (#81718)
The dot is too confusing for tools. Output temporaries would have
'10.3-generic' in their names, so tools could parse it as an extension;
device libs and the associated clang driver logic are also confused by
the dot.
After discussions, we decided it's better to just remove the '.' from
the target name than to fix each issue one by one.
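The extension problem mentioned above is easy to reproduce with a naive splitter. The sketch below is illustrative only: `naiveExtension` and the file stems used with it are hypothetical, not taken from any LLVM tool.

```cpp
#include <string>

// Hypothetical helper: return everything after the last '.', the way a
// simple tool might guess a file extension. Empty if there is no dot.
std::string naiveExtension(const std::string &FileName) {
  const std::string::size_type Dot = FileName.rfind('.');
  return Dot == std::string::npos ? std::string() : FileName.substr(Dot + 1);
}
```

For a made-up temporary named `kernel-gfx10.3-generic`, this reports the extension `3-generic`; with the dash spelling `kernel-gfx10-3-generic` it reports none.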
|
#
f93aa515 |
| 12-Feb-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (#76955)
These generic targets include multiple GPUs and will, in the future,
provide a way to build once and run on multiple GPUs, at the cost of
fewer optimization opportunities.
Note that this is just doing the compiler side of things; device libs and
runtimes/loader/etc. don't know about these targets yet, so none of them
actually work in practice right now. This is just the initial commit to
make LLVM aware of them.
This contains the documentation changes for both this change and #76954
as well.
|
#
8c37e3e6 |
| 07-Feb-2024 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Only set Zca flag for EF_RISCV_RVC in ELFObjectFileBase::getRISCVFeatures(). (#80928)
This code appears to be a hack to set the features to include compressed
instructions if the ELF EFLAGS flag bit is present, but the ELF
attribute for the ISA string is not present or not accurate.
We can't remove the hack because llvm-mc doesn't create ELF attributes
by default, so a lot of tests fail to disassemble properly. Using clang
as the assembler does set the attributes.
This patch changes the hack to only set Zca since that is the minimum
implied by the flag. Setting anything else potentially conflicts with
the ISA string containing Zcmp or Zcmt.
JITLink also needs to be updated to recognize Zca in addition to C.
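A minimal sketch of the rule this commit describes: map the EF_RISCV_RVC bit to Zca and nothing more. The function `featuresFromEFlags` and the plain string vector are stand-ins assumed for illustration; the real code lives in ELFObjectFileBase::getRISCVFeatures() and uses LLVM's own feature container.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Value of EF_RISCV_RVC as defined by the RISC-V ELF psABI.
constexpr std::uint32_t EF_RISCV_RVC = 0x0001;

// Hypothetical stand-in for the feature list built from e_flags.
std::vector<std::string> featuresFromEFlags(std::uint32_t EFlags) {
  std::vector<std::string> Features;
  // Only Zca is implied by the flag; adding more could conflict with an
  // ISA string that contains Zcmp or Zcmt.
  if (EFlags & EF_RISCV_RVC)
    Features.push_back("+zca");
  return Features;
}
```

Keeping the inferred set minimal means the ISA string from the ELF attributes, when present, remains the authority on which other compressed extensions are enabled.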
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1 |
|
#
43ab40a5 |
| 25-Jan-2024 |
Alexandre Ganea <alex_toresh@yahoo.fr> |
[llvm] Silence warning when building with Clang ToT
This fixes:
```
[1343/7452] Building CXX object lib\Object\CMakeFiles\LLVMObject.dir\ELFObjectFile.cpp.obj
C:\git\llvm-project\llvm\lib\Object\ELFObjectFile.cpp(808,27): warning: comparison of integers of different signs: 'unsigned int' and '_Iter_diff_t<const Elf_Shdr_Impl<ELFType<llvm::endianness::little, false>> *>' (aka 'int') [-Wsign-compare]
  808 |     if (*TextSectionIndex != std::distance(Sections.begin(), *TextSecOrErr))
      |         ~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\git\llvm-project\llvm\lib\Object\ELFObjectFile.cpp(913,12): note: in instantiation of function template specialization 'readBBAddrMapImpl<llvm::object::ELFType<llvm::endianness::little, false>>' requested here
  913 |     return readBBAddrMapImpl(Obj->getELFFile(), TextSectionIndex, PGOAnalyses);
      |            ^
```
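The commit text does not say how the warning was silenced. One common way to avoid a -Wsign-compare diagnostic of this shape is to convert the signed result of `std::distance` once, so that both operands are unsigned. The sketch below assumes that approach; `isAtIndex` is a hypothetical helper, not code from ELFObjectFile.cpp.

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: true if It points at element number Index of V.
bool isAtIndex(const std::vector<int> &V, std::vector<int>::const_iterator It,
               unsigned Index) {
  // std::distance returns a signed difference_type; convert it once so the
  // comparison below is between two unsigned values.
  const auto Pos = static_cast<std::size_t>(std::distance(V.begin(), It));
  return Index == Pos;
}
```

The cast is safe here because `It` is assumed to lie within `V`, so the distance is never negative.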
|
Revision tags: llvmorg-19-init |
|
#
c0675248 |
| 19-Jan-2024 |
Aiden Grossman <agrossman154@yahoo.com> |
[SHT_LLVM_BB_ADDR_MAP] Add assertion and clarify docstring (#77374)
This patch adds an assertion to readBBAddrMapImpl to confirm that
PGOAnalyses and BBAddrMaps are of the same size when PGO information is
requested (part of the API contract). This patch also updates the
docstring for readBBAddrMap to better clarify what is guaranteed.
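The contract described above (one PGO entry per address map whenever PGO data is requested) can be sketched like this. All names and types are hypothetical simplifications; the real readBBAddrMapImpl decodes an ELF section and has a different signature.

```cpp
#include <cassert>
#include <vector>

struct BBAddrMapSketch { unsigned long long Addr; };             // hypothetical
struct PGOAnalysisSketch { unsigned long long FuncEntryCount; }; // hypothetical

// Builds one map per function address. When PGOAnalyses is non-null, it is
// filled with exactly one entry per map.
std::vector<BBAddrMapSketch>
readMapsSketch(const std::vector<unsigned long long> &FuncAddrs,
               std::vector<PGOAnalysisSketch> *PGOAnalyses) {
  std::vector<BBAddrMapSketch> Maps;
  if (PGOAnalyses)
    PGOAnalyses->clear(); // start from a known state
  for (unsigned long long Addr : FuncAddrs) {
    Maps.push_back({Addr});
    if (PGOAnalyses)
      PGOAnalyses->push_back({0}); // placeholder entry keeps indices aligned
  }
  // API contract: the two vectors line up index by index when requested.
  assert((!PGOAnalyses || PGOAnalyses->size() == Maps.size()) &&
         "PGO analyses and BB address maps must have the same size");
  return Maps;
}
```

A caller that passes `nullptr` pays nothing for the PGO data, while a caller that passes a vector can index it in lockstep with the returned maps.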
|
#
db78c30b |
| 09-Jan-2024 |
Luke Lau <luke@igalia.com> |
[RISCV] Deduplicate RISCVISAInfo::toFeatures/toFeatureVector. NFC (#76942)
toFeatures and toFeatureVector both output a list of target feature
flags, just with a slightly different interface. toFeatures keeps any
unsupported extensions, and also provides a way to append negative
extensions (AddAllExtensions=true).
This patch combines them into one function, so that a later patch will
be able to get a std::vector of features that includes all the
negative extensions, which was previously only possible through the
StrAlloc interface.
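As a rough illustration of the combined interface, the sketch below returns a plain vector of feature strings and appends negative entries on request. `toFeaturesSketch` and its parameters are assumptions for the example; RISCVISAInfo's real function has a different signature and also handles unsupported extensions.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch: "+ext" for each enabled extension, and, when
// AddAllExtensions is true, "-ext" for each known extension that is not
// enabled.
std::vector<std::string>
toFeaturesSketch(const std::set<std::string> &Enabled,
                 const std::vector<std::string> &AllKnown,
                 bool AddAllExtensions) {
  std::vector<std::string> Features;
  for (const std::string &Ext : Enabled)
    Features.push_back("+" + Ext);
  if (AddAllExtensions)
    for (const std::string &Ext : AllKnown)
      if (!Enabled.count(Ext))
        Features.push_back("-" + Ext);
  return Features;
}
```

Returning a `std::vector` directly is what lets callers obtain the negative extensions without going through a string-allocator callback.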
|
#
deab58d1 |
| 20-Dec-2023 |
Joseph Huber <huberjn@outlook.com> |
[ELF] Add CPU name detection for CUDA architectures (#75964)
Summary: Recently we added support for detecting the CUDA processor with the ELF flags. This allows us to get a string representation of it in other code. This will be used by the offloading runtime.
|
#
105adf2c |
| 12-Dec-2023 |
Micah Weston <micahsweston@gmail.com> |
[SHT_LLVM_BB_ADDR_MAP] Implements PGOAnalysisMap in Object and ObjectYAML with tests.
Reviewed in PR (#71750). A part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch and Block Analysis](https://discourse.llvm.org/t/rfc-pgo-accuracy-metrics-emitting-and-evaluating-branch-and-block-analysis/73902).
This PR adds the PGOAnalysisMap data structure and implements encoding and
decoding through Object and ObjectYAML along with associated tests. When
emitted into the bb-addr-map section, each function is followed by the associated
pgo-analysis-map for that function. The emitting of each analysis in the map is
controlled by a bit in the bb-addr-map feature byte. All existing bb-addr-map
code can ignore the pgo-analysis-map if the caller does not request the data.
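The idea of one feature-byte bit per analysis can be sketched as below. The struct, the function name, and the specific bit positions are assumptions for illustration; the commit text only states that each analysis is controlled by a bit.

```cpp
#include <cstdint>

// Hypothetical mirror of the feature byte. The bit assignments below are an
// assumption for this sketch, not quoted from the patch.
struct FeaturesSketch {
  bool FuncEntryCount; // assumed bit 0
  bool BBFreq;         // assumed bit 1
  bool BrProb;         // assumed bit 2
};

// Decode the per-analysis enable bits from a single feature byte.
FeaturesSketch decodeFeatureByte(std::uint8_t Byte) {
  return {(Byte & 0x1) != 0, (Byte & 0x2) != 0, (Byte & 0x4) != 0};
}
```

A reader that does not want PGO data still has to consult these bits to know how many bytes to skip after each function's entries.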
|
Revision tags: llvmorg-17.0.6 |
|
#
cf1e0c0b |
| 23-Nov-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Define new targets gfx1200 and gfx1201 (#73133)
Define target names and ELF numbers for new GFX12 targets gfx1200 and
gfx1201. For now they behave identically to GFX11.
|
Revision tags: llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
92542f2a |
| 17-Jul-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Add targets gfx1150 and gfx1151
This is the target definition only. Currently they are treated the same as GFX 11.0.x.
Differential Revision: https://reviews.llvm.org/D155429
|
#
a4d1259e |
| 13-Jul-2023 |
Fangrui Song <i@maskray.me> |
[llvm-objdump] Default to --mcpu=future for PPC32
Extend D127824 to the 32-bit Power architecture. AFAICT GNU objdump -d dumps all instructions for 32-bit as well.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D155114
#
8de9f2b5 |
| 26-Jun-2023 |
Job Noorman <jnoorman@igalia.com> |
Move SubtargetFeature.h from MC to TargetParser
SubtargetFeature.h is currently part of MC while it doesn't depend on anything in MC. Since some LLVM components might have the need to work with target features without necessarily needing MC, it might be worthwhile to move SubtargetFeature.h to a different location. This will reduce the dependencies of said components.
Note that I chose TargetParser as the destination because that's where Triple lives and SubtargetFeatures feels related to that.
This issue came up during a JITLink review (D149522). JITLink would like to avoid a dependency on MC while still needing to store target features.
Reviewed By: MaskRay, arsenm
Differential Revision: https://reviews.llvm.org/D150549
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4 |
|
#
9e37a7bd |
| 16-May-2023 |
Fangrui Song <i@maskray.me> |
[llvm-objdump][X86] Add @plt symbols for .plt.got
If a symbol needs both JUMP_SLOT and GLOB_DAT relocations, there is a minor linker optimization to keep just GLOB_DAT. This optimization is only implemented by GNU ld's x86 port and mold. https://maskray.me/blog/2021-08-29-all-about-global-offset-table#combining-.got-and-.got.plt
With the optimization, the PLT entry is placed in .plt.got and the associated GOTPLT entry is placed in .got (ld.bfd -z now) or .got.plt (ld.bfd -z lazy). The relocation is in .rel[a].dyn.
This patch synthesizes `symbol@plt` labels for these .plt.got entries.
Example:
```
cat > a.s <<e
.globl _start; _start:
mov combined0@gotpcrel(%rip), %rax; mov combined1@gotpcrel(%rip), %rax
call combined0@plt; call combined1@plt
call foo0@plt; call foo1@plt
e
cat > b.s <<e
.globl foo0, foo1, combined0, combined1
foo0: foo1: combined0: combined1:
e
gcc -fuse-ld=bfd -shared b.s -o b.so
gcc -fuse-ld=bfd -pie -nostdlib a.s b.so -o a
```
```
Disassembly of section .plt:

0000000000001000 <.plt>:
    1000: ff 35 ea 1f 00 00    pushq 0x1fea(%rip)    # 0x2ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
    1006: ff 25 ec 1f 00 00    jmpq *0x1fec(%rip)    # 0x2ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
    100c: 0f 1f 40 00          nopl (%rax)

0000000000001010 <foo1@plt>:
    1010: ff 25 ea 1f 00 00    jmpq *0x1fea(%rip)    # 0x3000 <_GLOBAL_OFFSET_TABLE_+0x18>
    1016: 68 00 00 00 00       pushq $0x0
    101b: e9 e0 ff ff ff       jmp 0x1000 <.plt>

0000000000001020 <foo0@plt>:
    1020: ff 25 e2 1f 00 00    jmpq *0x1fe2(%rip)    # 0x3008 <_GLOBAL_OFFSET_TABLE_+0x20>
    1026: 68 01 00 00 00       pushq $0x1
    102b: e9 d0 ff ff ff       jmp 0x1000 <.plt>

Disassembly of section .plt.got:

0000000000001030 <combined0@plt>:
    1030: ff 25 a2 1f 00 00    jmpq *0x1fa2(%rip)    # 0x2fd8 <foo1+0x2fd8>
    1036: 66 90                nop

0000000000001038 <combined1@plt>:
    1038: ff 25 a2 1f 00 00    jmpq *0x1fa2(%rip)    # 0x2fe0 <foo1+0x2fe0>
    103e: 66 90                nop
```
For x86-32, with -z now, if we remove `foo0` and `foo1`, the absence of a regular PLT will cause GNU ld to omit .got.plt, and our code cannot synthesize @plt labels. This is an extreme corner case that almost never happens in practice (to trigger the case, ensure every PLT symbol has its address taken). To fix it, we can get the `_GLOBAL_OFFSET_TABLE_` symbol value, but the complexity is not worth it.
Close https://github.com/llvm/llvm-project/issues/62537
Reviewed By: bd1976llvm
Differential Revision: https://reviews.llvm.org/D149817
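The address arithmetic behind the `# 0x2fd8` annotations, and the way a label could be derived from it, can be sketched as follows. This is an illustration under assumptions: both function names and the slot-to-symbol map are hypothetical, and the patch's actual matching logic may differ.

```cpp
#include <cstdint>
#include <map>
#include <string>

// GOT slot referenced by a 6-byte `jmpq *disp32(%rip)` located at Addr.
// RIP-relative operands are relative to the end of the instruction.
std::uint64_t gotSlotForPltEntry(std::uint64_t Addr, std::int32_t Disp) {
  return Addr + 6 + static_cast<std::int64_t>(Disp);
}

// Hypothetical lookup: GotSlotToSymbol stands for a table built from the
// GLOB_DAT relocations in .rel[a].dyn. Returns "" when no symbol matches.
std::string
pltLabel(std::uint64_t Addr, std::int32_t Disp,
         const std::map<std::uint64_t, std::string> &GotSlotToSymbol) {
  const auto It = GotSlotToSymbol.find(gotSlotForPltEntry(Addr, Disp));
  return It == GotSlotToSymbol.end() ? std::string() : It->second + "@plt";
}
```

For the .plt.got entry at 0x1030 in the example, 0x1030 + 6 + 0x1fa2 gives 0x2fd8, which matches the address in the disassembly comment for `combined0@plt`.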
|