X86.cpp - OpenGrok history log for /llvm-project/clang/lib/Basic/Targets/X86.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 8e659401	08-Jan-2025	Alexandros Lamprineas <alexandros.lamprineas@arm.com>	[FMV][AArch64] Simplify version selection according to ACLE. (#121921) Currently, the more features a version has, the higher its priority is. We are changing ACLE https://github.com/ARM-software/acle/pull/370 as follows: "Among any two versions, the higher priority version is determined by identifying the highest priority feature that is specified in exactly one of the versions, and selecting that version."
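The selection rule quoted in 8e659401 can be sketched as follows. This is an illustration only, not the code from that commit: the `FeatureMask` encoding (bit i set means the version specifies feature i, and a higher bit index means a higher priority) and the `outranks` helper are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding (not Clang's): bit i set means the version specifies
// feature i, and a higher bit index means a higher-priority feature.
using FeatureMask = std::uint64_t;

// Returns true if version A has higher priority than version B under the
// quoted rule: find the highest-priority feature specified in exactly one of
// the two versions, and select the version that specifies it.
bool outranks(FeatureMask A, FeatureMask B) {
  FeatureMask OnlyInOne = A ^ B; // features specified in exactly one version
  if (OnlyInOne == 0)
    return false; // identical feature sets: neither version outranks the other

  // Isolate the highest set bit of OnlyInOne.
  FeatureMask Top = 1;
  while (OnlyInOne >>= 1)
    Top <<= 1;

  return (A & Top) != 0;
}
```

Under this encoding, a version with the single feature `0x4` outranks a version with the two lower-priority features `0x3`, whereas the older "more features wins" ordering described in the commit message would have preferred the latter.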
# a774adb0	05-Jan-2025	Chandler Carruth <chandlerc@gmail.com>	Bulk port 64-bit x86 builtins to TableGen (#121043) This PR follows https://github.com/llvm/llvm-project/pull/120831 for x86-64. Similar to that PR, this does a very mechanical port of X86 builtins to TableGen. There is a lot of improvement available here to use TableGen more effectively and collapse repeated structures. But those can now be follow-up PRs that restructure within the `.td` file. The current structure produces a file that exactly matches the original X-macros except for the differences outlined in https://github.com/llvm/llvm-project/pull/120831: - Horizontal whitespace - `long long` types now use `long long` outside of OpenCL, but switch to `long` in OpenCL where relevant. Otherwise, only the order of builtins change, and no tests regress.
# 2529a8df	04-Jan-2025	Chandler Carruth <chandlerc@gmail.com>	Mechanically port bulk of x86 builtins to TableGen (#120831) The goal is to make incremental (if small) progress towards fully TableGen'ed builtins, and to unblock #120534 by gaining access to more powerful TableGen-based representations. The bulk `.td` file addition was generated with the help of a very rough Python script. That script made no attempt to be robust or reusable, it specifically handled only the cases in the X86 `.def` file. Four entries from the `.def` file were not handled automatically as they used `BUILTIN` rather than `TARGET_BUILTIN`. These were ported by hand to an empty-feature `TargetBuiltin` entry, which seems like a better match. For all the automatically ported entries, the results were compared by sorting and diffing the `.def` file and the generated `.inc` file. The only differences were: - Different horizontal whitespace - Additional entries that had already been ported to the `.td` file. - More systematically using `Oi` instead of `LLi` for the type `long long int` in the fully general `__builtin_ia32_...` builtins for OpenCL support. The `.def` file was only partially moved to this it seems, and the systematic migration has updated a few missed builtins.
Revision tags: llvmorg-19.1.6
# ca79ff07	14-Dec-2024	Chandler Carruth <chandlerc@gmail.com>	Revert "Switch builtin strings to use string tables" (#119638) Reverts llvm/llvm-project#118734 There are currently some specific versions of MSVC that are miscompiling this code (we think). We don't know why as all the other build bots and at least some folks' local Windows builds work fine. This is a candidate revert to help the relevant folks catch their builders up and have time to debug the issue. However, the expectation is to roll forward at some point with a workaround if at all possible.
# be2df95e	09-Dec-2024	Chandler Carruth <chandlerc@gmail.com>	Switch builtin strings to use string tables (#118734) The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations.
The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme.
This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one:
```
    FILE SIZE        VM SIZE
 --------------  --------------
  +1.4%  +653Ki  +1.4%  +653Ki    .rodata
  +0.0%    +960  +0.0%    +960    .text
  +0.0%    +197  +0.0%    +197    .dynstr
  +0.0%    +184  +0.0%    +184    .eh_frame
  +0.0%     +96  +0.0%     +96    .dynsym
  +0.0%     +40  +0.0%     +40    .eh_frame_hdr
  +114%     +32  [ = ]       0    [Unmapped]
  +0.0%     +20  +0.0%     +20    .gnu.hash
  +0.0%      +8  +0.0%      +8    .gnu.version
  +0.9%      +7  +0.9%      +7    [LOAD #2 [R]]
  [ = ]       0 -75.4% -3.00Ki    .relro_padding
 -16.1%  -802Ki -16.1%  -802Ki    .data.rel.ro
 -27.3% -2.52Mi -27.3% -2.52Mi    .rela.dyn
  -1.6% -2.66Mi  -1.6% -2.66Mi    TOTAL
```
We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those relocations are stored.
This is also visible in my benchmarking of binary start-up overhead at least:
```
Benchmark 1: ./old_clang --version
  Time (mean ± σ):      17.6 ms ±   1.5 ms    [User: 4.1 ms, System: 13.3 ms]
  Range (min … max):    14.2 ms …  22.8 ms    162 runs

Benchmark 2: ./new_clang --version
  Time (mean ± σ):      15.5 ms ±   1.4 ms    [User: 3.6 ms, System: 11.8 ms]
  Range (min … max):    12.4 ms …  20.3 ms    216 runs

Summary
  './new_clang --version' ran
    1.13 ± 0.14 times faster than './old_clang --version'
```
We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler.
----
This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion.
I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into a set of consistent patterns that avoids a significant regression in overall boilerplate.
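The "single large table of strings and offsets" idea from be2df95e can be sketched as below. This is a minimal illustration, not the macro-based implementation from that PR: `NameTable`, `NameOffsets`, `nameAt`, and the three placeholder names are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Pointer-array layout, shown only for contrast: each element holds the
// address of a string literal, which per the commit message needs one
// dynamic relocation per entry when the binary is built as PIE.
//   static const char *const Names[] = {"alpha", "beta", "gamma"};

// String-table layout: every name lives in one character array, separated
// by NUL terminators, so no pointers to individual strings are stored.
static constexpr char NameTable[] = "alpha\0beta\0gamma";

// Offset of each name's first character within NameTable. Plain integers
// carry no addresses, so this array needs no relocations.
static constexpr unsigned NameOffsets[] = {0, 6, 11};

// Hypothetical accessor: recover a NUL-terminated name from its index.
const char *nameAt(std::size_t Index) {
  return NameTable + NameOffsets[Index];
}
```

For instance, `nameAt(1)` yields a pointer to `"beta"` inside the shared table. The cost is that offsets must be kept consistent with the table contents, which is why the PR derives them with `constexpr` code instead of writing them by hand as this sketch does.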
# ea6cdb9a	03-Dec-2024	Matthias Braun <matze@braunis.de>	allow prefer 256 bit attribute target (#117092) This allows `__attribute__((target("prefer-256-bit")))` / `__attribute__((target("no-prefer-256-bit")))` to create variants of a function with 256/512 bit vector sizes within the same application.
Revision tags: llvmorg-19.1.5
# b869f1bd	28-Nov-2024	Jie Fu <jiefu@tencent.com>	[clang] Remove unused lambda capture (NFC) /llvm-project/clang/lib/Basic/Targets/X86.cpp:1368:23: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] auto getPriority = [this](StringRef Feature) -> unsigned { ^~~~ 1 error generated.
# 88c2af80	28-Nov-2024	Alexandros Lamprineas <alexandros.lamprineas@arm.com>	[NFC][clang][FMV][TargetInfo] Refactor API for FMV feature priority. (#116257) Currently we have code with target hooks in CodeGenModule shared between X86 and AArch64 for sorting MultiVersionResolverOptions. Those are used when generating IFunc resolvers for FMV. The RISCV target has different criteria for sorting, therefore it repeats sorting after calling CodeGenFunction::EmitMultiVersionResolver. I am moving the FMV priority logic in TargetInfo, so that it can be implemented by the TargetParser which then makes it possible to query it from llvm. Here is an example why this is handy: https://github.com/llvm/llvm-project/pull/87939
Revision tags: llvmorg-19.1.4
# 97836bed	18-Nov-2024	Freddy Ye <freddy.ye@intel.com>	Reland "[X86] Support -march=diamondrapids (#113881)" (#116564) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
# 90e92239	18-Nov-2024	Freddy Ye <freddy.ye@intel.com>	Revert "[X86] Support -march=diamondrapids (#113881)" (#116563) This reverts commit 826b845c9e97448395431be3e4e5da585bd98c5e.
# 826b845c	18-Nov-2024	Freddy Ye <freddy.ye@intel.com>	[X86] Support -march=diamondrapids (#113881) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
# f77101ea	12-Nov-2024	Malay Sanghi <malay.sanghi@intel.com>	[X86][AMX] Support AMX-MOVRS (#115151) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
# eddb79d5	11-Nov-2024	Feng Zou <feng.zou@intel.com>	[X86][AMX] Support AMX-TF32 (#115625) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
# 8f440137	09-Nov-2024	Phoebe Wang <phoebe.wang@intel.com>	Reland "[X86][AMX] Support AMX-AVX512" (#115581) Resolve compile fail without SSE2.
# ff225154	09-Nov-2024	Alan Zhao <ayzhao@google.com>	Revert "[X86][AMX] Support AMX-AVX512" (#115570) Reverts llvm/llvm-project#114070 Reason: Causes `immintrin.h` to fail to compile if `-msse` and `-mno-sse2` are passed to clang: https://github.com/llvm/llvm-project/pull/114070#issuecomment-2465926700
# 58a17e1b	08-Nov-2024	Phoebe Wang <phoebe.wang@intel.com>	[X86][AMX] Support AMX-AVX512 (#114070)
# c72a751d	01-Nov-2024	Phoebe Wang <phoebe.wang@intel.com>	[X86][AMX] Support AMX-TRANSPOSE (#113532) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
# 81271624	31-Oct-2024	Feng Zou <feng.zou@intel.com>	[X86][AMX] Support AMX-FP8 (#113850) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
# 50826382	30-Oct-2024	Nikolas Klauser <nikolasklauser@berlin.de>	[Clang] Start moving X86Builtins.def to X86Builtins.td (#106005) This starts moving `X86Builtins.def` to be a tablegen file. It's quite large, so I think it'd be good to move things in multiple steps to avoid a bunch of merge conflicts due to the amount of time this takes to complete.
Revision tags: llvmorg-19.1.3
# 7bd8a165	28-Oct-2024	Craig Topper <craig.topper@sifive.com>	[X86] Don't allow '+f' as an inline asm constraint. (#113871) f cannot be used as an output constraint. We already errored for '=f' but not '+f'. Fixes #113692.
# c4248fa3	25-Oct-2024	Freddy Ye <freddy.ye@intel.com>	[X86] Support MOVRS and AVX10.2 instructions. (#113274) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0
# 02e4186d	13-Sep-2024	Ganesh <Ganesh.Gopalasubramanian@amd.com>	[X86] AMD Zen 5 Initial enablement (#107964) This patch adds the basic skeleton enablement for AMD's next-generation Zen 5 CPUs.
# 83ad644a	04-Sep-2024	Freddy Ye <freddy.ye@intel.com>	[X86][AVX10.2] Support AVX10.2-BF16 new instructions. (#101603) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
Revision tags: llvmorg-19.1.0-rc4
# 3f25f23a	20-Aug-2024	Phoebe Wang <phoebe.wang@intel.com>	[X86][AVX10] Fix unexpected error and warning when using intrinsic (#104781) E.g.: https://godbolt.org/z/G8zK5svjK Based on Evgenii's work.
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2
# 259ca9ee	03-Aug-2024	Phoebe Wang <phoebe.wang@intel.com>	Reland "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452)" (#101616) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965