|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4 |
|
| #
be36812f |
| 21-Feb-2024 |
David Majnemer <david.majnemer@gmail.com> |
[TargetLowering] Be more efficient in fp -> bf16 NaN conversions
We can avoid masking completely as it is OK (and probably preferable) to bring over some of the existing NaN payload.
|
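For illustration, a minimal C++ sketch of the idea (not the LLVM lowering code; the helper name is made up): when the float input is NaN, the bf16 result can simply keep the top 16 bits, which already carry the sign, the exponent, and the high payload bits, and only needs the quiet bit forced on rather than having the payload masked away.

    #include <cmath>
    #include <cstdint>
    #include <cstring>

    // Hypothetical helper, not LLVM's code: handle the NaN case of
    // float -> bf16 by truncating to the top 16 bits and setting the
    // quiet bit, carrying the high payload bits over unmasked.
    static uint16_t FloatNaNToBF16(float F) {
      uint32_t Bits;
      std::memcpy(&Bits, &F, sizeof(Bits));
      uint16_t Hi = static_cast<uint16_t>(Bits >> 16);
      if (std::isnan(F))
        return Hi | 0x0040; // bf16 quiet bit; payload bits above it survive
      return Hi;            // non-NaN inputs need real rounding (see #82399 below)
    }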
| #
9eff001d |
| 21-Feb-2024 |
David Majnemer <david.majnemer@gmail.com> |
[TargetLowering] Correctly yield NaN from FP_TO_BF16
We didn't set the exponent field, resulting in tiny numbers instead of NaNs.
|
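As a concrete illustration (bit patterns only, not the LLVM code): a bf16 NaN needs the 8-bit exponent field set to all ones; a constant with only mantissa bits set encodes a subnormal, i.e. a tiny number rather than a NaN.

    #include <cstdint>

    // bf16 layout: 1 sign bit, 8 exponent bits, 7 mantissa bits.
    constexpr uint16_t BF16ExpMask  = 0x7F80; // exponent field, all ones
    constexpr uint16_t BF16QuietBit = 0x0040; // top mantissa bit

    constexpr uint16_t MissingExp = BF16QuietBit;               // 0x0040: subnormal, ~5.9e-39
    constexpr uint16_t QuietNaN   = BF16ExpMask | BF16QuietBit; // 0x7FC0: an actual quiet NaN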
| #
cc13f3ba |
| 21-Feb-2024 |
David Majnemer <david.majnemer@gmail.com> |
Correctly round FP -> BF16 when SDAG expands such nodes (#82399)
We did something pretty naive:
- round FP64 -> BF16 by first rounding to FP32
- skip FP32 -> BF16 rounding entirely
- take the top 16 bits of an FP32, which will turn some NaNs into infinities
Let's do this in a more principled way by rounding types with more precision than FP32 to FP32 using round-inexact-to-odd, which will negate double rounding issues.
|
|
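A hedged sketch of the round-inexact-to-odd idea in plain C++ (not the SelectionDAG expansion; it assumes the compiler honors the dynamic rounding mode via <cfenv>): truncate the f64 toward zero to f32 and, if that conversion was inexact, force the mantissa LSB to 1. The odd intermediate can never land exactly on an f32 -> bf16 tie unless the original value was one, so the final round-to-nearest step is not skewed by double rounding.

    #include <cfenv>
    #include <cstdint>
    #include <cstring>

    // Sketch only: round-to-odd f64 -> f32. Truncate toward zero, then OR the
    // "inexact" sticky information into the mantissa LSB. NaN/Inf pass through
    // unchanged because their conversions do not raise FE_INEXACT.
    static float RoundToOddF64ToF32(double D) {
      const int OldMode = std::fegetround();
      std::fesetround(FE_TOWARDZERO);
      std::feclearexcept(FE_INEXACT);
      float F = static_cast<float>(D);
      const bool Inexact = std::fetestexcept(FE_INEXACT) != 0;
      std::fesetround(OldMode);
      if (Inexact) {
        uint32_t Bits;
        std::memcpy(&Bits, &F, sizeof(Bits));
        Bits |= 1; // make the intermediate mantissa odd
        std::memcpy(&F, &Bits, sizeof(Bits));
      }
      return F;
    }
    // A single ordinary round-to-nearest f32 -> bf16 step then yields the
    // correctly rounded f64 -> bf16 result.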
Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
| #
460ffcdd |
| 04-Jan-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Make bf16/v2bf16 legal types (#76215)
Some intrinsics are using i16 vectors in place of bfloat vectors.
Move towards making bf16 vectors legal so these can migrate. Leave the larger vectors for a later change.
Depends on #76213 and #76214.
|
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5 |
|
| #
6b695846 |
| 07-Nov-2023 |
Amara Emerson <amara@apple.com> |
[GlobalISel] Fall back for bf16 conversions. (#71470)
We don't support these correctly since we don't yet have FP types.
AMDGPU tests were silently miscompiling bf16 as if they were fp16.
|
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4 |
|
| #
2f5a116c |
| 07-May-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Expand casted f16 fmed3 pattern to fmin/fmax on gfx8
If we have legal f16 instructions but no f16 med3, we can save one instruction by expanding out the min/max sequence compared to casting to f32 and casting back.
|
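For reference, the median-of-3 identity such an expansion relies on, written as a plain C++ sketch rather than the actual DAG combine: with legal f16 min/max, the median can be formed from four min/max operations, so no conversions to f32 and back are needed. The NaN and signed-zero semantics of the hardware min/max are ignored here.

    #include <algorithm>

    // Median of three values via min/max only:
    //   med3(x, y, z) == max(min(x, y), min(max(x, y), z))
    template <typename T>
    static T Med3(T X, T Y, T Z) {
      return std::max(std::min(X, Y), std::min(std::max(X, Y), Z));
    }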
| #
79707ba0 |
| 07-May-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add baseline test for gfx8 fptrunc combine
|