===================================
Expected Differences vs DXC and FXC
===================================

.. contents::
   :local:

Introduction
============

HLSL currently has two reference compilers, the `DirectX Shader Compiler (DXC)
<https://github.com/microsoft/DirectXShaderCompiler/>`_ and the
`Effect-Compiler (FXC) <https://learn.microsoft.com/en-us/windows/win32/direct3dtools/fxc>`_.
The two reference compilers do not fully agree. Some known disagreements in the
references are tracked on
`DXC's GitHub
<https://github.com/microsoft/DirectXShaderCompiler/issues?q=is%3Aopen+is%3Aissue+label%3Afxc-disagrees>`_,
but many more are known to exist.

HLSL as implemented by Clang will also not fully match either of the reference
implementations; it is instead being written to match the `draft language
specification <https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf>`_.

This document is a non-exhaustive collection of the known differences between
Clang's implementation of HLSL and the existing reference compilers.

General Principles
------------------

Most of the intended differences between Clang and the earlier reference
compilers are focused on increased consistency and correctness. Neither
reference compiler applies language rules consistently across all contexts.

Clang also deviates from the reference compilers by providing different
diagnostics, both in terms of the textual messages and the contexts in which
diagnostics are produced. While striving for a high level of source
compatibility with conforming HLSL code, Clang may produce earlier and more
robust diagnostics for incorrect code or reject code that a reference compiler
incorrectly accepted.

Language Version
================

Clang targets language compatibility for HLSL 2021 as implemented by DXC.
Language features that were removed in earlier versions of HLSL may be added on
a case-by-case basis, but are not planned for the initial implementation.

Overload Resolution
===================

Clang's HLSL implementation adopts C++ overload resolution rules as proposed for
HLSL 202x based on proposals
`0007 <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0007-const-instance-methods.md>`_
and
`0008 <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0008-non-member-operator-overloading.md>`_.

The largest difference between Clang and DXC's overload resolution is the
algorithm used for identifying best-match overloads. There are more details
about the algorithmic differences in the :ref:`multi_argument_overloads` section
below. There are three high-level differences that should be highlighted:

* **There should be no cases** where DXC and Clang both successfully
  resolve an overload where the resolved overload is different between the two.
* There are cases where Clang will successfully resolve an overload that DXC
  wouldn't because we've trimmed the overload set in Clang to remove ambiguity.
* There are cases where DXC will successfully resolve an overload that Clang
  will not for two reasons: (1) DXC only generates partial overload sets for
  builtin functions and (2) DXC resolves cases that probably should be ambiguous.

Clang's implementation extends standard overload resolution rules to HLSL
library functionality. This causes subtle changes in overload resolution
behavior between Clang and DXC. Some examples include:

.. code-block:: c++

  void halfOrInt16(half H);
  void halfOrInt16(uint16_t U);
  void halfOrInt16(int16_t I);

  void takesDoubles(double, double, double);

  cbuffer CB {
    bool B;
    uint U;
    int I;
    float X, Y, Z;
    double3 R, G;
  }

  void takesSingleDouble(double);
  void takesSingleDouble(vector<double, 1>);

  void scalarOrVector(double);
  void scalarOrVector(vector<double, 2>);

  export void call() {
    half H;
    halfOrInt16(I); // All: Resolves to halfOrInt16(int16_t).

  #ifndef IGNORE_ERRORS
    halfOrInt16(U); // All: Fails with call ambiguous between int16_t and uint16_t
                    // overloads

    // asfloat16 is a builtin with overloads for half, int16_t, and uint16_t.
    H = asfloat16(I); // DXC: Fails to resolve overload for int.
                      // Clang: Resolves to asfloat16(int16_t).
    H = asfloat16(U); // DXC: Fails to resolve overload for int.
                      // Clang: Resolves to asfloat16(uint16_t).
  #endif
    H = asfloat16(0x01); // DXC: Resolves to asfloat16(half).
                         // Clang: Resolves to asfloat16(uint16_t).

    takesDoubles(X, Y, Z); // Works on all compilers
  #ifndef IGNORE_ERRORS
    fma(X, Y, Z); // DXC: Fails to resolve no known conversion from float to
                  //   double.
                  // Clang: Resolves to fma(double,double,double).

    double D = dot(R, G); // DXC: Resolves to dot(double3, double3), fails DXIL Validation.
                          // FXC: Expands to compute double dot product with fmul/fadd
                          // Clang: Fails to resolve as ambiguous against
                          //   dot(half, half) or dot(float, float)
  #endif

  #ifndef IGNORE_ERRORS
    tan(B); // DXC: resolves to tan(float).
            // Clang: Fails to resolve, ambiguous between integer types.

  #endif

    double D;
    takesSingleDouble(D); // All: Fails to resolve ambiguous conversions.
    takesSingleDouble(R); // All: Fails to resolve ambiguous conversions.

    scalarOrVector(D); // All: Resolves to scalarOrVector(double).
    scalarOrVector(R); // All: Fails to resolve ambiguous conversions.
  }

.. note::

  In Clang, a conscious decision was made to exclude the ``dot(vector<double,N>, vector<double,N>)``
  overload and allow overload resolution to resolve the
  ``vector<float,N>`` overload. This approach provides a ``-Wconversion``
  diagnostic notifying the user of the conversion rather than silently altering
  precision relative to the other overloads (as FXC does) or generating code
  that will fail validation (as DXC does).

.. _multi_argument_overloads:

Multi-Argument Overloads
------------------------

In addition to the differences in single-element conversions, Clang and DXC
differ dramatically in multi-argument overload resolution. C++ multi-argument
overload resolution behavior (or something very similar) is required to
implement
`non-member operator overloading <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0008-non-member-operator-overloading.md>`_.

Clang adopts the C++-inspired language from the
`draft HLSL specification <https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf>`_,
where an overload ``f1`` is a better candidate than ``f2`` if, for all
arguments, the conversion sequence for ``f1`` is not worse than the
corresponding conversion sequence for ``f2``, and for at least one argument it
is better.

.. code-block:: c++

  cbuffer CB {
    int I;
    float X;
    float4 V;
  }

  void twoParams(int, int);
  void twoParams(float, float);
  void threeParams(float, float, float);
  void threeParams(float4, float4, float4);

  export void call() {
    twoParams(I, X); // DXC: resolves twoParams(int, int).
                     // Clang: Fails to resolve ambiguous conversions.

    threeParams(X, V, V); // DXC: resolves threeParams(float4, float4, float4).
                          // Clang: Fails to resolve ambiguous conversions.
  }

In the examples above, calling ``twoParams`` with mixed argument types produces
the implicit conversion sequences { ExactMatch, FloatingIntegral } and
{ FloatingIntegral, ExactMatch }. Each sequence has an argument whose conversion
is worse than in the other sequence, so the call is ambiguous.

In the ``threeParams`` example the sequences are { ExactMatch, VectorTruncation,
VectorTruncation } and { VectorSplat, ExactMatch, ExactMatch }. Again, each
sequence has at least one argument with a worse conversion than in the other
sequence, so the call is ambiguous.

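The comparison rule described above can be sketched in plain C++. This is a
minimal illustration and not Clang's implementation: the ``Conv`` enum and all
function names are hypothetical, and the sketch only distinguishes an exact
match from any other conversion, leaving the relative ranking of the non-exact
conversions unmodeled.

.. code-block:: c++

  #include <cstddef>
  #include <vector>

  // Conversion kinds named in the text above (hypothetical model).
  enum class Conv { ExactMatch, FloatingIntegral, VectorSplat, VectorTruncation };

  // One conversion per call argument.
  using Sequence = std::vector<Conv>;

  // True when conversion A is strictly better than conversion B. Assumption:
  // only "exact beats non-exact" is modeled in this sketch.
  bool isBetterConv(Conv A, Conv B) {
    return A == Conv::ExactMatch && B != Conv::ExactMatch;
  }

  // F1 is a better candidate than F2 if no argument converts worse and at
  // least one argument converts better. Both sequences must be the same length.
  bool isBetterCandidate(const Sequence &F1, const Sequence &F2) {
    bool AnyBetter = false;
    for (std::size_t I = 0; I < F1.size(); ++I) {
      if (isBetterConv(F2[I], F1[I]))
        return false; // F1 is worse for this argument.
      if (isBetterConv(F1[I], F2[I]))
        AnyBetter = true;
    }
    return AnyBetter;
  }

  // The call is ambiguous when neither candidate is better than the other.
  bool isAmbiguous(const Sequence &F1, const Sequence &F2) {
    return !isBetterCandidate(F1, F2) && !isBetterCandidate(F2, F1);
  }

  // twoParams(I, X) against the (int, int) and (float, float) candidates.
  bool twoParamsIsAmbiguous() {
    Sequence IntInt = {Conv::ExactMatch, Conv::FloatingIntegral};
    Sequence FloatFloat = {Conv::FloatingIntegral, Conv::ExactMatch};
    return isAmbiguous(IntInt, FloatFloat);
  }

  // threeParams(X, V, V) against the scalar and float4 candidates.
  bool threeParamsIsAmbiguous() {
    Sequence Scalars = {Conv::ExactMatch, Conv::VectorTruncation,
                        Conv::VectorTruncation};
    Sequence Vectors = {Conv::VectorSplat, Conv::ExactMatch, Conv::ExactMatch};
    return isAmbiguous(Scalars, Vectors);
  }

Under this model both ``twoParamsIsAmbiguous()`` and
``threeParamsIsAmbiguous()`` evaluate to ``true``, which lines up with the
ambiguity described for Clang in the two examples above.
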
.. note::

  DXC's behavior described below is not officially documented, so this
  description is gleaned from observation and a bit of reading the source.

DXC's approach for determining the best overload produces an integer score value
for each implicit conversion sequence for each argument expression. Scores for
casts are based on a bitmask construction that is complicated to reverse
engineer. It seems that:

* Exact match is 0
* Dimension increase is 1
* Promotion is 2
* Integral -> Float conversion is 4
* Float -> Integral conversion is 8
* Cast is 16

The masks are OR'd together to produce a score for the cast.

The scores of each conversion sequence are then summed to generate a score for
the overload candidate. The overload candidate with the lowest score is the best
candidate. If more than one overload ties for the lowest score, the call is
ambiguous.

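The scoring arithmetic described above can be sketched in C++ as follows. This
only illustrates the observed model, assuming the flag values listed above. All
names are hypothetical and none of this is taken from DXC's source.

.. code-block:: c++

  #include <cstddef>
  #include <vector>

  // Observed per-conversion flag values from the list above (assumed).
  enum ConvFlag : unsigned {
    ExactMatch = 0,
    DimensionIncrease = 1,
    Promotion = 2,
    IntToFloat = 4,
    FloatToInt = 8,
    Cast = 16,
  };

  // Each element is the OR'd flag mask for one argument's conversion.
  using ArgMasks = std::vector<unsigned>;

  // Sum the per-argument masks to score one overload candidate.
  unsigned scoreCandidate(const ArgMasks &Masks) {
    unsigned Score = 0;
    for (unsigned Mask : Masks)
      Score += Mask;
    return Score;
  }

  // Returns the index of the lowest-scoring candidate, or -1 when the set is
  // empty or several candidates tie for the lowest score (ambiguous call).
  int pickBest(const std::vector<ArgMasks> &Candidates) {
    int Best = -1;
    unsigned BestScore = 0;
    bool Tied = false;
    for (std::size_t I = 0; I < Candidates.size(); ++I) {
      unsigned Score = scoreCandidate(Candidates[I]);
      if (Best < 0 || Score < BestScore) {
        Best = static_cast<int>(I);
        BestScore = Score;
        Tied = false;
      } else if (Score == BestScore) {
        Tied = true;
      }
    }
    return Tied ? -1 : Best;
  }

For example, a hypothetical candidate whose arguments need { Promotion,
ExactMatch } scores 2 and is preferred over one needing { ExactMatch,
``IntToFloat | DimensionIncrease`` }, which scores 5. Two candidates that both
sum to 2 tie for the lowest score, so ``pickBest`` reports the call as
ambiguous.
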