===================
HLSL Function Calls
===================

.. contents::
   :local:

Introduction
============

This document describes the design and implementation of HLSL's function call
semantics in Clang. This includes details related to argument conversion and
parameter lifetimes.

This document does not seek to serve as official documentation for HLSL's
call semantics, but does provide an overview to assist a reader. The
authoritative documentation for HLSL's language semantics is the `draft language
specification <https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf>`_.

Argument Semantics
==================

In HLSL, all function arguments are passed by value in and out of functions.
HLSL has three keywords that denote the parameter semantics (``in``, ``out`` and
``inout``). In a function declaration a parameter may be annotated in any of the
following ways:

#. <no parameter annotation> - denotes input
#. ``in`` - denotes input
#. ``out`` - denotes output
#. ``in out`` - denotes input and output
#. ``out in`` - denotes input and output
#. ``inout`` - denotes input and output

Parameters that are exclusively input behave like C/C++ parameters that are
passed by value.

For parameters that are output (or input and output), a temporary value is
created in the caller. The temporary value is then passed by-address. For
output-only parameters, the temporary is uninitialized when passed (if the
parameter is not explicitly initialized inside the function an undefined value
is stored back to the argument expression). For parameters that are both input
and output, the temporary is initialized from the lvalue argument expression
through implicit or explicit casting from the lvalue argument type to the
parameter type.
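
The caller-side steps for an input-and-output parameter can be sketched in plain
C++. This is only an illustration under stated assumptions, not Clang's
lowering: ``Scale`` and ``CallScale`` are hypothetical names, the ``inout int``
parameter is modeled as a pointer to the caller's temporary, and the final store
models the write-back to the argument expression when the call returns.

.. code-block:: c++

  // Hypothetical callee modeling `void Scale(inout int X) { X *= 2; }`.
  // It only ever sees the caller's temporary, which is passed by address.
  void Scale(int *X) { *X *= 2; }

  // Hypothetical model of the caller side of `Scale(F)` for a float lvalue F.
  float CallScale(float F) {
    int Tmp = static_cast<int>(F); // initialize the temporary through a cast
    Scale(&Tmp);                   // pass the temporary by address
    F = static_cast<float>(Tmp);   // write back through the inverse cast
    return F;
  }

Under this model ``CallScale(1.5f)`` yields ``2.0f``: the temporary is
initialized to ``1``, doubled by the callee, and converted back to ``float``.
For an ``out`` parameter the model would differ only in leaving ``Tmp``
uninitialized before the call.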

On return of the function, the values of any parameter temporaries are written
back to the argument expression through an inverted conversion sequence (if an
``out`` parameter was not initialized in the function, the uninitialized value
may be written back).

Parameters of constant-sized array type are also passed with value semantics.
This requires array-typed input parameters to construct temporaries, and the
temporaries go through array-to-pointer decay when initializing parameters.

Implementations are allowed to avoid unnecessary temporaries, and HLSL's strict
no-alias rules can enable some trivial optimizations.

Array Temporaries
-----------------

Given the following example:

.. code-block:: c++

  void fn(float a[4]) {
    a[0] = a[1] + a[2] + a[3];
  }

  float4 main() : SV_Target {
    float arr[4] = {1, 1, 1, 1};
    fn(arr);
    return float4(arr[0], arr[1], arr[2], arr[3]);
  }

In C or C++, the array parameter decays to a pointer, so after the call to
``fn``, the value of ``arr[0]`` is ``3``. In HLSL, the array is passed by value,
so modifications inside ``fn`` do not propagate out and ``arr[0]`` remains
``1``.

.. note::

  DXC may pass unsized arrays directly as decayed pointers, which is an
  unfortunate behavior divergence.

Out Parameter Temporaries
-------------------------

.. code-block:: c++

  void Init(inout int X, inout int Y) {
    Y = 2;
    X = 1;
  }

  void main() {
    int V;
    Init(V, V); // MSVC (or clang-cl) V == 2, Clang V == 1
  }

In the above example the ``Init`` function's behavior depends on the C++
implementation. C++ does not define the order in which parameters are
initialized or destroyed. In MSVC and Clang's MSVC compatibility mode, arguments
are emitted right-to-left and destroyed left-to-right. This means that the
parameter initialization and destruction occur in the order: {``Y``, ``X``,
``~X``, ``~Y``}.
This causes the write-back of the value of ``Y`` to occur last,
so the resulting value of ``V`` is ``2``. In the Itanium C++ ABI, the parameter
ordering is reversed, so the initialization and destruction occur in the order:
{``X``, ``Y``, ``~Y``, ``~X``}. This causes the write-back of the value of ``X``
to occur last, resulting in the value of ``V`` being set to ``1``.

.. code-block:: c++

  void Trunc(inout int3 V) { }

  void main() {
    float3 F = {1.5, 2.6, 3.3};
    Trunc(F); // F == {1.0, 2.0, 3.0}
  }

In the above example, the argument expression ``F`` undergoes element-wise
conversion from a float vector to an integer vector to create a temporary
``int3``. On expiration the temporary undergoes element-wise conversion back to
the floating point vector type ``float3``. This results in an implicit
element-wise conversion of the vector even if the value is unused in the
function (effectively truncating the floating point values).

.. code-block:: c++

  void UB(out int X) {}

  void main() {
    int X = 7;
    UB(X); // X is undefined!
  }

In this example an initialized value is passed to an ``out`` parameter.
Parameters marked ``out`` are not initialized by the argument expression or
implicitly by the function. They must be explicitly initialized. In this case
the argument is not initialized in the function so the temporary is still
uninitialized when it is copied back to the argument expression. This is
undefined behavior in HLSL, and any use of the argument after the call is a use
of an undefined value which may be illegal in the target (DXIL programs with
used or potentially used ``undef`` or ``poison`` values fail validation).

Clang Implementation
====================

.. note::

  The implementation described here is a proposal. It has not yet been fully
  implemented, so the current state of Clang's sources may not reflect this
  design. A prototype implementation was built on DXC, which is based on
  Clang-3.7. The prototype can be found
  `here <https://github.com/microsoft/DirectXShaderCompiler/pull/5249>`_. Many
  of the changes in the prototype restore previously modified Clang-3.7 code to
  its original state.

The implementation in Clang adds a new non-decaying array type, a new AST node
to represent output parameters, and minor extensions to Clang's existing support
for Objective-C write-back arguments. The goal of this design is to capture the
semantic details of HLSL function calls in the AST, and to minimize the amount
of magic that needs to occur during IR generation.

Array Temporaries
-----------------

The new ``ArrayParameterType`` is a sub-class of ``ConstantArrayType``,
inheriting all the behaviors and methods of the parent except that it does not
decay to a pointer during overload resolution or template type deduction.

An argument of ``ConstantArrayType`` can be implicitly converted to an
equivalent non-decayed ``ArrayParameterType`` if the underlying canonical
``ConstantArrayType`` is the same. This occurs during overload resolution
instead of array-to-pointer decay.

.. code-block:: c++

  void SizedArray(float a[4]);
  void UnsizedArray(float a[]);

  void main() {
    float arr[4] = {1, 1, 1, 1};
    SizedArray(arr);
    UnsizedArray(arr);
  }

In the example above, the following AST is generated for the call to
``SizedArray``:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(float [4])' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (float [4])' lvalue Function 'SizedArray' 'void (float [4])'
  `-ImplicitCastExpr 'float [4]' <HLSLArrayRValue>
    `-DeclRefExpr 'float [4]' lvalue Var 'arr' 'float [4]'

For the call to ``UnsizedArray`` in the same example, the following AST is
generated:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(float [])' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (float [])' lvalue Function 'UnsizedArray' 'void (float [])'
  `-ImplicitCastExpr 'float [4]' <HLSLArrayRValue>
    `-DeclRefExpr 'float [4]' lvalue Var 'arr' 'float [4]'

In both of these cases the argument expression is of known array size so we can
initialize an appropriately sized temporary.

It is illegal in HLSL to convert an unsized array to a sized array:

.. code-block:: c++

  void SizedArray(float a[4]);
  void UnsizedArray(float a[]) {
    SizedArray(a); // Cannot convert float[] to float[4]
  }

When converting a sized array to an unsized array, an array temporary can also
be inserted. Given the following code:

.. code-block:: c++

  void UnsizedArray(float a[]);
  void SizedArray(float a[4]) {
    UnsizedArray(a);
  }

An expected AST should be something like:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(float [])' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (float [])' lvalue Function 'UnsizedArray' 'void (float [])'
  `-ImplicitCastExpr 'float [4]' <HLSLArrayRValue>
    `-DeclRefExpr 'float [4]' lvalue ParmVar 'a' 'float [4]'

Out Parameter Temporaries
-------------------------

Output parameters are defined in HLSL as *casting expiring values* (cx-values),
which is a term made up for HLSL.
A cx-value is a temporary value which may be
the result of a cast, and stores its value back to an lvalue when the value
expires.

To represent this concept in Clang we introduce a new ``HLSLOutParamExpr``. An
``HLSLOutParamExpr`` has two forms, one with a single sub-expression and one
with two sub-expressions.

The single sub-expression form is used when the argument expression and the
function parameter are the same type, so no cast is required, as in this
example:

.. code-block:: c++

  void Init(inout int X) {
    X = 1;
  }

  void main() {
    int V;
    Init(V);
  }

The expected AST formulation for this code would be something like:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(int &)' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (int &)' lvalue Function 'Init' 'void (int &)'
  `-HLSLOutParamExpr 'int' lvalue inout
    `-DeclRefExpr 'int' lvalue Var 'V' 'int'

The ``HLSLOutParamExpr`` captures whether the value is ``inout`` or ``out`` to
denote whether or not the temporary is initialized from the sub-expression. If
no casting is required the sub-expression denotes the lvalue expression that the
cx-value will be copied to when the value expires.

The two sub-expression form of the AST node is required when the argument type
is not the same as the parameter type. Given this example:

.. code-block:: c++

  void Trunc(inout int3 V) { }

  void main() {
    float3 F = {1.5, 2.6, 3.3};
    Trunc(F);
  }

For this case the ``HLSLOutParamExpr`` will have sub-expressions to record both
casting expression sequences for the initialization and write-back:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(int3 &)' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (int3 &)' lvalue Function 'Trunc' 'void (int3 &)'
  `-HLSLOutParamExpr 'int3' lvalue inout
    |-ImplicitCastExpr 'float3' <IntegralToFloating>
    | `-ImplicitCastExpr 'int3' <LValueToRValue>
    |   `-OpaqueValueExpr 'int3' lvalue
    `-ImplicitCastExpr 'int3' <FloatingToIntegral>
      `-ImplicitCastExpr 'float3' <LValueToRValue>
        `-DeclRefExpr 'float3' lvalue Var 'F' 'float3'

In this formulation the write-back casts are captured as the first
sub-expression and they cast from an ``OpaqueValueExpr``. In IR generation we
can use the ``OpaqueValueExpr`` as a placeholder for the ``HLSLOutParamExpr``'s
temporary value on function return.

In code generation this can be implemented with some targeted extensions to the
Objective-C write-back support. Specifically, this means extending CGCall.cpp's
``EmitWriteback`` function to support casting expressions and emission of
aggregate lvalues.
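
The cx-value behavior that ``HLSLOutParamExpr`` records can be modeled outside
the compiler with a small C++ sketch. This is not how Clang implements the node.
It is an illustration in which every name is hypothetical: ``float3`` and
``int3`` are ``std::array`` stand-ins for the HLSL vector types, and
``InOutInt3`` is an invented helper whose constructor plays the role of the
initialization casts and whose destructor plays the role of the write-back
casts.

.. code-block:: c++

  #include <array>
  #include <cstddef>

  // Stand-ins for the HLSL vector types (assumption: not real HLSL types).
  using float3 = std::array<float, 3>;
  using int3 = std::array<int, 3>;

  // Hypothetical model of an `inout` cx-value of type int3 bound to a float3.
  class InOutInt3 {
  public:
    // Initialization: element-wise cast from the argument lvalue.
    explicit InOutInt3(float3 &Arg) : Dest(Arg) {
      for (std::size_t I = 0; I < Temp.size(); ++I)
        Temp[I] = static_cast<int>(Dest[I]);
    }
    // Expiration: element-wise cast back into the argument lvalue.
    ~InOutInt3() {
      for (std::size_t I = 0; I < Temp.size(); ++I)
        Dest[I] = static_cast<float>(Temp[I]);
    }
    InOutInt3(const InOutInt3 &) = delete;
    InOutInt3 &operator=(const InOutInt3 &) = delete;

    // The callee receives the temporary by address.
    int3 &get() { return Temp; }

  private:
    float3 &Dest; // lvalue that receives the write-back
    int3 Temp{};  // the parameter temporary
  };

  // Models `void Trunc(inout int3 V) { }`.
  void Trunc(int3 &) {}

  // Models the caller: the temporary expires at the end of the call
  // expression, which triggers the write-back before F is returned.
  float3 CallTrunc(float3 F) {
    Trunc(InOutInt3(F).get());
    return F;
  }

Calling ``CallTrunc`` with ``{1.5f, 2.6f, 3.3f}`` returns ``{1.0f, 2.0f, 3.0f}``
in this model even though ``Trunc`` never touches its parameter, which matches
the truncation described for the ``Trunc`` example. A model of an ``out``
parameter would skip the initialization loop in the constructor.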