=====================================
Coroutines in LLVM
=====================================

.. contents::
   :local:
   :depth: 3

.. warning::
  Compatibility across LLVM releases is not guaranteed.

Introduction
============

.. _coroutine handle:

LLVM coroutines are functions that have one or more `suspend points`_.
When a suspend point is reached, the execution of a coroutine is suspended and
control is returned back to its caller. A suspended coroutine can be resumed
to continue execution from the last suspend point or it can be destroyed.

In the following example, we call function `f` (which may or may not be a
coroutine itself) that returns a handle to a suspended coroutine
(**coroutine handle**) that is used by `main` to resume the coroutine twice and
then destroy it:

.. code-block:: llvm

  define i32 @main() {
  entry:
    %hdl = call ptr @f(i32 4)
    call void @llvm.coro.resume(ptr %hdl)
    call void @llvm.coro.resume(ptr %hdl)
    call void @llvm.coro.destroy(ptr %hdl)
    ret i32 0
  }

.. _coroutine frame:

In addition to the function stack frame which exists when a coroutine is
executing, there is an additional region of storage that contains objects that
keep the coroutine state when a coroutine is suspended. This region of storage
is called the **coroutine frame**. It is created when a coroutine is called
and destroyed when a coroutine either runs to completion or is destroyed
while suspended.

LLVM currently supports two styles of coroutine lowering. These styles
support substantially different sets of features, have substantially
different ABIs, and expect substantially different patterns of frontend
code generation. However, the styles also have a great deal in common.

In all cases, an LLVM coroutine is initially represented as an ordinary LLVM
function that has calls to `coroutine intrinsics`_ defining the structure of
the coroutine.
The coroutine function is then, in the most general case,
rewritten by the coroutine lowering passes to become the "ramp function",
the initial entrypoint of the coroutine, which executes until a suspend point
is first reached. The remainder of the original coroutine function is split
out into some number of "resume functions". Any state which must persist
across suspensions is stored in the coroutine frame. The resume functions
must somehow be able to handle either a "normal" resumption, which continues
the normal execution of the coroutine, or an "abnormal" resumption, which
must unwind the coroutine without attempting to suspend it.

Switched-Resume Lowering
------------------------

In LLVM's standard switched-resume lowering, signaled by the use of
`llvm.coro.id`, the coroutine frame is stored as part of a "coroutine
object" which represents a handle to a particular invocation of the
coroutine. All coroutine objects support a common ABI allowing certain
features to be used without knowing anything about the coroutine's
implementation:

- A coroutine object can be queried to see if it has reached completion
  with `llvm.coro.done`.

- A coroutine object that has not already reached completion can be
  resumed normally with `llvm.coro.resume`.

- A coroutine object can be destroyed, invalidating the coroutine object,
  with `llvm.coro.destroy`. This must be done separately even if the
  coroutine has reached completion normally.

- "Promise" storage, which is known to have a certain size and alignment,
  can be projected out of the coroutine object with `llvm.coro.promise`.
  The coroutine implementation must have been compiled to define a promise
  of the same size and alignment.

In general, interacting with a coroutine object in any of these ways while
it is running has undefined behavior.
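
The operations listed above can be modeled in ordinary C++. The following is
only a sketch under stated assumptions, not LLVM's implementation: it assumes
an object that begins with a resume pointer and a destroy pointer and that
encodes completion as a null resume pointer. Every name in it (`Header`,
`CounterFrame`, `counter`, `coro_done`, `coro_resume`, `coro_destroy`,
`counter_promise`, `collect`) is hypothetical and chosen for illustration.

.. code-block:: c++

  #include <cassert>
  #include <vector>

  // Hypothetical model of a switched-resume coroutine object. All names here
  // are illustrative; none of them are LLVM APIs.
  struct Header {
    void (*resume)(Header *);   // null once the coroutine has completed
    void (*destroy)(Header *);
  };

  // Frame of a coroutine that produces start, start + 1, ..., limit - 1.
  struct CounterFrame {
    Header header;  // first member, so a Header* doubles as the handle
    int promise;    // promise storage: the most recently produced value
    int next;       // state that must persist across suspensions
    int limit;
  };

  int live_frames = 0;  // frames created and not yet destroyed

  static void counter_resume(Header *h) {
    auto *f = reinterpret_cast<CounterFrame *>(h);
    if (f->next >= f->limit) {
      h->resume = nullptr;  // mark the coroutine as completed
      return;
    }
    f->promise = f->next++;
  }

  static void counter_destroy(Header *h) {
    delete reinterpret_cast<CounterFrame *>(h);
    --live_frames;
  }

  // Ramp function: creates the frame, runs to the first suspend point, and
  // returns the handle. Assumes start < limit.
  Header *counter(int start, int limit) {
    auto *f = new CounterFrame{{counter_resume, counter_destroy},
                               start, start + 1, limit};
    ++live_frames;
    return &f->header;
  }

  // Operations that need only the handle, not the coroutine's implementation.
  bool coro_done(Header *h) { return h->resume == nullptr; }
  void coro_resume(Header *h) {
    assert(!coro_done(h) && "resuming a completed coroutine is undefined");
    h->resume(h);
  }
  void coro_destroy(Header *h) { h->destroy(h); }

  // Promise projection; typed here for simplicity, whereas the intrinsic
  // works from the promise alignment alone.
  int *counter_promise(Header *h) {
    return &reinterpret_cast<CounterFrame *>(h)->promise;
  }

  // Drives a coroutine to completion, then destroys it separately.
  std::vector<int> collect(int start, int limit) {
    std::vector<int> out;
    Header *h = counter(start, limit);
    while (!coro_done(h)) {
      out.push_back(*counter_promise(h));
      coro_resume(h);
    }
    coro_destroy(h);
    return out;
  }

In this model, `collect` plays the role of a caller that only ever touches the
handle: it queries completion, reads the promise, resumes, and finally
destroys the object even though the coroutine has already completed.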

The coroutine function is split into three functions, representing three
different ways that control can enter the coroutine:

1. the ramp function that is initially invoked, which takes arbitrary
   arguments and returns a pointer to the coroutine object;

2. a coroutine resume function that is invoked when the coroutine is resumed,
   which takes a pointer to the coroutine object and returns `void`;

3. a coroutine destroy function that is invoked when the coroutine is
   destroyed, which takes a pointer to the coroutine object and returns
   `void`.

Because the resume and destroy functions are shared across all suspend
points, suspend points must store the index of the active suspend in
the coroutine object, and the resume/destroy functions must switch over
that index to get back to the correct point. Hence the name of this
lowering.

Pointers to the resume and destroy functions are stored in the coroutine
object at known offsets which are fixed for all coroutines. A completed
coroutine is represented with a null resume function.

There is a somewhat complex protocol of intrinsics for allocating and
deallocating the coroutine object. It is complex in order to allow the
allocation to be elided due to inlining. This protocol is discussed
in further detail below.

The frontend may generate code to call the coroutine function directly;
this will become a call to the ramp function and will return a pointer
to the coroutine object. The frontend should always resume or destroy
the coroutine using the corresponding intrinsics.

Returned-Continuation Lowering
------------------------------

In returned-continuation lowering, signaled by the use of
`llvm.coro.id.retcon` or `llvm.coro.id.retcon.once`, some aspects of
the ABI must be handled more explicitly by the frontend.

In this lowering, every suspend point takes a list of "yielded values"
which are returned back to the caller along with a function pointer,
called the continuation function. The coroutine is resumed by simply
calling this continuation function pointer. The original coroutine
is divided into the ramp function and then an arbitrary number of
these continuation functions, one for each suspend point.

LLVM actually supports two closely-related returned-continuation
lowerings:

- In normal returned-continuation lowering, the coroutine may suspend
  itself multiple times. This means that a continuation function
  itself returns another continuation pointer, as well as a list of
  yielded values.

  The coroutine indicates that it has run to completion by returning
  a null continuation pointer. Any yielded values will be `undef` and
  should be ignored.

- In yield-once returned-continuation lowering, the coroutine must
  suspend itself exactly once (or throw an exception). The ramp
  function returns a continuation function pointer and yielded
  values; the continuation function may optionally return ordinary
  results when the coroutine has run to completion.

The coroutine frame is maintained in a fixed-size buffer that is
passed to the `coro.id` intrinsic, which guarantees a certain size
and alignment statically. The same buffer must be passed to the
continuation function(s). The coroutine will allocate memory if the
buffer is insufficient, in which case it will need to store at
least that pointer in the buffer; therefore the buffer must always
be at least pointer-sized. How the coroutine uses the buffer may
vary between suspend points.

In addition to the buffer pointer, continuation functions take an
argument indicating whether the coroutine is being resumed normally
(zero) or abnormally (non-zero).
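
The calling protocol of the normal (multi-suspend) variant can be sketched in
plain C++. This is a hedged illustration, not the code LLVM generates: the
types `Buffer`, `Step`, and `State`, and the functions `range_ramp`,
`range_continue`, and `run`, are hypothetical names. The sketch assumes a
pointer-sized buffer and, for simplicity, always takes the "buffer is
insufficient" path by allocating its state on the heap and storing that
pointer in the buffer.

.. code-block:: c++

  #include <cassert>
  #include <cstring>
  #include <vector>

  // Caller-provided storage; it must be at least pointer-sized.
  struct Buffer { alignas(void *) unsigned char bytes[sizeof(void *)]; };

  struct Step;
  using Continuation = Step (*)(Buffer *buf, bool abnormal);
  struct Step {
    Continuation next;  // null once the coroutine has run to completion
    int yielded;        // must be ignored when next is null
  };

  struct State { int cur; int limit; };  // state kept across suspensions
  int live_states = 0;                   // allocations not yet released

  static State *load_state(Buffer *buf) {
    State *s;
    std::memcpy(&s, buf->bytes, sizeof s);
    assert(s != nullptr);
    return s;
  }

  // Continuation function: resumes normally, or unwinds when abnormal is set.
  static Step range_continue(Buffer *buf, bool abnormal) {
    State *s = load_state(buf);
    if (abnormal || s->cur >= s->limit) {
      delete s;  // release the state without suspending again
      --live_states;
      return {nullptr, 0};
    }
    return {range_continue, s->cur++};
  }

  // Ramp function: sets up the state and runs to the first suspend point.
  Step range_ramp(Buffer *buf, int start, int limit) {
    State *s = new State{start, limit};
    ++live_states;
    std::memcpy(buf->bytes, &s, sizeof s);
    return range_continue(buf, false);
  }

  // Caller: collects yielded values and abandons the coroutine with an
  // abnormal resumption after stop_after values.
  std::vector<int> run(int start, int limit, int stop_after) {
    Buffer buf;
    std::vector<int> out;
    Step st = range_ramp(&buf, start, limit);
    while (st.next) {
      out.push_back(st.yielded);
      bool abandon = static_cast<int>(out.size()) >= stop_after;
      st = st.next(&buf, abandon);
    }
    return out;
  }

The caller never names the coroutine after the ramp call; it only calls the
continuation pointer it was handed, always with the same buffer, and stops
when it receives a null continuation.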

LLVM is currently ineffective at statically eliminating allocations
after fully inlining returned-continuation coroutines into a caller.
This may be acceptable if LLVM's coroutine support is primarily being
used for low-level lowering and inlining is expected to be applied
earlier in the pipeline.

Async Lowering
--------------

In async-continuation lowering, signaled by the use of `llvm.coro.id.async`,
control flow must be handled explicitly by the frontend.

In this lowering, a coroutine is assumed to take the current `async context` as
one of its arguments (the argument position is determined by
`llvm.coro.id.async`). It is used to marshal arguments and return values of the
coroutine. Therefore an async coroutine returns `void`.

.. code-block:: llvm

  define swiftcc void @async_coroutine(ptr %async.ctxt, ptr, ptr) {
  }

Values live across a suspend point need to be stored in the coroutine frame to
be available in the continuation function. This frame is stored as a tail to the
`async context`.

Every suspend point takes a `context projection function` argument which
describes how to obtain the continuation's `async context`, and every suspend
point has an associated `resume function` denoted by the
`llvm.coro.async.resume` intrinsic. The coroutine is resumed by calling this
`resume function`, passing the `async context` as one of its arguments.
The `resume function` can restore its (the caller's) `async context`
by applying a `context projection function` that is provided by the frontend as
a parameter to the `llvm.coro.suspend.async` intrinsic.

.. code-block:: c

  // For example:
  struct async_context {
    struct async_context *caller_context;
    ...
  };

  char *context_projection_function(struct async_context *callee_ctxt) {
     // The projection returns the caller's context as an untyped pointer.
     return (char *)callee_ctxt->caller_context;
  }

.. code-block:: llvm

  %resume_func_ptr = call ptr @llvm.coro.async.resume()
  call {ptr, ptr, ptr} (ptr, ptr, ...) @llvm.coro.suspend.async(
                                              ptr %resume_func_ptr,
                                              ptr %context_projection_function

The frontend should provide an `async function pointer` struct associated with
each async coroutine by `llvm.coro.id.async`'s argument. The initial size and
alignment of the `async context` must be provided as arguments to the
`llvm.coro.id.async` intrinsic. Lowering will update the size entry with the
coroutine frame requirements. The frontend is responsible for allocating the
memory for the `async context` but can use the `async function pointer` struct
to obtain the required size.

.. code-block:: c

  struct async_function_pointer {
    uint32_t relative_function_pointer_to_async_impl;
    uint32_t context_size;
  };

Lowering will split an async coroutine into a ramp function and one resume
function per suspend point.

How control flow is passed between caller, suspension point, and back to
resume function is left up to the frontend.

The suspend point takes a function and its arguments. The function is intended
to model the transfer to the callee function. It will be tail called by
lowering and therefore must have the same signature and calling convention as
the async coroutine.

.. code-block:: llvm

  call {ptr, ptr, ptr} (ptr, ptr, ...) @llvm.coro.suspend.async(
                   ptr %resume_func_ptr,
                   ptr %context_projection_function,
                   ptr %suspend_function,
                   ptr %arg1, ptr %arg2, i8 %arg3)

Coroutines by Example
=====================

The examples below are all of switched-resume coroutines.

Coroutine Representation
------------------------

Let's look at an example of an LLVM coroutine with the behavior sketched
by the following pseudo-code.

.. code-block:: c++

  void *f(int n) {
     for(;;) {
       print(n++);
       <suspend> // returns a coroutine handle on first suspend
     }
  }

This coroutine calls some function `print` with value `n` as an argument and
suspends execution. Every time this coroutine resumes, it calls `print` again
with an argument one bigger than the last time. This coroutine never completes
by itself and must be destroyed explicitly. If we use this coroutine with the
`main` shown in the Introduction, it will call `print` with values 4, 5
and 6, after which the coroutine will be destroyed.

The LLVM IR for this coroutine looks like this:

.. code-block:: llvm

  define ptr @f(i32 %n) presplitcoroutine {
  entry:
    %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
    %size = call i32 @llvm.coro.size.i32()
    %alloc = call ptr @malloc(i32 %size)
    %hdl = call noalias ptr @llvm.coro.begin(token %id, ptr %alloc)
    br label %loop
  loop:
    %n.val = phi i32 [ %n, %entry ], [ %inc, %loop ]
    %inc = add nsw i32 %n.val, 1
    call void @print(i32 %n.val)
    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
    switch i8 %0, label %suspend [i8 0, label %loop
                                  i8 1, label %cleanup]
  cleanup:
    %mem = call ptr @llvm.coro.free(token %id, ptr %hdl)
    call void @free(ptr %mem)
    br label %suspend
  suspend:
    %unused = call i1 @llvm.coro.end(ptr %hdl, i1 false, token none)
    ret ptr %hdl
  }

The `entry` block establishes the coroutine frame. The `coro.size`_ intrinsic is
lowered to a constant representing the size required for the coroutine frame.
The `coro.begin`_ intrinsic initializes the coroutine frame and returns the
coroutine handle. The second parameter of `coro.begin` is given a block of memory
to be used if the coroutine frame needs to be allocated dynamically.

The `coro.id`_ intrinsic serves as a coroutine identity, which is useful when
the `coro.begin`_ intrinsic gets duplicated by optimization passes such as
jump-threading.

The `cleanup` block destroys the coroutine frame. The `coro.free`_ intrinsic,
given the coroutine handle, returns a pointer to the memory block to be freed or
`null` if the coroutine frame was not allocated dynamically. The `cleanup`
block is entered when the coroutine runs to completion by itself or is destroyed
via a call to the `coro.destroy`_ intrinsic.

The `suspend` block contains code to be executed when the coroutine runs to
completion or is suspended. The `coro.end`_ intrinsic marks the point where
a coroutine needs to return control back to the caller if it is not an initial
invocation of the coroutine.

The `loop` block represents the body of the coroutine. The `coro.suspend`_
intrinsic in combination with the following switch indicates what happens to
control flow when a coroutine is suspended (default case), resumed (case 0) or
destroyed (case 1).

Coroutine Transformation
------------------------

One of the steps of coroutine lowering is building the coroutine frame. The
def-use chains are analyzed to determine which objects need to be kept alive
across suspend points. In the coroutine shown in the previous section, the use
of virtual register `%inc` is separated from its definition by a suspend point.
Therefore, it cannot reside on the stack frame, since the latter goes away once
the coroutine is suspended and control is returned back to the caller. An i32
slot is allocated in the coroutine frame and `%inc` is spilled and reloaded
from that slot as needed.

We also store addresses of the resume and destroy functions so that the
`coro.resume` and `coro.destroy` intrinsics can resume and destroy the coroutine
when its identity cannot be determined statically at compile time.
For our example, the coroutine frame will be:

.. code-block:: llvm

  %f.frame = type { ptr, ptr, i32 }

After resume and destroy parts are outlined, function `f` will contain only the
code responsible for creation and initialization of the coroutine frame and
execution of the coroutine until a suspend point is reached:

.. code-block:: llvm

  define ptr @f(i32 %n) {
  entry:
    %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
    %alloc = call noalias ptr @malloc(i32 24)
    %frame = call noalias ptr @llvm.coro.begin(token %id, ptr %alloc)
    %1 = getelementptr %f.frame, ptr %frame, i32 0, i32 0
    store ptr @f.resume, ptr %1
    %2 = getelementptr %f.frame, ptr %frame, i32 0, i32 1
    store ptr @f.destroy, ptr %2

    %inc = add nsw i32 %n, 1
    %inc.spill.addr = getelementptr inbounds %f.frame, ptr %frame, i32 0, i32 2
    store i32 %inc, ptr %inc.spill.addr
    call void @print(i32 %n)

    ret ptr %frame
  }

The outlined resume part of the coroutine will reside in function `f.resume`:

.. code-block:: llvm

  define internal fastcc void @f.resume(ptr %frame.ptr.resume) {
  entry:
    %inc.spill.addr = getelementptr %f.frame, ptr %frame.ptr.resume, i64 0, i32 2
    %inc.spill = load i32, ptr %inc.spill.addr, align 4
    %inc = add i32 %inc.spill, 1
    store i32 %inc, ptr %inc.spill.addr, align 4
    ; Print the reloaded value; the incremented one is kept for the next resume.
    tail call void @print(i32 %inc.spill)
    ret void
  }

Whereas function `f.destroy` will contain the cleanup code for the coroutine:

.. code-block:: llvm

  define internal fastcc void @f.destroy(ptr %frame.ptr.destroy) {
  entry:
    tail call void @free(ptr %frame.ptr.destroy)
    ret void
  }

Avoiding Heap Allocations
-------------------------

A particular coroutine usage pattern, illustrated by the `main` function in the
Introduction, is one where a coroutine is created, manipulated and destroyed by
the same calling function. This pattern is common for coroutines implementing
the RAII idiom and is suitable for the allocation elision optimization, which
avoids dynamic allocation by storing the coroutine frame as a static `alloca`
in its caller.

In the entry block, we will call the `coro.alloc`_ intrinsic, which returns
`true` when dynamic allocation is required and `false` if dynamic allocation is
elided.

.. code-block:: llvm

  entry:
    %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
    %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
    br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
  dyn.alloc:
    %size = call i32 @llvm.coro.size.i32()
    %alloc = call ptr @CustomAlloc(i32 %size)
    br label %coro.begin
  coro.begin:
    %phi = phi ptr [ null, %entry ], [ %alloc, %dyn.alloc ]
    %hdl = call noalias ptr @llvm.coro.begin(token %id, ptr %phi)

In the cleanup block, we will make freeing the coroutine frame conditional on
the `coro.free`_ intrinsic. If allocation is elided, `coro.free`_ returns
`null`, thus skipping the deallocation code:

.. code-block:: llvm

  cleanup:
    %mem = call ptr @llvm.coro.free(token %id, ptr %hdl)
    %need.dyn.free = icmp ne ptr %mem, null
    br i1 %need.dyn.free, label %dyn.free, label %if.end
  dyn.free:
    call void @CustomFree(ptr %mem)
    br label %if.end
  if.end:
    ...

With allocations and deallocations represented as described above, after the
coroutine heap allocation elision optimization, the resulting main will be:

.. code-block:: llvm

  define i32 @main() {
  entry:
    call void @print(i32 4)
    call void @print(i32 5)
    call void @print(i32 6)
    ret i32 0
  }

Multiple Suspend Points
-----------------------

Let's consider a coroutine that has more than one suspend point:

.. code-block:: c++

  void *f(int n) {
     for(;;) {
       print(n++);
       <suspend>
       print(-n);
       <suspend>
     }
  }

Matching LLVM code would look like (with the rest of the code remaining the same
as the code in the previous section):

.. code-block:: llvm

  loop:
    %n.addr = phi i32 [ %n, %entry ], [ %inc, %loop.resume ]
    call void @print(i32 %n.addr) #4
    %2 = call i8 @llvm.coro.suspend(token none, i1 false)
    switch i8 %2, label %suspend [i8 0, label %loop.resume
                                  i8 1, label %cleanup]
  loop.resume:
    %inc = add nsw i32 %n.addr, 1
    %sub = xor i32 %n.addr, -1
    call void @print(i32 %sub)
    %3 = call i8 @llvm.coro.suspend(token none, i1 false)
    switch i8 %3, label %suspend [i8 0, label %loop
                                  i8 1, label %cleanup]

In this case, the coroutine frame would include a suspend index that will
indicate at which suspend point the coroutine needs to resume.

.. code-block:: llvm

  %f.frame = type { ptr, ptr, i32, i32 }

The resume function will use an index to jump to an appropriate basic block and
will look as follows:

.. code-block:: llvm

  define internal fastcc void @f.Resume(ptr %FramePtr) {
  entry.Resume:
    %index.addr = getelementptr inbounds %f.frame, ptr %FramePtr, i64 0, i32 2
    %index = load i8, ptr %index.addr, align 1
    %switch = icmp eq i8 %index, 0
    %n.addr = getelementptr inbounds %f.frame, ptr %FramePtr, i64 0, i32 3
    %n = load i32, ptr %n.addr, align 4

    br i1 %switch, label %loop.resume, label %loop

  loop.resume:
    %sub = sub nsw i32 0, %n
    call void @print(i32 %sub)
    br label %suspend
  loop:
    %inc = add nsw i32 %n, 1
    store i32 %inc, ptr %n.addr, align 4
    tail call void @print(i32 %inc)
    br label %suspend

  suspend:
    %storemerge = phi i8 [ 0, %loop ], [ 1, %loop.resume ]
    store i8 %storemerge, ptr %index.addr, align 1
    ret void
  }

If different cleanup code needs to get executed for different suspend points,
a similar switch will be in the `f.destroy` function.

.. note::

  Using a suspend index in the coroutine state and having a switch in `f.resume`
  and `f.destroy` is one of the possible implementation strategies. We explored
  another option where distinct functions `f.resume1`, `f.resume2`, etc. are
  created for every suspend point, and instead of storing an index, the resume
  and destroy function pointers are updated at every suspend. Early testing
  showed that the current approach is easier on the optimizer than the latter,
  so it is the lowering strategy implemented at the moment.

Distinct Save and Suspend
-------------------------

In the previous example, setting a resume index (or some other state change that
needs to happen to prepare a coroutine for resumption) happens at the same time as
a suspension of a coroutine. However, in certain cases, it is necessary to control
when the coroutine is prepared for resumption and when it is suspended.

In the following example, a coroutine represents some activity that is driven
by completions of asynchronous operations `async_op1` and `async_op2`, which get
a coroutine handle as a parameter and resume the coroutine once the async
operation is finished.

.. code-block:: text

  void g() {
     for (;;) {
       if (cond()) {
          async_op1(<coroutine-handle>); // will resume once async_op1 completes
          <suspend>
          do_one();
       }
       else {
          async_op2(<coroutine-handle>); // will resume once async_op2 completes
          <suspend>
          do_two();
       }
     }
  }

In this case, the coroutine should be ready for resumption prior to a call to
`async_op1` and `async_op2`. The `coro.save`_ intrinsic is used to indicate a
point when the coroutine should be ready for resumption (namely, when a resume
index should be stored in the coroutine frame, so that it can be resumed at the
correct resume point):

.. code-block:: llvm

  if.true:
    %save1 = call token @llvm.coro.save(ptr %hdl)
    call void @async_op1(ptr %hdl)
    %suspend1 = call i8 @llvm.coro.suspend(token %save1, i1 false)
    switch i8 %suspend1, label %suspend [i8 0, label %resume1
                                         i8 1, label %cleanup]
  if.false:
    %save2 = call token @llvm.coro.save(ptr %hdl)
    call void @async_op2(ptr %hdl)
    %suspend2 = call i8 @llvm.coro.suspend(token %save2, i1 false)
    switch i8 %suspend2, label %suspend [i8 0, label %resume2
                                         i8 1, label %cleanup]

.. _coroutine promise:

Coroutine Promise
-----------------

A coroutine author or a frontend may designate a distinguished `alloca` that can
be used to communicate with the coroutine. This distinguished alloca is called
the **coroutine promise** and is provided as the second parameter to the
`coro.id`_ intrinsic.

The following coroutine designates a 32-bit integer `promise` and uses it to
store the current value produced by a coroutine.

.. code-block:: llvm

  define ptr @f(i32 %n) {
  entry:
    %promise = alloca i32
    %id = call token @llvm.coro.id(i32 0, ptr %promise, ptr null, ptr null)
    %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
    br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
  dyn.alloc:
    %size = call i32 @llvm.coro.size.i32()
    %alloc = call ptr @malloc(i32 %size)
    br label %coro.begin
  coro.begin:
    %phi = phi ptr [ null, %entry ], [ %alloc, %dyn.alloc ]
    %hdl = call noalias ptr @llvm.coro.begin(token %id, ptr %phi)
    br label %loop
  loop:
    %n.val = phi i32 [ %n, %coro.begin ], [ %inc, %loop ]
    %inc = add nsw i32 %n.val, 1
    store i32 %n.val, ptr %promise
    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
    switch i8 %0, label %suspend [i8 0, label %loop
                                  i8 1, label %cleanup]
  cleanup:
    %mem = call ptr @llvm.coro.free(token %id, ptr %hdl)
    call void @free(ptr %mem)
    br label %suspend
  suspend:
    %unused = call i1 @llvm.coro.end(ptr %hdl, i1 false, token none)
    ret ptr %hdl
  }

A coroutine consumer can rely on the `coro.promise`_ intrinsic to access the
coroutine promise.

.. code-block:: llvm

  define i32 @main() {
  entry:
    %hdl = call ptr @f(i32 4)
    %promise.addr = call ptr @llvm.coro.promise(ptr %hdl, i32 4, i1 false)
    %val0 = load i32, ptr %promise.addr
    call void @print(i32 %val0)
    call void @llvm.coro.resume(ptr %hdl)
    %val1 = load i32, ptr %promise.addr
    call void @print(i32 %val1)
    call void @llvm.coro.resume(ptr %hdl)
    %val2 = load i32, ptr %promise.addr
    call void @print(i32 %val2)
    call void @llvm.coro.destroy(ptr %hdl)
    ret i32 0
  }

After the example in this section is compiled, the result of the compilation
will be:

.. code-block:: llvm

  define i32 @main() {
  entry:
    tail call void @print(i32 4)
    tail call void @print(i32 5)
    tail call void @print(i32 6)
    ret i32 0
  }

.. _final:
.. _final suspend:

Final Suspend
-------------

A coroutine author or a frontend may designate a particular suspend to be final,
by setting the second argument of the `coro.suspend`_ intrinsic to `true`.
Such a suspend point has two properties:

* it is possible to check whether a suspended coroutine is at the final suspend
  point via the `coro.done`_ intrinsic;

* a resumption of a coroutine stopped at the final suspend point leads to
  undefined behavior. The only possible action for a coroutine at a final
  suspend point is destroying it via the `coro.destroy`_ intrinsic.

From the user perspective, the final suspend point represents an idea of a
coroutine reaching the end. From the compiler perspective, it is an optimization
opportunity for reducing the number of resume points (and therefore switch
cases) in the resume function.

The following is an example of a function that keeps resuming the coroutine
until the final suspend point is reached, after which point the coroutine is
destroyed:

.. code-block:: llvm

  define i32 @main() {
  entry:
    %hdl = call ptr @f(i32 4)
    br label %while
  while:
    call void @llvm.coro.resume(ptr %hdl)
    %done = call i1 @llvm.coro.done(ptr %hdl)
    br i1 %done, label %end, label %while
  end:
    call void @llvm.coro.destroy(ptr %hdl)
    ret i32 0
  }

Usually, the final suspend point is a frontend-injected suspend point that does
not correspond to any explicitly authored suspend point of the high-level
language. For example, for a Python generator that has only one suspend point:

.. code-block:: python

  def coroutine(n):
    for i in range(n):
      yield i

A Python frontend would inject two more suspend points, so that the actual code
looks like this:

.. code-block:: c

  void* coroutine(int n) {
    int current_value;
    <designate current_value to be coroutine promise>
    <SUSPEND> // injected suspend point, so that the coroutine starts suspended
    for (int i = 0; i < n; ++i) {
      current_value = i; <SUSPEND>; // corresponds to "yield i"
    }
    <SUSPEND final=true> // injected final suspend point
  }

and the Python iterator `__next__` would look like:

.. code-block:: c++

  int __next__(void* hdl) {
    coro.resume(hdl);
    if (coro.done(hdl)) throw StopIteration();
    return *(int*)coro.promise(hdl, 4, false);
  }

Custom ABIs and Plugin Libraries
--------------------------------

Plugin libraries can extend coroutine lowering, enabling a wide variety of users
to utilize the coroutine transformation passes. An existing coroutine lowering
is extended by:

#. defining custom ABIs that inherit from the existing ABIs,
#. giving a list of generators for the custom ABIs when constructing the `CoroSplit`_ pass, and
#. using `coro.begin.custom.abi`_ in place of `coro.begin`_; it has an additional parameter for the index of the generator/ABI to be used for the coroutine.

A custom ABI overriding the SwitchABI's materialization looks like:

.. code-block:: c++

  class CustomSwitchABI : public coro::SwitchABI {
  public:
    CustomSwitchABI(Function &F, coro::Shape &S)
      : coro::SwitchABI(F, S, ExtraMaterializable) {}
  };

Giving a list of custom ABI generators while constructing the `CoroSplit`
pass looks like:

.. code-block:: c++

  CoroSplitPass::BaseABITy GenCustomABI = [](Function &F, coro::Shape &S) {
    return std::make_unique<CustomSwitchABI>(F, S);
  };

  CGSCCPassManager CGPM;
  CGPM.addPass(CoroSplitPass({GenCustomABI}));

The LLVM IR for a coroutine using a custom ABI looks like:

.. code-block:: llvm

  define ptr @f(i32 %n) presplitcoroutine_custom_abi {
  entry:
    %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
    %size = call i32 @llvm.coro.size.i32()
    %alloc = call ptr @malloc(i32 %size)
    %hdl = call noalias ptr @llvm.coro.begin.custom.abi(token %id, ptr %alloc, i32 0)
    br label %loop
  loop:
    %n.val = phi i32 [ %n, %entry ], [ %inc, %loop ]
    %inc = add nsw i32 %n.val, 1
    call void @print(i32 %n.val)
    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
    switch i8 %0, label %suspend [i8 0, label %loop
                                  i8 1, label %cleanup]
  cleanup:
    %mem = call ptr @llvm.coro.free(token %id, ptr %hdl)
    call void @free(ptr %mem)
    br label %suspend
  suspend:
    %unused = call i1 @llvm.coro.end(ptr %hdl, i1 false, token none)
    ret ptr %hdl
  }

Parameter Attributes
====================
Some parameter attributes, used to communicate additional information about the result or parameters of a function, require special handling.

ByVal
-----
A ByVal parameter on an argument indicates that the pointee should be treated as being passed by value to the function.
Prior to the coroutine transforms, loads and stores to/from the pointer are generated where the value is needed.
Consequently, a ByVal argument is treated much like an alloca.
Space is allocated for it on the coroutine frame and the uses of the argument pointer are replaced with a pointer to the coroutine frame.

Swift Error
-----------
Clang supports the swiftcall calling convention in many common targets, and a user could call a function that takes a swifterror argument from a C++ coroutine.
The swifterror parameter attribute exists to model and optimize Swift error handling.
A swifterror alloca or parameter can only be loaded, stored, or passed as a swifterror call argument, and a swifterror call argument can only be a direct reference to a swifterror alloca or parameter.
These rules, not coincidentally, mean that you can always perfectly model the data flow in the alloca, and LLVM CodeGen actually has to do that in order to emit code.

For coroutine lowering, the default treatment of allocas breaks those rules: splitting will try to replace the alloca with an entry in the coro frame, which can lead to trying to pass that as a swifterror argument.
To pass a swifterror argument in a split function, we need to still have the alloca around; but we also potentially need the coro frame slot, since useful data can (in theory) be stored in the swifterror alloca slot across suspensions in the presplit coroutine.
When splitting a coroutine, it is consequently necessary to keep both the frame slot and the alloca itself, and to keep them in sync.

Intrinsics
==========

Coroutine Manipulation Intrinsics
---------------------------------

Intrinsics described in this section are used to manipulate an existing
coroutine. They can be used in any function that happens to have a pointer
to a `coroutine frame`_ or a pointer to a `coroutine promise`_.

.. _coro.destroy:

'llvm.coro.destroy' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

  declare void @llvm.coro.destroy(ptr <handle>)

Overview:
"""""""""

The '``llvm.coro.destroy``' intrinsic destroys a suspended
switched-resume coroutine.

Arguments:
""""""""""

The argument is a coroutine handle to a suspended coroutine.

Semantics:
""""""""""

When possible, the `coro.destroy` intrinsic is replaced with a direct call to
the coroutine destroy function. Otherwise it is replaced with an indirect call
based on the function pointer for the destroy function stored in the coroutine
frame. Destroying a coroutine that is not suspended leads to undefined behavior.

..
_coro.resume: 877 878'llvm.coro.resume' Intrinsic 879^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 880 881:: 882 883 declare void @llvm.coro.resume(ptr <handle>) 884 885Overview: 886""""""""" 887 888The '``llvm.coro.resume``' intrinsic resumes a suspended switched-resume coroutine. 889 890Arguments: 891"""""""""" 892 893The argument is a handle to a suspended coroutine. 894 895Semantics: 896"""""""""" 897 898When possible, the `coro.resume` intrinsic is replaced with a direct call to the 899coroutine resume function. Otherwise it is replaced with an indirect call based 900on the function pointer for the resume function stored in the coroutine frame. 901Resuming a coroutine that is not suspended leads to undefined behavior. 902 903.. _coro.done: 904 905'llvm.coro.done' Intrinsic 906^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 907 908:: 909 910 declare i1 @llvm.coro.done(ptr <handle>) 911 912Overview: 913""""""""" 914 915The '``llvm.coro.done``' intrinsic checks whether a suspended 916switched-resume coroutine is at the final suspend point or not. 917 918Arguments: 919"""""""""" 920 921The argument is a handle to a suspended coroutine. 922 923Semantics: 924"""""""""" 925 926Using this intrinsic on a coroutine that does not have a `final suspend`_ point 927or on a coroutine that is not suspended leads to undefined behavior. 928 929.. _coro.promise: 930 931'llvm.coro.promise' Intrinsic 932^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 933 934:: 935 936 declare ptr @llvm.coro.promise(ptr <ptr>, i32 <alignment>, i1 <from>) 937 938Overview: 939""""""""" 940 941The '``llvm.coro.promise``' intrinsic obtains a pointer to a 942`coroutine promise`_ given a switched-resume coroutine handle and vice versa. 943 944Arguments: 945"""""""""" 946 947The first argument is a handle to a coroutine if `from` is false. Otherwise, 948it is a pointer to a coroutine promise. 949 950The second argument is an alignment requirements of the promise. 
If a frontend designated `%promise = alloca i32` as a promise, the alignment
argument to `coro.promise` should be the alignment of `i32` on the target
platform. If a frontend designated `%promise = alloca i32, align 16` as a
promise, the alignment argument should be 16.
This argument only accepts constants.

The third argument is a boolean indicating the direction of the transformation.
If `from` is true, the intrinsic returns a coroutine handle given a pointer
to a promise. If `from` is false, the intrinsic returns a pointer to a promise
from a coroutine handle. This argument only accepts constants.

Semantics:
""""""""""

Using this intrinsic on a coroutine that does not have a coroutine promise
leads to undefined behavior. It is possible to read and modify the coroutine
promise of the coroutine which is currently executing. The coroutine author and
the coroutine user are responsible for making sure there are no data races.

Example:
""""""""

.. code-block:: llvm

  define ptr @f(i32 %n) {
  entry:
    %promise = alloca i32
    ; the second argument to coro.id points to the coroutine promise.
    %id = call token @llvm.coro.id(i32 0, ptr %promise, ptr null, ptr null)
    ...
    %hdl = call noalias ptr @llvm.coro.begin(token %id, ptr %alloc)
    ...
    store i32 42, ptr %promise ; store something into the promise
    ...
    ret ptr %hdl
  }

  define i32 @main() {
  entry:
    %hdl = call ptr @f(i32 4) ; starts the coroutine and returns its handle
    %promise.addr = call ptr @llvm.coro.promise(ptr %hdl, i32 4, i1 false)
    %val = load i32, ptr %promise.addr ; load a value from the promise
    call void @print(i32 %val)
    call void @llvm.coro.destroy(ptr %hdl)
    ret i32 0
  }

..
_coroutine intrinsics: 999 1000Coroutine Structure Intrinsics 1001------------------------------ 1002Intrinsics described in this section are used within a coroutine to describe 1003the coroutine structure. They should not be used outside of a coroutine. 1004 1005.. _coro.size: 1006 1007'llvm.coro.size' Intrinsic 1008^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1009:: 1010 1011 declare i32 @llvm.coro.size.i32() 1012 declare i64 @llvm.coro.size.i64() 1013 1014Overview: 1015""""""""" 1016 1017The '``llvm.coro.size``' intrinsic returns the number of bytes 1018required to store a `coroutine frame`_. This is only supported for 1019switched-resume coroutines. 1020 1021Arguments: 1022"""""""""" 1023 1024None 1025 1026Semantics: 1027"""""""""" 1028 1029The `coro.size` intrinsic is lowered to a constant representing the size of 1030the coroutine frame. 1031 1032.. _coro.align: 1033 1034'llvm.coro.align' Intrinsic 1035^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1036:: 1037 1038 declare i32 @llvm.coro.align.i32() 1039 declare i64 @llvm.coro.align.i64() 1040 1041Overview: 1042""""""""" 1043 1044The '``llvm.coro.align``' intrinsic returns the alignment of a `coroutine frame`_. 1045This is only supported for switched-resume coroutines. 1046 1047Arguments: 1048"""""""""" 1049 1050None 1051 1052Semantics: 1053"""""""""" 1054 1055The `coro.align` intrinsic is lowered to a constant representing the alignment of 1056the coroutine frame. 1057 1058.. _coro.begin: 1059 1060'llvm.coro.begin' Intrinsic 1061^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1062:: 1063 1064 declare ptr @llvm.coro.begin(token <id>, ptr <mem>) 1065 1066Overview: 1067""""""""" 1068 1069The '``llvm.coro.begin``' intrinsic returns an address of the coroutine frame. 1070 1071Arguments: 1072"""""""""" 1073 1074The first argument is a token returned by a call to '``llvm.coro.id``' 1075identifying the coroutine. 
1076 1077The second argument is a pointer to a block of memory where coroutine frame 1078will be stored if it is allocated dynamically. This pointer is ignored 1079for returned-continuation coroutines. 1080 1081Semantics: 1082"""""""""" 1083 1084Depending on the alignment requirements of the objects in the coroutine frame 1085and/or on the codegen compactness reasons the pointer returned from `coro.begin` 1086may be at offset to the `%mem` argument. (This could be beneficial if 1087instructions that express relative access to data can be more compactly encoded 1088with small positive and negative offsets). 1089 1090A frontend should emit exactly one `coro.begin` intrinsic per coroutine. 1091 1092.. _coro.begin.custom.abi: 1093 1094'llvm.coro.begin.custom.abi' Intrinsic 1095^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1096:: 1097 1098 declare ptr @llvm.coro.begin.custom.abi(token <id>, ptr <mem>, i32) 1099 1100Overview: 1101""""""""" 1102 1103The '``llvm.coro.begin.custom.abi``' intrinsic is used in place of the 1104`coro.begin` intrinsic that has an additional parameter to specify the custom 1105ABI for the coroutine. The return is identical to that of the `coro.begin` 1106intrinsic. 1107 1108Arguments: 1109"""""""""" 1110 1111The first and second arguments are identical to those of the `coro.begin` 1112intrinsic. 1113 1114The third argument is an i32 index of the generator list given to the 1115`CoroSplit` pass specifying the custom ABI generator for this coroutine. 1116 1117Semantics: 1118"""""""""" 1119 1120The semantics are identical to those of the `coro.begin` intrinsic. 1121 1122.. 
_coro.free: 1123 1124'llvm.coro.free' Intrinsic 1125^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1126:: 1127 1128 declare ptr @llvm.coro.free(token %id, ptr <frame>) 1129 1130Overview: 1131""""""""" 1132 1133The '``llvm.coro.free``' intrinsic returns a pointer to a block of memory where 1134coroutine frame is stored or `null` if this instance of a coroutine did not use 1135dynamically allocated memory for its coroutine frame. This intrinsic is not 1136supported for returned-continuation coroutines. 1137 1138Arguments: 1139"""""""""" 1140 1141The first argument is a token returned by a call to '``llvm.coro.id``' 1142identifying the coroutine. 1143 1144The second argument is a pointer to the coroutine frame. This should be the same 1145pointer that was returned by prior `coro.begin` call. 1146 1147Example (custom deallocation function): 1148""""""""""""""""""""""""""""""""""""""" 1149 1150.. code-block:: llvm 1151 1152 cleanup: 1153 %mem = call ptr @llvm.coro.free(token %id, ptr %frame) 1154 %mem_not_null = icmp ne ptr %mem, null 1155 br i1 %mem_not_null, label %if.then, label %if.end 1156 if.then: 1157 call void @CustomFree(ptr %mem) 1158 br label %if.end 1159 if.end: 1160 ret void 1161 1162Example (standard deallocation functions): 1163"""""""""""""""""""""""""""""""""""""""""" 1164 1165.. code-block:: llvm 1166 1167 cleanup: 1168 %mem = call ptr @llvm.coro.free(token %id, ptr %frame) 1169 call void @free(ptr %mem) 1170 ret void 1171 1172.. _coro.alloc: 1173 1174'llvm.coro.alloc' Intrinsic 1175^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1176:: 1177 1178 declare i1 @llvm.coro.alloc(token <id>) 1179 1180Overview: 1181""""""""" 1182 1183The '``llvm.coro.alloc``' intrinsic returns `true` if dynamic allocation is 1184required to obtain a memory for the coroutine frame and `false` otherwise. 1185This is not supported for returned-continuation coroutines. 

Arguments:
""""""""""

The first argument is a token returned by a call to '``llvm.coro.id``'
identifying the coroutine.

Semantics:
""""""""""

A frontend should emit at most one `coro.alloc` intrinsic per coroutine.
The intrinsic is used to suppress dynamic allocation of the coroutine frame
when possible.

Example:
""""""""

.. code-block:: llvm

  entry:
    %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
    %dyn.alloc.required = call i1 @llvm.coro.alloc(token %id)
    br i1 %dyn.alloc.required, label %coro.alloc, label %coro.begin

  coro.alloc:
    %frame.size = call i32 @llvm.coro.size.i32()
    %alloc = call ptr @MyAlloc(i32 %frame.size)
    br label %coro.begin

  coro.begin:
    %phi = phi ptr [ null, %entry ], [ %alloc, %coro.alloc ]
    %frame = call ptr @llvm.coro.begin(token %id, ptr %phi)

.. _coro.noop:

'llvm.coro.noop' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

  declare ptr @llvm.coro.noop()

Overview:
"""""""""

The '``llvm.coro.noop``' intrinsic returns the address of the coroutine frame of
a coroutine that does nothing when resumed or destroyed.

Arguments:
""""""""""

None

Semantics:
""""""""""

This intrinsic is lowered to refer to a private constant coroutine frame. The
resume and destroy handlers for this frame are empty functions that do nothing.
Note that in different translation units `llvm.coro.noop` may return different pointers.

.. _coro.frame:

'llvm.coro.frame' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

  declare ptr @llvm.coro.frame()

Overview:
"""""""""

The '``llvm.coro.frame``' intrinsic returns the address of the coroutine frame of
the enclosing coroutine.
1258 1259Arguments: 1260"""""""""" 1261 1262None 1263 1264Semantics: 1265"""""""""" 1266 1267This intrinsic is lowered to refer to the `coro.begin`_ instruction. This is 1268a frontend convenience intrinsic that makes it easier to refer to the 1269coroutine frame. 1270 1271.. _coro.id: 1272 1273'llvm.coro.id' Intrinsic 1274^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1275:: 1276 1277 declare token @llvm.coro.id(i32 <align>, ptr <promise>, ptr <coroaddr>, 1278 ptr <fnaddrs>) 1279 1280Overview: 1281""""""""" 1282 1283The '``llvm.coro.id``' intrinsic returns a token identifying a 1284switched-resume coroutine. 1285 1286Arguments: 1287"""""""""" 1288 1289The first argument provides information on the alignment of the memory returned 1290by the allocation function and given to `coro.begin` by the first argument. If 1291this argument is 0, the memory is assumed to be aligned to 2 * sizeof(ptr). 1292This argument only accepts constants. 1293 1294The second argument, if not `null`, designates a particular alloca instruction 1295to be a `coroutine promise`_. 1296 1297The third argument is `null` coming out of the frontend. The CoroEarly pass sets 1298this argument to point to the function this coro.id belongs to. 1299 1300The fourth argument is `null` before coroutine is split, and later is replaced 1301to point to a private global constant array containing function pointers to 1302outlined resume and destroy parts of the coroutine. 1303 1304 1305Semantics: 1306"""""""""" 1307 1308The purpose of this intrinsic is to tie together `coro.id`, `coro.alloc` and 1309`coro.begin` belonging to the same coroutine to prevent optimization passes from 1310duplicating any of these instructions unless entire body of the coroutine is 1311duplicated. 1312 1313A frontend should emit exactly one `coro.id` intrinsic per coroutine. 1314 1315A frontend should emit function attribute `presplitcoroutine` for the coroutine. 1316 1317.. 
_coro.id.async: 1318 1319'llvm.coro.id.async' Intrinsic 1320^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1321:: 1322 1323 declare token @llvm.coro.id.async(i32 <context size>, i32 <align>, 1324 ptr <context arg>, 1325 ptr <async function pointer>) 1326 1327Overview: 1328""""""""" 1329 1330The '``llvm.coro.id.async``' intrinsic returns a token identifying an async coroutine. 1331 1332Arguments: 1333"""""""""" 1334 1335The first argument provides the initial size of the `async context` as required 1336from the frontend. Lowering will add to this size the size required by the frame 1337storage and store that value to the `async function pointer`. 1338 1339The second argument, is the alignment guarantee of the memory of the 1340`async context`. The frontend guarantees that the memory will be aligned by this 1341value. 1342 1343The third argument is the `async context` argument in the current coroutine. 1344 1345The fourth argument is the address of the `async function pointer` struct. 1346Lowering will update the context size requirement in this struct by adding the 1347coroutine frame size requirement to the initial size requirement as specified by 1348the first argument of this intrinsic. 1349 1350 1351Semantics: 1352"""""""""" 1353 1354A frontend should emit exactly one `coro.id.async` intrinsic per coroutine. 1355 1356A frontend should emit function attribute `presplitcoroutine` for the coroutine. 1357 1358.. _coro.id.retcon: 1359 1360'llvm.coro.id.retcon' Intrinsic 1361^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1362:: 1363 1364 declare token @llvm.coro.id.retcon(i32 <size>, i32 <align>, ptr <buffer>, 1365 ptr <continuation prototype>, 1366 ptr <alloc>, ptr <dealloc>) 1367 1368Overview: 1369""""""""" 1370 1371The '``llvm.coro.id.retcon``' intrinsic returns a token identifying a 1372multiple-suspend returned-continuation coroutine. 
1373 1374The 'result-type sequence' of the coroutine is defined as follows: 1375 1376- if the return type of the coroutine function is ``void``, it is the 1377 empty sequence; 1378 1379- if the return type of the coroutine function is a ``struct``, it is the 1380 element types of that ``struct`` in order; 1381 1382- otherwise, it is just the return type of the coroutine function. 1383 1384The first element of the result-type sequence must be a pointer type; 1385continuation functions will be coerced to this type. The rest of 1386the sequence are the 'yield types', and any suspends in the coroutine 1387must take arguments of these types. 1388 1389Arguments: 1390"""""""""" 1391 1392The first and second arguments are the expected size and alignment of 1393the buffer provided as the third argument. They must be constant. 1394 1395The fourth argument must be a reference to a global function, called 1396the 'continuation prototype function'. The type, calling convention, 1397and attributes of any continuation functions will be taken from this 1398declaration. The return type of the prototype function must match the 1399return type of the current function. The first parameter type must be 1400a pointer type. The second parameter type must be an integer type; 1401it will be used only as a boolean flag. 1402 1403The fifth argument must be a reference to a global function that will 1404be used to allocate memory. It may not fail, either by returning null 1405or throwing an exception. It must take an integer and return a pointer. 1406 1407The sixth argument must be a reference to a global function that will 1408be used to deallocate memory. It must take a pointer and return ``void``. 1409 1410Semantics: 1411"""""""""" 1412 1413A frontend should emit function attribute `presplitcoroutine` for the coroutine. 
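
Example:
""""""""

As a minimal sketch of how the arguments above fit together, the following
coroutine has no yield types. The names ``@prototype``, ``@allocate``,
``@deallocate`` and ``@print`` are placeholders chosen for illustration, the
buffer size and alignment (8 and 4) are assumed values, and the suspend
intrinsic is written as declared in this document:

.. code-block:: llvm

  define ptr @f(ptr %buffer, i32 %n) presplitcoroutine {
  entry:
    %id = call token @llvm.coro.id.retcon(i32 8, i32 4, ptr %buffer,
                                          ptr @prototype,
                                          ptr @allocate, ptr @deallocate)
    %hdl = call ptr @llvm.coro.begin(token %id, ptr null)
    br label %loop

  loop:
    %n.val = phi i32 [ %n, %entry ], [ %inc, %resume ]
    call void @print(i32 %n.val)
    ; no yield types, so the suspend takes no arguments
    %unwind = call i1 (...) @llvm.coro.suspend.retcon()
    br i1 %unwind, label %cleanup, label %resume

  resume:
    %inc = add i32 %n.val, 1
    br label %loop

  cleanup:
    call i1 @llvm.coro.end(ptr %hdl, i1 false, token none)
    unreachable
  }

  ; same return type as @f; pointer first parameter, integer flag second
  declare ptr @prototype(ptr, i1 zeroext)
  declare noalias ptr @allocate(i32)
  declare void @deallocate(ptr)

Here the result-type sequence is just ``ptr``, the continuation pointer, so
there are no yield types and the suspend point passes no values back to the
caller.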

'llvm.coro.id.retcon.once' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

  declare token @llvm.coro.id.retcon.once(i32 <size>, i32 <align>, ptr <buffer>,
                                          ptr <prototype>,
                                          ptr <alloc>, ptr <dealloc>)

Overview:
"""""""""

The '``llvm.coro.id.retcon.once``' intrinsic returns a token identifying a
unique-suspend returned-continuation coroutine.

Arguments:
""""""""""

As for ``llvm.coro.id.retcon``, except that the return type of the
continuation prototype must represent the normal return type of the continuation
(instead of matching the coroutine's return type).

Semantics:
""""""""""

A frontend should emit function attribute `presplitcoroutine` for the coroutine.

.. _coro.end:

'llvm.coro.end' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

  declare i1 @llvm.coro.end(ptr <handle>, i1 <unwind>, token <result.token>)

Overview:
"""""""""

The '``llvm.coro.end``' intrinsic marks the point where execution of the resume part of
the coroutine should end and control should return to the caller.


Arguments:
""""""""""

The first argument should refer to the coroutine handle of the enclosing
coroutine. A frontend is allowed to supply null as the first parameter; in this
case the `coro-early` pass will replace the null with an appropriate coroutine
handle value.

The second argument should be `true` if this coro.end is in the block that is
part of the unwind sequence leaving the coroutine body due to an exception and
`false` otherwise.

A non-trivial (non-none) token argument can only be specified for unique-suspend
returned-continuation coroutines, where it must be a token value produced by the
'``llvm.coro.end.results``' intrinsic.

Only a none token is allowed for coro.end calls in unwind sections.

Semantics:
""""""""""
The purpose of this intrinsic is to allow frontends to mark the cleanup and
other code that is only relevant during the initial invocation of the coroutine
and should not be present in resume and destroy parts.

In returned-continuation lowering, ``llvm.coro.end`` fully destroys the
coroutine frame. If the second argument is `false`, it also returns from
the coroutine with a null continuation pointer, and the next instruction
will be unreachable. If the second argument is `true`, it falls through
so that the following logic can resume unwinding. In a yield-once
coroutine, reaching a non-unwind ``llvm.coro.end`` without having first
reached a ``llvm.coro.suspend.retcon`` has undefined behavior.

The remainder of this section describes the behavior under switched-resume
lowering.

This intrinsic is lowered when a coroutine is split into
the start, resume and destroy parts. In the start part, it is a no-op;
in the resume and destroy parts, it is replaced with a `ret void` instruction and
the rest of the block containing the `coro.end` instruction is discarded.
In landing pads it is replaced with an appropriate instruction to unwind to
the caller. The handling of coro.end differs depending on whether the target is
using the landingpad or WinEH exception model.

For the landingpad-based exception model, it is expected that the frontend uses the
`coro.end`_ intrinsic as follows:

..
code-block:: llvm

  ehcleanup:
    %InResumePart = call i1 @llvm.coro.end(ptr null, i1 true, token none)
    br i1 %InResumePart, label %eh.resume, label %cleanup.cont

  cleanup.cont:
    ; rest of the cleanup

  eh.resume:
    %exn = load ptr, ptr %exn.slot, align 8
    %sel = load i32, ptr %ehselector.slot, align 4
    %lpad.val = insertvalue { ptr, i32 } undef, ptr %exn, 0
    %lpad.val29 = insertvalue { ptr, i32 } %lpad.val, i32 %sel, 1
    resume { ptr, i32 } %lpad.val29

The `CoroSplit` pass replaces `coro.end` with ``True`` in the resume functions,
thus leading to an immediate unwind to the caller, whereas in the start function it
is replaced with ``False``, thus allowing execution to proceed to the rest of the cleanup
code that is only needed during the initial invocation of the coroutine.

For the Windows exception handling model, a frontend should attach a funclet bundle
referring to an enclosing cleanuppad as follows:

.. code-block:: llvm

  ehcleanup:
    %tok = cleanuppad within none []
    %unused = call i1 @llvm.coro.end(ptr null, i1 true, token none) [ "funclet"(token %tok) ]
    cleanupret from %tok unwind label %RestOfTheCleanup

The `CoroSplit` pass, if the funclet bundle is present, will insert
``cleanupret from %tok unwind to caller`` before
the `coro.end`_ intrinsic and will remove the rest of the block.

In the unwind path (when the argument is `true`), `coro.end` will mark the coroutine
as done, making it undefined behavior to resume the coroutine again and causing
`llvm.coro.done` to return `true`. This is not necessary in the normal path because
the coroutine will already be marked as done by the final suspend.

The following table summarizes the handling of the `coro.end`_ intrinsic.
1543 1544+--------------------------+------------------------+---------------------------------+ 1545| | In Start Function | In Resume/Destroy Functions | 1546+--------------------------+------------------------+---------------------------------+ 1547|unwind=false | nothing |``ret void`` | 1548+------------+-------------+------------------------+---------------------------------+ 1549| | WinEH | mark coroutine as done || ``cleanupret unwind to caller``| 1550| | | || mark coroutine done | 1551|unwind=true +-------------+------------------------+---------------------------------+ 1552| | Landingpad | mark coroutine as done | mark coroutine done | 1553+------------+-------------+------------------------+---------------------------------+ 1554 1555.. _coro.end.results: 1556 1557'llvm.coro.end.results' Intrinsic 1558^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1559:: 1560 1561 declare token @llvm.coro.end.results(...) 1562 1563Overview: 1564""""""""" 1565 1566The '``llvm.coro.end.results``' intrinsic captures values to be returned from 1567unique-suspend returned-continuation coroutines. 1568 1569Arguments: 1570"""""""""" 1571 1572The number of arguments must match the return type of the continuation function: 1573 1574- if the return type of the continuation function is ``void`` there must be no 1575 arguments 1576 1577- if the return type of the continuation function is a ``struct``, the arguments 1578 will be of element types of that ``struct`` in order; 1579 1580- otherwise, it is just the return value of the continuation function. 1581 1582.. code-block:: llvm 1583 1584 define {ptr, ptr} @g(ptr %buffer, ptr %ptr, i8 %val) presplitcoroutine { 1585 entry: 1586 %id = call token @llvm.coro.id.retcon.once(i32 8, i32 8, ptr %buffer, 1587 ptr @prototype, 1588 ptr @allocate, ptr @deallocate) 1589 %hdl = call ptr @llvm.coro.begin(token %id, ptr null) 1590 1591 ... 1592 1593 cleanup: 1594 %tok = call token (...) 
@llvm.coro.end.results(i8 %val) 1595 call i1 @llvm.coro.end(ptr %hdl, i1 0, token %tok) 1596 unreachable 1597 1598 ... 1599 1600 declare i8 @prototype(ptr, i1 zeroext) 1601 1602 1603'llvm.coro.end.async' Intrinsic 1604^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1605:: 1606 1607 declare i1 @llvm.coro.end.async(ptr <handle>, i1 <unwind>, ...) 1608 1609Overview: 1610""""""""" 1611 1612The '``llvm.coro.end.async``' marks the point where execution of the resume part 1613of the coroutine should end and control should return to the caller. As part of 1614its variable tail arguments this instruction allows to specify a function and 1615the function's arguments that are to be tail called as the last action before 1616returning. 1617 1618 1619Arguments: 1620"""""""""" 1621 1622The first argument should refer to the coroutine handle of the enclosing 1623coroutine. A frontend is allowed to supply null as the first parameter, in this 1624case `coro-early` pass will replace the null with an appropriate coroutine 1625handle value. 1626 1627The second argument should be `true` if this coro.end is in the block that is 1628part of the unwind sequence leaving the coroutine body due to an exception and 1629`false` otherwise. 1630 1631The third argument if present should specify a function to be called. 1632 1633If the third argument is present, the remaining arguments are the arguments to 1634the function call. 1635 1636.. code-block:: llvm 1637 1638 call i1 (ptr, i1, ...) @llvm.coro.end.async( 1639 ptr %hdl, i1 0, 1640 ptr @must_tail_call_return, 1641 ptr %ctxt, ptr %task, ptr %actor) 1642 unreachable 1643 1644.. _coro.suspend: 1645.. 
_suspend points: 1646 1647'llvm.coro.suspend' Intrinsic 1648^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1649:: 1650 1651 declare i8 @llvm.coro.suspend(token <save>, i1 <final>) 1652 1653Overview: 1654""""""""" 1655 1656The '``llvm.coro.suspend``' marks the point where execution of a 1657switched-resume coroutine is suspended and control is returned back 1658to the caller. Conditional branches consuming the result of this 1659intrinsic lead to basic blocks where coroutine should proceed when 1660suspended (-1), resumed (0) or destroyed (1). 1661 1662Arguments: 1663"""""""""" 1664 1665The first argument refers to a token of `coro.save` intrinsic that marks the 1666point when coroutine state is prepared for suspension. If `none` token is passed, 1667the intrinsic behaves as if there were a `coro.save` immediately preceding 1668the `coro.suspend` intrinsic. 1669 1670The second argument indicates whether this suspension point is `final`_. 1671The second argument only accepts constants. If more than one suspend point is 1672designated as final, the resume and destroy branches should lead to the same 1673basic blocks. 1674 1675Example (normal suspend point): 1676""""""""""""""""""""""""""""""" 1677 1678.. code-block:: llvm 1679 1680 %0 = call i8 @llvm.coro.suspend(token none, i1 false) 1681 switch i8 %0, label %suspend [i8 0, label %resume 1682 i8 1, label %cleanup] 1683 1684Example (final suspend point): 1685"""""""""""""""""""""""""""""" 1686 1687.. code-block:: llvm 1688 1689 while.end: 1690 %s.final = call i8 @llvm.coro.suspend(token none, i1 true) 1691 switch i8 %s.final, label %suspend [i8 0, label %trap 1692 i8 1, label %cleanup] 1693 trap: 1694 call void @llvm.trap() 1695 unreachable 1696 1697Semantics: 1698"""""""""" 1699 1700If a coroutine that was suspended at the suspend point marked by this intrinsic 1701is resumed via `coro.resume`_ the control will transfer to the basic block 1702of the 0-case. 
If it is destroyed via `coro.destroy`_, it will proceed to the
basic block indicated by the 1-case. To suspend, the coroutine proceeds to the
default label.

If the suspend intrinsic is marked as final, the optimizer can consider the
resume branch (the 0-case) unreachable and can perform optimizations that take
advantage of that fact.

.. _coro.save:

'llvm.coro.save' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

  declare token @llvm.coro.save(ptr <handle>)

Overview:
"""""""""

The '``llvm.coro.save``' intrinsic marks the point where a coroutine needs to update its
state so that it can be considered suspended (and thus eligible
for resumption). It is illegal to merge two '``llvm.coro.save``' calls unless their
'``llvm.coro.suspend``' users are also merged. So '``llvm.coro.save``' is currently
tagged with the `no_merge` function attribute.

Arguments:
""""""""""

The first argument points to a coroutine handle of the enclosing coroutine.

Semantics:
""""""""""

Whatever coroutine state changes are required to enable resumption of
the coroutine from the corresponding suspend point should be done at the point
of the `coro.save` intrinsic.

Example:
""""""""

Separate save and suspend points are necessary when a coroutine is used to
represent an asynchronous control flow driven by callbacks representing
completions of asynchronous operations.

In such a case, a coroutine should be ready for resumption prior to a call to
the `async_op` function, which may trigger resumption of the coroutine from the same or
a different thread, possibly before the `async_op` call returns control
to the coroutine:

..
code-block:: llvm

  %save1 = call token @llvm.coro.save(ptr %hdl)
  call void @async_op1(ptr %hdl)
  %suspend1 = call i8 @llvm.coro.suspend(token %save1, i1 false)
  switch i8 %suspend1, label %suspend [i8 0, label %resume1
                                       i8 1, label %cleanup]

.. _coro.suspend.async:

'llvm.coro.suspend.async' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

  declare {ptr, ptr, ptr} @llvm.coro.suspend.async(
                             ptr <resume function>,
                             ptr <context projection function>,
                             ... <function to call>
                             ... <arguments to function>)

Overview:
"""""""""

The '``llvm.coro.suspend.async``' intrinsic marks the point where
execution of an async coroutine is suspended and control is passed to a callee.

Arguments:
""""""""""

The first argument should be the result of the `llvm.coro.async.resume` intrinsic.
Lowering will replace this intrinsic with the resume function for this suspend
point.

The second argument is the `context projection function`. It should describe
how to restore the `async context` in the continuation function from the first
argument of the continuation function. Its type is `ptr (ptr)`.

The third argument is the function that models transfer to the callee at the
suspend point. It should take 3 arguments. Lowering will `musttail` call this
function.

The fourth to sixth arguments are the arguments for the third argument.

Semantics:
""""""""""

The results of the intrinsic are mapped to the arguments of the resume function.
Execution is suspended at this intrinsic and resumed when the resume function is
called.

..
_coro.prepare.async: 1801 1802'llvm.coro.prepare.async' Intrinsic 1803^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1804:: 1805 1806 declare ptr @llvm.coro.prepare.async(ptr <coroutine function>) 1807 1808Overview: 1809""""""""" 1810 1811The '``llvm.coro.prepare.async``' intrinsic is used to block inlining of the 1812async coroutine until after coroutine splitting. 1813 1814Arguments: 1815"""""""""" 1816 1817The first argument should be an async coroutine of type `void (ptr, ptr, ptr)`. 1818Lowering will replace this intrinsic with its coroutine function argument. 1819 1820.. _coro.suspend.retcon: 1821 1822'llvm.coro.suspend.retcon' Intrinsic 1823^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1824:: 1825 1826 declare i1 @llvm.coro.suspend.retcon(...) 1827 1828Overview: 1829""""""""" 1830 1831The '``llvm.coro.suspend.retcon``' intrinsic marks the point where 1832execution of a returned-continuation coroutine is suspended and control 1833is returned back to the caller. 1834 1835`llvm.coro.suspend.retcon`` does not support separate save points; 1836they are not useful when the continuation function is not locally 1837accessible. That would be a more appropriate feature for a ``passcon`` 1838lowering that is not yet implemented. 1839 1840Arguments: 1841"""""""""" 1842 1843The types of the arguments must exactly match the yielded-types sequence 1844of the coroutine. They will be turned into return values from the ramp 1845and continuation functions, along with the next continuation function. 1846 1847Semantics: 1848"""""""""" 1849 1850The result of the intrinsic indicates whether the coroutine should resume 1851abnormally (non-zero). 1852 1853In a normal coroutine, it is undefined behavior if the coroutine executes 1854a call to ``llvm.coro.suspend.retcon`` after resuming abnormally. 1855 1856In a yield-once coroutine, it is undefined behavior if the coroutine 1857executes a call to ``llvm.coro.suspend.retcon`` after resuming in any way. 1858 1859.. 
_coro.await.suspend.void: 1860 1861'llvm.coro.await.suspend.void' Intrinsic 1862^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1863:: 1864 1865 declare void @llvm.coro.await.suspend.void( 1866 ptr <awaiter>, 1867 ptr <handle>, 1868 ptr <await_suspend_function>) 1869 1870Overview: 1871""""""""" 1872 1873The '``llvm.coro.await.suspend.void``' intrinsic encapsulates C++ 1874`await-suspend` block until it can't interfere with coroutine transform. 1875 1876The `await_suspend` block of `co_await` is essentially asynchronous 1877to the execution of the coroutine. Inlining it normally into an unsplit 1878coroutine can cause miscompilation because the coroutine CFG misrepresents 1879the true control flow of the program: things that happen in the 1880await_suspend are not guaranteed to happen prior to the resumption of the 1881coroutine, and things that happen after the resumption of the coroutine 1882(including its exit and the potential deallocation of the coroutine frame) 1883are not guaranteed to happen only after the end of `await_suspend`. 1884 1885This version of intrinsic corresponds to 1886'``void awaiter.await_suspend(...)``' variant. 1887 1888Arguments: 1889"""""""""" 1890 1891The first argument is a pointer to `awaiter` object. 1892 1893The second argument is a pointer to the current coroutine's frame. 1894 1895The third argument is a pointer to the wrapper function encapsulating 1896`await-suspend` logic. Its signature must be 1897 1898.. code-block:: llvm 1899 1900 declare void @await_suspend_function(ptr %awaiter, ptr %hdl) 1901 1902Semantics: 1903"""""""""" 1904 1905The intrinsic must be used between corresponding `coro.save`_ and 1906`coro.suspend`_ calls. It is lowered to a direct 1907`await_suspend_function` call during `CoroSplit`_ pass. 1908 1909Example: 1910"""""""" 1911 1912.. 
code-block:: llvm 1913 1914 ; before lowering 1915 await.suspend: 1916 %save = call token @llvm.coro.save(ptr %hdl) 1917 call void @llvm.coro.await.suspend.void( 1918 ptr %awaiter, 1919 ptr %hdl, 1920 ptr @await_suspend_function) 1921 %suspend = call i8 @llvm.coro.suspend(token %save, i1 false) 1922 ... 1923 1924 ; after lowering 1925 await.suspend: 1926 %save = call token @llvm.coro.save(ptr %hdl) 1927 ; the call to await_suspend_function can be inlined 1928 call void @await_suspend_function( 1929 ptr %awaiter, 1930 ptr %hdl) 1931 %suspend = call i8 @llvm.coro.suspend(token %save, i1 false) 1932 ... 1933 1934 ; wrapper function example 1935 define void @await_suspend_function(ptr %awaiter, ptr %hdl) 1936 entry: 1937 %hdl.arg = ... ; construct std::coroutine_handle from %hdl 1938 call void @"Awaiter::await_suspend"(ptr %awaiter, ptr %hdl.arg) 1939 ret void 1940 1941.. _coro.await.suspend.bool: 1942 1943'llvm.coro.await.suspend.bool' Intrinsic 1944^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1945:: 1946 1947 declare i1 @llvm.coro.await.suspend.bool( 1948 ptr <awaiter>, 1949 ptr <handle>, 1950 ptr <await_suspend_function>) 1951 1952Overview: 1953""""""""" 1954 1955The '``llvm.coro.await.suspend.bool``' intrinsic encapsulates C++ 1956`await-suspend` block until it can't interfere with coroutine transform. 1957 1958The `await_suspend` block of `co_await` is essentially asynchronous 1959to the execution of the coroutine. Inlining it normally into an unsplit 1960coroutine can cause miscompilation because the coroutine CFG misrepresents 1961the true control flow of the program: things that happen in the 1962await_suspend are not guaranteed to happen prior to the resumption of the 1963coroutine, and things that happen after the resumption of the coroutine 1964(including its exit and the potential deallocation of the coroutine frame) 1965are not guaranteed to happen only after the end of `await_suspend`. 

This version of the intrinsic corresponds to the
'``bool awaiter.await_suspend(...)``' variant.

Arguments:
""""""""""

The first argument is a pointer to the `awaiter` object.

The second argument is a pointer to the current coroutine's frame.

The third argument is a pointer to the wrapper function encapsulating the
`await-suspend` logic. Its signature must be:

.. code-block:: llvm

    declare i1 @await_suspend_function(ptr %awaiter, ptr %hdl)

Semantics:
""""""""""

The intrinsic must be used between the corresponding `coro.save`_ and
`coro.suspend`_ calls. It is lowered to a direct
`await_suspend_function` call during the `CoroSplit`_ pass.

If the `await_suspend_function` call returns `false`, the current coroutine is
immediately resumed. If it returns `true`, the coroutine proceeds to the
suspend point, as the branch in the example below shows.

Example:
""""""""

.. code-block:: llvm

  ; before lowering
  await.suspend:
    %save = call token @llvm.coro.save(ptr %hdl)
    %resume = call i1 @llvm.coro.await.suspend.bool(
                ptr %awaiter,
                ptr %hdl,
                ptr @await_suspend_function)
    br i1 %resume, label %await.suspend.bool, label %await.ready
  await.suspend.bool:
    %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
    ...
  await.ready:
    call void @"Awaiter::await_resume"(ptr %awaiter)
    ...

  ; after lowering
  await.suspend:
    %save = call token @llvm.coro.save(ptr %hdl)
    ; the call to await_suspend_function can be inlined
    %resume = call i1 @await_suspend_function(
                ptr %awaiter,
                ptr %hdl)
    br i1 %resume, label %await.suspend.bool, label %await.ready
    ...

  ; wrapper function example
  define i1 @await_suspend_function(ptr %awaiter, ptr %hdl)
    entry:
      %hdl.arg = ... ; construct std::coroutine_handle from %hdl
      %resume = call i1 @"Awaiter::await_suspend"(ptr %awaiter, ptr %hdl.arg)
      ret i1 %resume

.. _coro.await.suspend.handle:

'llvm.coro.await.suspend.handle' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

  declare void @llvm.coro.await.suspend.handle(
                ptr <awaiter>,
                ptr <handle>,
                ptr <await_suspend_function>)

Overview:
"""""""""

The '``llvm.coro.await.suspend.handle``' intrinsic encapsulates the C++
`await-suspend` block until it can no longer interfere with the coroutine
transform.

The `await_suspend` block of `co_await` is essentially asynchronous
to the execution of the coroutine. Inlining it normally into an unsplit
coroutine can cause miscompilation because the coroutine CFG misrepresents
the true control flow of the program: things that happen in
`await_suspend` are not guaranteed to happen prior to the resumption of the
coroutine, and things that happen after the resumption of the coroutine
(including its exit and the potential deallocation of the coroutine frame)
are not guaranteed to happen only after the end of `await_suspend`.

This version of the intrinsic corresponds to the
'``std::coroutine_handle<> awaiter.await_suspend(...)``' variant.

Arguments:
""""""""""

The first argument is a pointer to the `awaiter` object.

The second argument is a pointer to the current coroutine's frame.

The third argument is a pointer to the wrapper function encapsulating the
`await-suspend` logic. Its signature must be:

.. code-block:: llvm

    declare ptr @await_suspend_function(ptr %awaiter, ptr %hdl)

Semantics:
""""""""""

The intrinsic must be used between the corresponding `coro.save`_ and
`coro.suspend`_ calls. It is lowered to a direct
`await_suspend_function` call during the `CoroSplit`_ pass.

`await_suspend_function` must return a pointer to a valid
coroutine frame. The intrinsic will be lowered to a tail call resuming the
returned coroutine frame. It will be marked `musttail` on targets that support
it. Instructions following the intrinsic will become unreachable.

Example:
""""""""

.. code-block:: llvm

  ; before lowering
  await.suspend:
    %save = call token @llvm.coro.save(ptr %hdl)
    call void @llvm.coro.await.suspend.handle(
        ptr %awaiter,
        ptr %hdl,
        ptr @await_suspend_function)
    %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
    ...

  ; after lowering
  await.suspend:
    %save = call token @llvm.coro.save(ptr %hdl)
    ; the call to await_suspend_function can be inlined
    %next = call ptr @await_suspend_function(
                ptr %awaiter,
                ptr %hdl)
    musttail call void @llvm.coro.resume(ptr %next)
    ret void
    ...

  ; wrapper function example
  define ptr @await_suspend_function(ptr %awaiter, ptr %hdl)
    entry:
      %hdl.arg = ... ; construct std::coroutine_handle from %hdl
      %hdl.raw = call ptr @"Awaiter::await_suspend"(ptr %awaiter, ptr %hdl.arg)
      %hdl.result = ... ; get address of returned coroutine handle
      ret ptr %hdl.result

Coroutine Transformation Passes
===============================
CoroEarly
---------
The pass CoroEarly lowers coroutine intrinsics that hide the details of the
structure of the coroutine frame but otherwise do not need to be preserved to
help later coroutine passes. This pass lowers the `coro.frame`_, `coro.done`_,
and `coro.promise`_ intrinsics.

.. _CoroSplit:

CoroSplit
---------
The pass CoroSplit builds the coroutine frame and outlines the resume and
destroy parts into separate functions. This pass also lowers the
`coro.await.suspend.void`_, `coro.await.suspend.bool`_ and
`coro.await.suspend.handle`_ intrinsics.

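
As a rough illustration of the outlining performed by CoroSplit, the following
hand-written sketch shows the general shape that the outlined parts of a
switched-resume coroutine `f` might take. The frame layout ``%f.Frame``, the
helper ``@print``, and the function bodies are assumptions made for this
example; the actual output depends on the coroutine and on the LLVM version.

.. code-block:: llvm

  ; Hypothetical frame: resume pointer, destroy pointer, one spilled value.
  %f.Frame = type { ptr, ptr, i32 }

  declare void @print(i32)   ; illustrative helper, not part of LLVM
  declare void @free(ptr)

  ; Outlined resume part: reloads state from the frame instead of the stack.
  define internal fastcc void @f.resume(ptr %frame) {
  entry:
    %n.addr = getelementptr inbounds %f.Frame, ptr %frame, i32 0, i32 2
    %n = load i32, ptr %n.addr
    %inc = add i32 %n, 1
    store i32 %inc, ptr %n.addr
    call void @print(i32 %inc)
    ret void
  }

  ; Outlined destroy part: releases the frame.
  define internal fastcc void @f.destroy(ptr %frame) {
  entry:
    call void @free(ptr %frame)
    ret void
  }

In this sketch, the ramp function (not shown) would store the addresses of
``@f.resume`` and ``@f.destroy`` into the first two frame slots, which is how
``llvm.coro.resume`` and ``llvm.coro.destroy`` would reach them.
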

CoroAnnotationElide
-------------------
This pass finds all usages of coroutines that are "must elide" and replaces
the `coro.begin` intrinsic with the address of a coroutine frame placed in its
caller, and replaces the `coro.alloc` and `coro.free` intrinsics with `false`
and `null` respectively to remove the deallocation code.

CoroElide
---------
The pass CoroElide examines whether the inlined coroutine is eligible for the
heap allocation elision optimization. If so, it replaces the
`coro.begin` intrinsic with the address of a coroutine frame placed in its
caller, and replaces the `coro.alloc` and `coro.free` intrinsics with `false`
and `null` respectively to remove the deallocation code.
This pass also replaces `coro.resume` and `coro.destroy` intrinsics with direct
calls to resume and destroy functions for a particular coroutine where possible.

CoroCleanup
-----------
This pass runs late to lower all coroutine related intrinsics not replaced by
earlier passes.

Attributes
==========

coro_only_destroy_when_complete
-------------------------------

When a coroutine is marked with `coro_only_destroy_when_complete`, it indicates
that the coroutine must have reached the final suspend point by the time it is
destroyed.

This attribute currently only works for switched-resume coroutines.

coro_elide_safe
---------------

When a Call or Invoke instruction to a switch ABI coroutine `f` is marked with
`coro_elide_safe`, CoroSplitPass generates a `f.noalloc` ramp function.
`f.noalloc` has one more argument than its original ramp function `f`, which is
the pointer to the allocated frame. `f.noalloc` also suppresses any allocations
or deallocations that may be guarded by `@llvm.coro.alloc` and `@llvm.coro.free`.

CoroAnnotationElidePass performs the heap elision when possible. Note that for
recursive or mutually recursive functions this elision is usually not possible.

Metadata
========

'``coro.outside.frame``' Metadata
---------------------------------

``coro.outside.frame`` metadata may be attached to an alloca instruction
to signify that it shouldn't be promoted to the coroutine frame. This is useful
for the frontend to filter out allocas when emitting internal control mechanisms.
Additionally, this metadata is only used as a flag, so the associated
node must be empty.

.. code-block:: text

  %__coro_gro = alloca %struct.GroType, align 1, !coro.outside.frame !0

  ...
  !0 = !{}

Areas Requiring Attention
=========================
#. When coro.suspend returns -1, the coroutine is suspended, and it's possible
   that the coroutine has already been destroyed (hence the frame has been freed).
   We cannot access anything on the frame on the suspend path.
   However, there is nothing that prevents the compiler from moving instructions
   along that path (e.g. LICM), which can lead to use-after-free. At the moment,
   LICM is disabled for loops that have coro.suspend, but the general problem
   still exists and requires a general solution.

#. Take advantage of the lifetime intrinsics for the data that goes into the
   coroutine frame. Leave lifetime intrinsics as is for the data that stays in
   allocas.

#. The CoroElide optimization pass relies on the coroutine ramp function being
   inlined. It would be beneficial to split the ramp function further to
   increase the chance that it will get inlined into its caller.

#. Design a convention that would make it possible to apply coroutine heap
   elision optimization across ABI boundaries.

#. Cannot handle coroutines with `inalloca` parameters (used in x86 on Windows).

#. Alignment is ignored by coro.begin and coro.free intrinsics.

#. Make required changes to make sure that coroutine optimizations work with
   LTO.

#. More tests, more tests, more tests