# Scheduler {#scheduler}

SPDK's event/application framework (`lib/event`) now supports scheduling of
lightweight threads. Schedulers are provided as plugins, called
implementations. A default implementation is provided, but users may wish to
write their own scheduler to integrate into broader code frameworks or meet
their performance needs.

This feature should be considered experimental and is disabled by default. When
enabled, the scheduler framework gathers data for each spdk thread and reactor
and passes it to a scheduler implementation to perform one of the following
actions.

## Actions

### Move a thread

`spdk_thread`s can be moved to another reactor. Schedulers can examine the
suggested cpu_mask value for each lightweight thread to see if the user has
requested specific reactors, or choose a reactor using whatever algorithm they
deem fit.

### Switch reactor mode

Reactors by default run in a mode that constantly polls for new actions for the
most efficient processing. Schedulers can switch a reactor into a mode that
instead waits for an event on a file descriptor. On Linux, this is implemented
using epoll. This results in reduced CPU usage but may be less responsive when
events occur.
A reactor cannot enter this mode if any `spdk_threads` are
currently scheduled to it. This limitation is expected to be lifted in the
future, allowing `spdk_threads` to enter interrupt mode.

### Set frequency of CPU core

The frequency of CPU cores can be modified by the scheduler in response to
load. Only CPU cores that match the application cpu_mask may be modified. The
mechanism for controlling CPU frequency is pluggable and the default provided
implementation is called `dpdk_governor`, based on the `rte_power` library from
DPDK.

#### Known limitation

When SMT (Hyperthreading) is enabled, the two logical CPU cores sharing a single
physical CPU core must run at the same frequency. If one of the two logical
CPU cores is outside the application cpu_mask, the policy and frequency on that
core have to be managed by the administrator.

## Scheduler implementations

The scheduler in use may be controlled by JSON-RPC. Please use the
[framework_set_scheduler](jsonrpc.html#rpc_framework_set_scheduler) RPC to
switch between schedulers or change their options. Currently only the dynamic
scheduler supports changing its parameters.

[spdk_top](spdk_top.html#spdk_top) is a useful tool to observe the behavior of
schedulers in different scenarios and workloads.
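
As a rough illustration, a request to switch to the `dynamic` scheduler could be
built as a JSON-RPC 2.0 message like the one below. This is a sketch, not SPDK's
client code: the helper `build_set_scheduler_request` is hypothetical, and the
parameter keys (`name`, `load_limit`, `core_limit`, `core_busy`) and their values
are assumptions derived from the parameter names used in this document. Check the
linked RPC reference for the exact schema.

```python
import json


def build_set_scheduler_request(name, request_id=1, **options):
    """Build a JSON-RPC 2.0 request dict for framework_set_scheduler.

    Hypothetical helper: only options that were actually supplied
    (not None) are copied into "params".
    """
    params = {"name": name}
    params.update({k: v for k, v in options.items() if v is not None})
    return {
        "jsonrpc": "2.0",
        "method": "framework_set_scheduler",
        "id": request_id,
        "params": params,
    }


# Arbitrary example values; the option names are assumptions.
request = build_set_scheduler_request(
    "dynamic", load_limit=20, core_limit=80, core_busy=95
)
print(json.dumps(request, indent=2))
```

The resulting message would still have to be delivered to the application's
JSON-RPC endpoint; that transport step is left out of the sketch.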

### static [default]

The `static` scheduler is the default scheduler and does no dynamic scheduling.
Lightweight threads are distributed round-robin among reactors, respecting
their requested cpu_mask, only at application startup, and then they are never
moved. This is equivalent to the previous behavior of the SPDK event/application
framework.

The `static` scheduler cannot be re-enabled after a different scheduler has been
selected, because currently there is no way to save the original SPDK thread
distribution configuration.

### dynamic

The `dynamic` scheduler is designed for power saving and reduction of CPU
utilization, especially in cases where workloads show large variations over
time. In SPDK, thread and core workloads are measured in CPU ticks. Those
values are then compared with all the ticks since the last check, which allows
`busy time` to be calculated.

`busy time = busy ticks / (busy ticks + idle ticks) * 100 %`

A thread is considered active if its busy time is over the `load limit`
parameter.

Active threads are distributed equally among reactors, taking cpu_mask into
account. All idle threads are moved to the main core. Once an idle thread becomes
active, it is redistributed again.
The dynamic scheduler monitors core workloads and
redistributes SPDK threads across cores so that none of them is over `core limit`.
If a core's utilization surpasses this threshold, the scheduler should move threads
off it until this condition no longer applies. A core may also be in an overloaded
state, which indicates that moving threads off this core will not bring its
utilization under the `core limit`, and that its threads are unable to process all
the I/O they are capable of because they share CPU ticks with other threads. The
threshold used to decide whether a core is overloaded is called `core busy`. Note
that threads residing on an overloaded core will not perform as well as other
threads, because the CPU ticks intended for them are limited by other threads on
the same core.

When a reactor has no scheduled `spdk_thread`s it is switched into interrupt
mode and stops actively polling. After enough threads become active, the
reactor is switched back into poll mode and threads are assigned to it again.

The main core can contain active threads only when their execution time does
not exceed the sum of all idle threads. When no active threads are present on
the main core, the frequency of that CPU core will decrease as the load
decreases. All CPU cores corresponding to the other reactors remain at maximum
frequency.
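
The busy time formula and the thresholds described in this section can be
sketched in a few lines of Python. This only illustrates the arithmetic and is
not the scheduler's actual code: the function names and the threshold values in
the usage lines are made up for the example, and the way a core's busy time is
compared against `core limit` and `core busy` is an assumption.

```python
def busy_time_percent(busy_ticks, idle_ticks):
    """busy time = busy ticks / (busy ticks + idle ticks) * 100 %."""
    total = busy_ticks + idle_ticks
    if total == 0:
        return 0.0  # no ticks elapsed since the last check
    return 100.0 * busy_ticks / total


def is_active(busy_ticks, idle_ticks, load_limit):
    """A thread counts as active when its busy time is over load_limit."""
    return busy_time_percent(busy_ticks, idle_ticks) > load_limit


def core_state(busy_ticks, idle_ticks, core_limit, core_busy):
    """Classify a core; assumes core_limit <= core_busy (both in percent)."""
    busy = busy_time_percent(busy_ticks, idle_ticks)
    if busy > core_busy:
        return "overloaded"
    if busy > core_limit:
        return "over limit"
    return "ok"


# Hypothetical tick counts and thresholds, for illustration only.
print(busy_time_percent(250, 750))       # 25.0
print(is_active(250, 750, load_limit=20))
print(core_state(900, 100, core_limit=80, core_busy=95))
```

Multiplying before dividing keeps the percentage exact for round tick counts,
which makes the threshold comparisons easier to reason about in examples.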

The dynamic scheduler is currently the only one that allows manual setting of
its parameters.

Current values of scheduler parameters can be displayed by using
[framework_get_scheduler](jsonrpc.html#rpc_framework_get_scheduler) RPC.