# Scheduler {#scheduler}

SPDK's event/application framework (`lib/event`) now supports scheduling of
lightweight threads. Schedulers are provided as plugins, called
implementations. A default implementation is provided, but users may wish to
write their own scheduler to integrate into broader code frameworks or meet
their performance needs.

This feature should be considered experimental and is disabled by default. When
enabled, the scheduler framework gathers data for each spdk thread and reactor
and passes it to a scheduler implementation to perform one of the following
actions.

## Actions

### Move a thread

`spdk_thread`s can be moved to another reactor. Schedulers can examine the
suggested cpu_mask value for each lightweight thread to see if the user has
requested specific reactors, or choose a reactor using whatever algorithm they
deem fit.
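
As a minimal sketch of the kind of placement decision a scheduler might make, the Python snippet below picks the least-loaded reactor among those permitted by a thread's cpu_mask. The function name `pick_reactor`, the integer bitmask representation, and the per-core load dictionary are hypothetical and exist only for illustration; they are not part of the SPDK API.

```python
from typing import Dict, Optional


def pick_reactor(cpu_mask: int, reactor_load: Dict[int, float]) -> Optional[int]:
    """Return the least-loaded core allowed by cpu_mask, or None if none match.

    cpu_mask is treated as a bitmask where bit N set means core N is allowed
    (an assumption made for this sketch).
    """
    allowed = [core for core in reactor_load if cpu_mask & (1 << core)]
    if not allowed:
        return None
    # Break ties on the lower core id so the result is deterministic.
    return min(allowed, key=lambda core: (reactor_load[core], core))


# Cores 1 and 2 are allowed; core 2 carries less load.
chosen = pick_reactor(0b0110, {0: 0.1, 1: 0.8, 2: 0.3, 3: 0.0})
print(chosen)
```

A real scheduler is free to use any other policy; least-loaded selection is just one simple choice.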

### Switch reactor mode

Reactors by default run in a mode that constantly polls for new actions for the
most efficient processing. Schedulers can switch a reactor into a mode that
instead waits for an event on a file descriptor. On Linux, this is implemented
using epoll. This results in reduced CPU usage but may be less responsive when
events occur. A reactor cannot enter this mode if any `spdk_thread`s are
currently scheduled to it. This limitation is expected to be lifted in the
future, allowing `spdk_thread`s to enter interrupt mode.
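
To illustrate the difference between polling and waiting on a file descriptor, here is a small Python sketch that blocks until a pipe becomes readable. It is a conceptual illustration only, not SPDK code: `wait_for_event` is a hypothetical helper, and the pipe stands in for whatever descriptor a reactor would wait on. On Linux, Python's `selectors.DefaultSelector` is backed by epoll.

```python
import os
import selectors


def wait_for_event(read_fd: int, timeout: float) -> bytes:
    """Sleep until read_fd is readable or the timeout expires, then read it.

    Returns the bytes read, or b"" if nothing arrived before the timeout.
    """
    selector = selectors.DefaultSelector()
    selector.register(read_fd, selectors.EVENT_READ)
    try:
        ready = selector.select(timeout)
        return os.read(read_fd, 4096) if ready else b""
    finally:
        selector.close()


read_fd, write_fd = os.pipe()
os.write(write_fd, b"wake")           # the "event" that wakes the waiter
data = wait_for_event(read_fd, 1.0)   # returns without spinning on the CPU
os.close(read_fd)
os.close(write_fd)
print(data)
```

The waiting side consumes no CPU while idle, at the cost of a wake-up delay when the event arrives, which mirrors the trade-off described above.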

### Set frequency of CPU core

The frequency of CPU cores can be modified by the scheduler in response to
load. Only CPU cores that match the application cpu_mask may be modified. The
mechanism for controlling CPU frequency is pluggable and the default provided
implementation is called `dpdk_governor`, based on the `rte_power` library from
DPDK.

#### Known limitation

When SMT (Hyperthreading) is enabled, the two logical CPU cores sharing a single
physical CPU core must run at the same frequency. If one of two such logical
CPU cores is outside the application cpu_mask, the policy and frequency on that
core have to be managed by the administrator.

## Scheduler implementations

The scheduler in use may be controlled by JSON-RPC. Please use the
[framework_set_scheduler](jsonrpc.html#rpc_framework_set_scheduler) RPC to
switch between schedulers or change their options. Currently only the `dynamic`
scheduler supports changing its parameters.

[spdk_top](spdk_top.html#spdk_top) is a useful tool to observe the behavior of
schedulers in different scenarios and workloads.
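
As a rough sketch of what a `framework_set_scheduler` request might look like on the wire, the Python snippet below builds a JSON-RPC 2.0 payload. The helper `build_set_scheduler_request` is hypothetical, and the option names `load_limit`, `core_limit` and `core_busy` are assumed from the parameter names used in this document; check the JSON-RPC reference for the authoritative parameter list. The numeric values are illustrative only.

```python
import json
from typing import Any, Optional


def build_set_scheduler_request(name: str, request_id: int = 1,
                                **options: Optional[Any]) -> str:
    """Return a JSON-RPC 2.0 request string for framework_set_scheduler."""
    params = {"name": name}
    # Only include options that were actually supplied.
    params.update({key: value for key, value in options.items() if value is not None})
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "framework_set_scheduler",
        "params": params,
    })


# Option names are assumptions; values are illustrative, not recommendations.
request = build_set_scheduler_request("dynamic", load_limit=20, core_limit=80, core_busy=95)
print(request)
```

The snippet only constructs the request; sending it to a running SPDK application is left out.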

### static [default]

The `static` scheduler is the default scheduler and does no dynamic scheduling.
Lightweight threads are distributed round-robin among reactors, respecting
their requested cpu_mask, only at application startup, and then they are never
moved. This is equivalent to the previous behavior of the SPDK event/application
framework.

The `static` scheduler cannot be re-enabled after a different scheduler has been
selected, because currently there is no way to save the original SPDK thread
distribution configuration.
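
The round-robin placement described above can be sketched as follows. This is a simplified model under stated assumptions, not the framework's actual code: `distribute_round_robin` is a hypothetical name, and cpu_mask is modelled as an integer bitmask where bit N allows core N.

```python
from typing import Dict, List


def distribute_round_robin(thread_masks: Dict[str, int], cores: List[int]) -> Dict[str, int]:
    """Assign each thread to the next core, in rotation, that its mask allows."""
    placement: Dict[str, int] = {}
    cursor = 0
    for name, mask in thread_masks.items():
        for step in range(len(cores)):
            core = cores[(cursor + step) % len(cores)]
            if mask & (1 << core):
                placement[name] = core
                # Continue the rotation just past the core that was used.
                cursor = (cursor + step + 1) % len(cores)
                break
        else:
            raise ValueError(f"no core in {cores} satisfies the mask of {name}")
    return placement


threads = {"a": 0b111, "b": 0b111, "c": 0b001, "d": 0b111}
print(distribute_round_robin(threads, [0, 1, 2]))
```

Thread `c` may only run on core 0, so the rotation skips core 2 for it and then resumes from core 1.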

### dynamic

The `dynamic` scheduler is designed for power saving and reduction of CPU
utilization, especially in cases where workloads show large variations over
time. In SPDK, thread and core workloads are measured in CPU ticks. Those
values are then compared with all the ticks since the last check, which allows
`busy time` to be calculated.

`busy time = busy ticks / (busy ticks + idle ticks) * 100 %`

A thread is considered active if its busy time is over the `load limit`
parameter.
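
As a worked example of the formula, the Python sketch below computes busy time from tick counts and applies the `load limit` test. The function names are hypothetical, and treating a period with zero ticks as 0 % busy is an assumption made here to avoid division by zero.

```python
def busy_time_percent(busy_ticks: int, idle_ticks: int) -> float:
    """Busy time as a percentage of all ticks since the last check."""
    total_ticks = busy_ticks + idle_ticks
    if total_ticks == 0:
        return 0.0  # assumption: no ticks recorded means not busy
    return 100.0 * busy_ticks / total_ticks


def is_active(busy_ticks: int, idle_ticks: int, load_limit: float) -> bool:
    """A thread is active when its busy time is over the load limit."""
    return busy_time_percent(busy_ticks, idle_ticks) > load_limit


# 250 busy ticks out of 1000 total is 25 % busy.
print(busy_time_percent(250, 750))
print(is_active(250, 750, load_limit=20))
```

With an illustrative `load limit` of 20, this thread counts as active; with 750 idle and only 100 busy ticks it would not.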

Active threads are distributed equally among reactors, taking cpu_mask into
account. All idle threads are moved to the main core. Once an idle thread becomes
active, it is redistributed again. The dynamic scheduler monitors core workloads
and redistributes SPDK threads across cores so that none of them is over
`core limit`. If a core's utilization surpasses this threshold, the scheduler
should move threads off that core until this condition no longer applies. Cores
might also be in an overloaded state, which indicates that moving threads out of
this core will not decrease its utilization under the `core limit` and the
threads are unable to process all the I/O they are capable of, because they
share CPU ticks with other threads. The threshold to decide if a core is
overloaded is called `core busy`. Note that threads residing on an overloaded
core will not perform as well as other threads, because the CPU ticks intended
for them are limited by other threads on the same core.
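
The two thresholds can be read as a simple classification of each core. The sketch below is one possible interpretation, not the scheduler's actual logic: `classify_core` and its return labels are hypothetical, the strict comparisons are an assumption, and it presumes `core_busy` is set at or above `core_limit`.

```python
def classify_core(utilization: float, core_limit: float, core_busy: float) -> str:
    """Classify a core by its utilization in percent.

    "overloaded": above core_busy, threads on it are starved of CPU ticks.
    "over_limit": above core_limit, threads should be moved off this core.
    "ok":         at or below core_limit.
    """
    if utilization > core_busy:
        return "overloaded"
    if utilization > core_limit:
        return "over_limit"
    return "ok"


# Illustrative thresholds: core_limit=80, core_busy=95.
for load in (50.0, 85.0, 97.0):
    print(load, classify_core(load, core_limit=80, core_busy=95))
```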

When a reactor has no scheduled `spdk_thread`s, it is switched into interrupt
mode and stops actively polling. After enough threads become active, the
reactor is switched back into poll mode and threads are assigned to it again.

The main core can contain active threads only when their execution time does
not exceed the sum of all idle threads. When no active threads are present on
the main core, the frequency of that CPU core will decrease as the load
decreases. All CPU cores corresponding to the other reactors remain at maximum
frequency.

The dynamic scheduler is currently the only one that allows manual setting of
its parameters.

Current values of scheduler parameters can be displayed using the
[framework_get_scheduler](jsonrpc.html#rpc_framework_get_scheduler) RPC.