1.. SPDX-License-Identifier: BSD-3-Clause 2 Copyright(c) 2019 Arm Limited. 3 4.. _RCU_Library: 5 6RCU Library 7============ 8 9Lockless data structures provide scalability and determinism. 10They enable use cases where locking may not be allowed 11(for example real-time applications). 12 13In the following sections, the term "memory" refers to memory allocated 14by typical APIs like malloc() or anything that is representative of 15memory, for example an index of a free element array. 16 17Since these data structures are lockless, the writers and readers 18are accessing the data structures concurrently. Hence, while removing 19an element from a data structure, the writers cannot return the memory 20to the allocator, without knowing that the readers are not 21referencing that element/memory anymore. Hence, it is required to 22separate the operation of removing an element into two steps: 23 24#. Delete: in this step, the writer removes the reference to the element from 25 the data structure but does not return the associated memory to the 26 allocator. This will ensure that new readers will not get a reference to 27 the removed element. Removing the reference is an atomic operation. 28 29#. Free (Reclaim): in this step, the writer returns the memory to the 30 memory allocator only after knowing that all the readers have stopped 31 referencing the deleted element. 32 33This library helps the writer determine when it is safe to free the 34memory by making use of thread Quiescent State (QS). 35 36What is Quiescent State 37----------------------- 38 39Quiescent State can be defined as "any point in the thread execution where the 40thread does not hold a reference to shared memory". It is up to the application 41to determine its quiescent state. 42 43Let us consider the following diagram: 44 45.. _figure_quiescent_state: 46 47.. figure:: img/rcu_general_info.* 48 49 Phases in the Quiescent State model. 50 51 52As shown in :numref:`figure_quiescent_state`, reader thread 1 accesses data 53structures D1 and D2. When it is accessing D1, if the writer has to remove an 54element from D1, the writer cannot free the memory associated with that 55element immediately. The writer can return the memory to the allocator only 56after the reader stops referencing D1. In other words, reader thread RT1 has 57to enter a quiescent state. 58 59Similarly, since reader thread 2 is also accessing D1, the writer has to 60wait till thread 2 enters quiescent state as well. 61 62However, the writer does not need to wait for reader thread 3 to enter 63quiescent state. Reader thread 3 was not accessing D1 when the delete 64operation happened. So, reader thread 1 will not have a reference to the 65deleted entry. 66 67It can be noted that, the critical sections for D2 is a quiescent state 68for D1. i.e. for a given data structure Dx, any point in the thread execution 69that does not reference Dx is a quiescent state. 70 71Since memory is not freed immediately, there might be a need for 72provisioning of additional memory, depending on the application requirements. 73 74Factors affecting the RCU mechanism 75----------------------------------- 76 77It is important to make sure that this library keeps the overhead of 78identifying the end of grace period and subsequent freeing of memory, 79to a minimum. The following explains how grace period and critical 80section affect this overhead. 81 82The writer has to poll the readers to identify the end of grace period. 83Polling introduces memory accesses and wastes CPU cycles. The memory 84is not available for reuse during the grace period. Longer grace periods 85exasperate these conditions. 86 87The length of the critical section and the number of reader threads 88is proportional to the duration of the grace period. Keeping the critical 89sections smaller will keep the grace period smaller. However, keeping the 90critical sections smaller requires additional CPU cycles (due to additional 91reporting) in the readers. 92 93Hence, we need the characteristics of a small grace period and large critical 94section. This library addresses this by allowing the writer to do 95other work without having to block until the readers report their quiescent 96state. 97 98RCU in DPDK 99----------- 100 101For DPDK applications, the start and end of a ``while(1)`` loop (where no 102references to shared data structures are kept) act as perfect quiescent 103states. This will combine all the shared data structure accesses into a 104single, large critical section which helps keep the overhead on the 105reader side to a minimum. 106 107DPDK supports a pipeline model of packet processing and service cores. 108In these use cases, a given data structure may not be used by all the 109workers in the application. The writer does not have to wait for all 110the workers to report their quiescent state. To provide the required 111flexibility, this library has a concept of a QS variable. The application 112can create one QS variable per data structure to help it track the 113end of grace period for each data structure. This helps keep the grace 114period to a minimum. 115 116How to use this library 117----------------------- 118 119The application must allocate memory and initialize a QS variable. 120 121Applications can call ``rte_rcu_qsbr_get_memsize()`` to calculate the size 122of memory to allocate. This API takes a maximum number of reader threads, 123using this variable, as a parameter. Currently, a maximum of 1024 threads 124are supported. 125 126Further, the application can initialize a QS variable using the API 127``rte_rcu_qsbr_init()``. 128 129Each reader thread is assumed to have a unique thread ID. Currently, the 130management of the thread ID (for example allocation/free) is left to the 131application. The thread ID should be in the range of 0 to 132maximum number of threads provided while creating the QS variable. 133The application could also use ``lcore_id`` as the thread ID where applicable. 134 135The ``rte_rcu_qsbr_thread_register()`` API will register a reader thread 136to report its quiescent state. This can be called from a reader thread. 137A control plane thread can also call this on behalf of a reader thread. 138The reader thread must call ``rte_rcu_qsbr_thread_online()`` API to start 139reporting its quiescent state. 140 141Some of the use cases might require the reader threads to make blocking API 142calls (for example while using eventdev APIs). The writer thread should not 143wait for such reader threads to enter quiescent state. The reader thread must 144call ``rte_rcu_qsbr_thread_offline()`` API, before calling blocking APIs. It 145can call ``rte_rcu_qsbr_thread_online()`` API once the blocking API call 146returns. 147 148The writer thread can trigger the reader threads to report their quiescent 149state by calling the API ``rte_rcu_qsbr_start()``. It is possible for multiple 150writer threads to query the quiescent state status simultaneously. Hence, 151``rte_rcu_qsbr_start()`` returns a token to each caller. 152 153The writer thread must call ``rte_rcu_qsbr_check()`` API with the token to 154get the current quiescent state status. Option to block till all the reader 155threads enter the quiescent state is provided. If this API indicates that 156all the reader threads have entered the quiescent state, the application 157can free the deleted entry. 158 159The APIs ``rte_rcu_qsbr_start()`` and ``rte_rcu_qsbr_check()`` are lock free. 160Hence, they can be called concurrently from multiple writers even while 161running as worker threads. 162 163The separation of triggering the reporting from querying the status provides 164the writer threads flexibility to do useful work instead of blocking for the 165reader threads to enter the quiescent state or go offline. This reduces the 166memory accesses due to continuous polling for the status. 167 168The ``rte_rcu_qsbr_synchronize()`` API combines the functionality of 169``rte_rcu_qsbr_start()`` and blocking ``rte_rcu_qsbr_check()`` into a single 170API. This API triggers the reader threads to report their quiescent state and 171polls till all the readers enter the quiescent state or go offline. This API 172does not allow the writer to do useful work while waiting and introduces 173additional memory accesses due to continuous polling. 174 175The reader thread must call ``rte_rcu_qsbr_thread_offline()`` and 176``rte_rcu_qsbr_thread_unregister()`` APIs to remove itself from reporting its 177quiescent state. The ``rte_rcu_qsbr_check()`` API will not wait for this reader 178thread to report the quiescent state status anymore. 179 180The reader threads should call ``rte_rcu_qsbr_quiescent()`` API to indicate that 181they entered a quiescent state. This API checks if a writer has triggered a 182quiescent state query and update the state accordingly. 183 184The ``rte_rcu_qsbr_lock()`` and ``rte_rcu_qsbr_unlock()`` are empty functions. 185However, when ``CONFIG_RTE_LIBRTE_RCU_DEBUG`` is enabled, these APIs aid 186in debugging issues. One can mark the access to shared data structures on the 187reader side using these APIs. The ``rte_rcu_qsbr_quiescent()`` will check if 188all the locks are unlocked. 189