..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2023 Ericsson AB.

Dispatcher Library
==================

Overview
--------

The purpose of the dispatcher is to help reduce coupling in an
:doc:`Eventdev <eventdev>`-based DPDK application.

In particular, the dispatcher addresses a scenario where an
application's modules share the same event device and event device
ports, and perform work on the same lcore threads.

The dispatcher replaces the conditional logic that follows an event
device dequeue operation, where events are dispatched to different
parts of the application, typically based on fields in the
``rte_event``, such as the ``queue_id``, ``sub_event_type``, or
``sched_type``.

Below is an excerpt from a fictitious application consisting of two
modules; A and B. In this example, event-to-module routing is based
purely on queue id, where module A expects all events on a certain
queue id, and module B on two other queue ids.

.. note::

   Event routing may reasonably be done based on other ``rte_event``
   fields (or even event user data). Indeed, that's the very reason to
   have match callback functions, instead of a simple queue
   id-to-handler mapping scheme. Queue id-based routing serves well in
   a simple example.

.. code-block:: c

    for (;;) {
            struct rte_event events[MAX_BURST];
            unsigned int i;
            unsigned int n;

            n = rte_event_dequeue_burst(dev_id, port_id, events,
                                        MAX_BURST, 0);

            for (i = 0; i < n; i++) {
                    const struct rte_event *event = &events[i];

                    switch (event->queue_id) {
                    case MODULE_A_QUEUE_ID:
                            module_a_process(event);
                            break;
                    case MODULE_B_STAGE_0_QUEUE_ID:
                            module_b_process_stage_0(event);
                            break;
                    case MODULE_B_STAGE_1_QUEUE_ID:
                            module_b_process_stage_1(event);
                            break;
                    }
            }
    }

The issue this example attempts to illustrate is that the centralized
conditional logic has knowledge of things that should be private to
the modules. In other words, this pattern leads to a violation of
module encapsulation.

The shared conditional logic contains explicit knowledge about what
events should go where. If, for example, ``module_a_process()`` is
broken into two processing stages (a module-internal affair), the
shared conditional code must be updated to reflect this change.

The centralized event routing code becomes an issue in larger
applications, where modules are developed by different organizations.
This pattern also makes module reuse across different applications
more difficult. The part of the conditional logic relevant for a
particular application may need to be duplicated across many module
instantiations (e.g., applications and test setups).

The dispatcher separates the mechanism (routing events to their
receiver) from the policy (which events should go where).

The basic operation of the dispatcher is as follows:

* Dequeue a batch of events from the event device.
* For each event, determine which handler should receive the event,
  using a set of application-provided, per-handler event matching
  callback functions.
* Provide the events matching a particular handler to that handler,
  using its process callback (illustrated by the sketch below).
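
Expressed as pseudocode, one iteration of the dispatcher's service
function behaves roughly as follows. This is a conceptual sketch only,
not the actual implementation: the ``struct handler`` type,
``MAX_BURST`` and the clustering strategy are made up for
illustration, and the dispatcher is free to organize this work
differently.

.. code-block:: c

    /* Conceptual sketch only; not the dispatcher's actual internals. */
    struct handler {
            bool (*match)(const struct rte_event *event, void *cb_data);
            void (*process)(uint8_t event_dev_id, uint8_t event_port_id,
                            const struct rte_event *events, uint16_t num,
                            void *cb_data);
            void *match_cb_data;
            void *process_cb_data;
    };

    static void
    dispatch_one_iteration(uint8_t dev_id, uint8_t port_id,
                           struct handler *handlers, uint16_t num_handlers)
    {
            struct rte_event events[MAX_BURST];
            int matched_handler[MAX_BURST];
            uint16_t n, i, h;

            n = rte_event_dequeue_burst(dev_id, port_id, events,
                                        MAX_BURST, 0);

            /* First match wins: stop at the first handler accepting the
             * event. Events matching no handler are dropped.
             */
            for (i = 0; i < n; i++) {
                    matched_handler[i] = -1;

                    for (h = 0; h < num_handlers; h++)
                            if (handlers[h].match(&events[i],
                                                  handlers[h].match_cb_data)) {
                                    matched_handler[i] = h;
                                    break;
                            }
            }

            /* Deliver the matched events, grouped per handler, keeping
             * the dequeue order within each group.
             */
            for (h = 0; h < num_handlers; h++) {
                    struct rte_event group[MAX_BURST];
                    uint16_t num_matched = 0;

                    for (i = 0; i < n; i++)
                            if (matched_handler[i] == h)
                                    group[num_matched++] = events[i];

                    if (num_matched > 0)
                            handlers[h].process(dev_id, port_id, group,
                                                num_matched,
                                                handlers[h].process_cb_data);
            }
    }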

Had the above application made use of the dispatcher, the code
relevant for its module A might have looked something like this:

.. code-block:: c

    static bool
    module_a_match(const struct rte_event *event, void *cb_data)
    {
            return event->queue_id == MODULE_A_QUEUE_ID;
    }

    static void
    module_a_process_events(uint8_t event_dev_id, uint8_t event_port_id,
                            const struct rte_event *events,
                            uint16_t num, void *cb_data)
    {
            uint16_t i;

            for (i = 0; i < num; i++)
                    module_a_process_event(&events[i]);
    }

    /* In the module's initialization code */
    rte_dispatcher_register(dispatcher, module_a_match, NULL,
                            module_a_process_events, module_a_data);

.. note::

   Error handling is left out of this and subsequent example code in
   this chapter.

When the shared conditional logic is removed, a new question arises:
which part of the system actually runs the dispatching mechanism? Or,
phrased differently, what is replacing the function hosting the shared
conditional logic (typically launched on all lcores using
``rte_eal_remote_launch()``)? To solve this issue, the dispatcher is
run as a DPDK :doc:`Service <../service_cores>`.

The dispatcher is a layer between the application and the event device
in the receive direction. In the transmit (i.e., item of work
submission) direction, the application directly accesses the Eventdev
core API (e.g., ``rte_event_enqueue_burst()``) to submit new or
forwarded events to the event device.

Dispatcher Creation
-------------------

A dispatcher is created using the ``rte_dispatcher_create()`` function.

The event device must be configured before the dispatcher is created.

Usually, only one dispatcher is needed per event device. A dispatcher
handles exactly one event device.

A dispatcher is freed using the ``rte_dispatcher_free()`` function.
The dispatcher's service functions must not be running on
any lcore at the point of this call.

Event Port Binding
------------------

To be able to dequeue events, the dispatcher must know which event
ports are to be used, on all the lcores it uses. The application
provides this information using
``rte_dispatcher_bind_port_to_lcore()``.

This call is typically made from the part of the application that
deals with deployment issues (e.g., iterating lcores and determining
which lcore does what), at the time of application initialization.

The ``rte_dispatcher_unbind_port_from_lcore()`` function is used to
undo this operation.

Multiple lcore threads may not safely use the same event port.

.. note::

   This property (which is a feature, not a bug) is inherited from the
   core Eventdev APIs.

Event ports cannot safely be bound or unbound while the dispatcher's
service function is running on any lcore.

Event Handlers
--------------

The dispatcher handler is an interface between the dispatcher and an
application module, used to route events to the appropriate part of
the application.

Handler Registration
^^^^^^^^^^^^^^^^^^^^

The event handler interface consists of two function pointers:

* The ``rte_dispatcher_match_t`` callback, whose job is to decide if
  the event is to be the property of this handler.
* The ``rte_dispatcher_process_t`` callback, which is used by the
  dispatcher to deliver matched events.

An event handler registration is valid on all lcores.
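
Module B, which was said to process events in two stages, might
register one handler per stage. The sketch below reuses the fictitious
``module_b_*`` functions and queue id macros from the introductory
example; the stage enum and the use of per-handler process callback
data are invented purely for illustration.

.. code-block:: c

    enum module_b_stage {
            MODULE_B_STAGE_0,
            MODULE_B_STAGE_1
    };

    static bool
    module_b_stage_0_match(const struct rte_event *event, void *cb_data)
    {
            return event->queue_id == MODULE_B_STAGE_0_QUEUE_ID;
    }

    static bool
    module_b_stage_1_match(const struct rte_event *event, void *cb_data)
    {
            return event->queue_id == MODULE_B_STAGE_1_QUEUE_ID;
    }

    static void
    module_b_process_events(uint8_t event_dev_id, uint8_t event_port_id,
                            const struct rte_event *events,
                            uint16_t num, void *cb_data)
    {
            enum module_b_stage stage = *(enum module_b_stage *)cb_data;
            uint16_t i;

            for (i = 0; i < num; i++) {
                    if (stage == MODULE_B_STAGE_0)
                            module_b_process_stage_0(&events[i]);
                    else
                            module_b_process_stage_1(&events[i]);
            }
    }

    /* In module B's initialization code */
    static enum module_b_stage stage_0 = MODULE_B_STAGE_0;
    static enum module_b_stage stage_1 = MODULE_B_STAGE_1;

    rte_dispatcher_register(dispatcher, module_b_stage_0_match, NULL,
                            module_b_process_events, &stage_0);
    rte_dispatcher_register(dispatcher, module_b_stage_1_match, NULL,
                            module_b_process_events, &stage_1);

Using two handlers, rather than one handler with an internal switch,
keeps the stage routing local to module B and lets the dispatcher
deliver each stage's events in separate process calls.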

The functions pointed to by the match and process callbacks reside in
the application's domain logic, with one or more handlers per
application module.

A module may use more than one event handler, for convenience or to
further decouple sub-modules. However, the dispatcher may impose an
upper limit on the number of handlers. In addition, installing a large
number of handlers increases dispatcher overhead, although this does
not necessarily translate to a system-level performance degradation.
See the section on :ref:`Event Clustering` for more information.

Handler registration and unregistration cannot safely be done while
the dispatcher's service function is running on any lcore.

Event Matching
^^^^^^^^^^^^^^

A handler's match callback function decides if an event should be
delivered to this handler, or not.

An event is routed to no more than one handler. Thus, if a match
function returns true, no further match functions will be invoked for
that event.

Match functions must not depend on being invoked in any particular
order (e.g., in the handler registration order).

Events failing to match any handler are dropped, and the
``ev_drop_count`` counter is updated accordingly.
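
As noted in the overview, matching does not have to be based on the
queue id. The following sketch matches on the ``rte_event`` sub event
type instead; the module C identifiers are hypothetical.

.. code-block:: c

    /* Hypothetical module C: claim only events carrying module C's sub
     * event type, regardless of which event queue they arrive on.
     */
    static bool
    module_c_match(const struct rte_event *event, void *cb_data)
    {
            return event->sub_event_type == MODULE_C_SUB_EVENT_TYPE;
    }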

Event Delivery
^^^^^^^^^^^^^^

The handler callbacks are invoked by the dispatcher's service
function, upon the arrival of events on the event ports bound to the
running service lcore.

A particular event is delivered to at most one handler.

The application must not depend on all match callback invocations for
a particular event batch being made prior to any process calls being
made. For example, if the dispatcher dequeues two events from the
event device, it may choose to find out the destination for the first
event, and deliver it, and then continue to find out the destination
for the second, and then deliver that event as well. The dispatcher
may also choose a strategy where no event is delivered until the
destination handlers for both events have been determined.

The events provided in a single process call always belong to the same
event port dequeue burst.

.. _Event Clustering:

Event Clustering
^^^^^^^^^^^^^^^^

The dispatcher maintains the order of events destined for the same
handler.

*Order* here refers to the order in which the events were delivered
from the event device to the dispatcher (i.e., in the event array
populated by ``rte_event_dequeue_burst()``), in relation to the order
in which the dispatcher delivers these events to the application.

The dispatcher *does not* guarantee to maintain the order of events
delivered to *different* handlers.

For example, assume that ``MODULE_A_QUEUE_ID`` expands to the value 0,
and ``MODULE_B_STAGE_0_QUEUE_ID`` expands to the value 1. Then
consider a scenario where the following events are dequeued from the
event device (qid is short for event queue id).

.. code-block:: none

    [e0: qid=1], [e1: qid=1], [e2: qid=0], [e3: qid=1]

The dispatcher may deliver the events in the following manner:

.. code-block:: none

    module_b_stage_0_process([e0: qid=1], [e1: qid=1])
    module_a_process([e2: qid=0])
    module_b_stage_0_process([e3: qid=1])

The dispatcher may also choose to cluster (group) all events destined
for ``module_b_stage_0_process()`` into one array:

.. code-block:: none

    module_b_stage_0_process([e0: qid=1], [e1: qid=1], [e3: qid=1])
    module_a_process([e2: qid=0])

Here, the event ``e2`` is reordered and placed behind ``e3``, from a
delivery order point of view. This kind of reshuffling is allowed,
since the events are destined for different handlers.

The dispatcher may also deliver ``e2`` before the three events
destined for module B.

An example of what the dispatcher may not do is to reorder event
``e1`` so that it precedes ``e0`` in the array passed to module B's
stage 0 process callback.

Although clustering requires some extra work for the dispatcher, it
leads to fewer process function calls. In addition, and likely more
importantly, it improves temporal locality of memory accesses to
handler-specific data structures in the application, which in turn may
lead to fewer cache misses and improved overall performance.

Finalize
--------

The dispatcher may be configured to notify one or more parts of the
application when the matching and processing of a batch of events has
completed.

The ``rte_dispatcher_finalize_register()`` call is used to register a
finalize callback. The function ``rte_dispatcher_finalize_unregister()``
is used to remove a callback.

The finalize hook may be used by a set of event handlers (in the same
module, or a set of cooperating modules) sharing an event output
buffer, since it allows for flushing of the buffers at the last
possible moment. In particular, it allows for buffering of
``RTE_EVENT_OP_FORWARD`` events, which must be flushed before the next
``rte_event_dequeue_burst()`` call is made (assuming implicit release
is employed).

The following is an example with an application-defined event output
buffer (the ``event_buffer``):

.. code-block:: c

    static void
    finalize_batch(uint8_t event_dev_id, uint8_t event_port_id,
                   void *cb_data)
    {
            struct event_buffer *buffer = cb_data;
            unsigned lcore_id = rte_lcore_id();
            struct event_buffer_lcore *lcore_buffer =
                    &buffer->lcore_buffer[lcore_id];

            event_buffer_lcore_flush(lcore_buffer);
    }

    /* In the module's initialization code */
    rte_dispatcher_finalize_register(dispatcher, finalize_batch,
                                     shared_event_buffer);

The dispatcher does not track any relationship between a handler and a
finalize callback. All finalize callbacks are called if (and only if)
at least one event was dequeued from the event device.

Finalize callback registration and unregistration cannot safely be
done while the dispatcher's service function is running on any lcore.
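
The handler-side counterpart of the above flush might look like the
sketch below: a process callback that buffers forwarded events instead
of enqueueing them immediately, registered with the shared
``event_buffer`` as its callback data. The ``stage_process_events()``,
``produce_next_stage_event()`` and ``event_buffer_lcore_add()``
functions are hypothetical; only ``rte_lcore_id()`` and
``RTE_EVENT_OP_FORWARD`` are part of DPDK.

.. code-block:: c

    static void
    stage_process_events(uint8_t event_dev_id, uint8_t event_port_id,
                         const struct rte_event *events,
                         uint16_t num, void *cb_data)
    {
            struct event_buffer *buffer = cb_data;
            struct event_buffer_lcore *lcore_buffer =
                    &buffer->lcore_buffer[rte_lcore_id()];
            uint16_t i;

            for (i = 0; i < num; i++) {
                    struct rte_event forwarded = events[i];

                    /* Application-specific processing, producing the
                     * next-stage event (e.g., updating the queue id).
                     */
                    produce_next_stage_event(&forwarded);

                    forwarded.op = RTE_EVENT_OP_FORWARD;

                    /* Buffer the event; finalize_batch() above flushes
                     * (i.e., enqueues) all buffered events before the
                     * dispatcher's next dequeue burst.
                     */
                    event_buffer_lcore_add(lcore_buffer, &forwarded);
            }
    }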

Service
-------

The dispatcher is a DPDK service, and is managed in a manner similar
to other DPDK services (e.g., an Event Timer Adapter).

Below is an example of how to configure a particular lcore to serve as
a service lcore, and to map an already-configured dispatcher to that
lcore.

.. code-block:: c

    static void
    launch_dispatcher_core(struct rte_dispatcher *dispatcher,
                           unsigned lcore_id)
    {
            uint32_t service_id;

            rte_service_lcore_add(lcore_id);

            rte_dispatcher_service_id_get(dispatcher, &service_id);

            rte_service_map_lcore_set(service_id, lcore_id, 1);

            rte_service_lcore_start(lcore_id);

            rte_service_runstate_set(service_id, 1);
    }

As the final step, the dispatcher must be started.

.. code-block:: c

    rte_dispatcher_start(dispatcher);


Multi Service Dispatcher Lcores
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In an Eventdev application, most (or all) compute-intensive and
performance-sensitive processing is done in an event-driven manner,
where CPU cycles spent on application domain logic are the direct
result of items of work (i.e., ``rte_event`` events) dequeued from an
event device.

In the light of this, it makes sense to have the dispatcher service be
the only DPDK service on all lcores used for packet processing, at
least in principle.

However, there is nothing in DPDK that prevents colocating other
services with the dispatcher service on the same lcore.

Tasks that, prior to the introduction of the dispatcher into the
application, were performed on the lcore regardless of whether any
events were received, are prime targets for being converted into such
auxiliary services, running on the dispatcher core set.

An example of such a task would be the management of a per-lcore timer
wheel (i.e., calling ``rte_timer_manage()``).

Applications employing :doc:`../rcu_lib` (or a similar technique) may
opt for having quiescent state signaling (e.g., calling
``rte_rcu_qsbr_quiescent()``) factored out into a separate service, to
ensure resource reclamation occurs even though some lcores currently
do not process any events.

If more services than the dispatcher service are mapped to a service
lcore, it's important that the other services are well-behaved and
don't interfere with event processing to the extent that the system's
throughput and/or latency requirements are at risk of not being met.

In particular, to avoid jitter, they should have a small upper bound
for the maximum amount of time spent in a single service function
call.

An example of a scenario with a more CPU-heavy colocated service is a
low-lcore count deployment, where the event device lacks the
``RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT`` capability (and thus
requires software to feed incoming packets into the event device). In
this case, the best performance may be achieved if the Event Ethernet
RX and/or TX Adapters are mapped to lcores also used for event
dispatching, since otherwise the adapter lcores would have a lot of
idle CPU cycles.
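
As a sketch of such a setup, the RX adapter's service can be mapped to
the same lcore as the dispatcher service. The example below assumes an
already-created RX adapter identified by ``rx_adapter_id``, that the
adapter actually runs as a service (which is the case when the
internal port capability is absent), and that the lcore has been set
up along the lines of ``launch_dispatcher_core()`` above.

.. code-block:: c

    static void
    colocate_rx_adapter_service(uint8_t rx_adapter_id, unsigned lcore_id)
    {
            uint32_t rx_service_id;

            rte_event_eth_rx_adapter_service_id_get(rx_adapter_id,
                                                    &rx_service_id);

            /* Run the RX adapter service on the dispatcher's lcore. */
            rte_service_map_lcore_set(rx_service_id, lcore_id, 1);

            rte_service_runstate_set(rx_service_id, 1);
    }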