.. SPDX-License-Identifier: BSD-3-Clause
   Copyright(c) 2017 Intel Corporation.
   Copyright(c) 2018 Arm Limited.

Event Device Library
====================

The DPDK Event device library is an abstraction that provides the application
with features to schedule events. This is achieved using the PMD architecture
similar to the ethdev or cryptodev APIs, which may already be familiar to the
reader.

The eventdev framework introduces the event driven programming model. In a
polling model, lcores poll ethdev ports and associated Rx queues directly
to look for a packet. By contrast, in an event driven model, lcores call the
scheduler that selects packets for them based on programmer-specified criteria.
The Eventdev library adds support for an event driven programming model, which
offers applications automatic multicore scaling, dynamic load balancing,
pipelining, packet ingress order maintenance and synchronization services to
simplify application packet processing.

By introducing an event driven programming model, DPDK can support both polling
and event driven programming models for packet processing, and applications are
free to choose whatever model (or combination of the two) best suits their
needs.

Step-by-step instructions for the eventdev design are available in the `API
Walk-through`_ section later in this document.

Event struct
------------

The eventdev API represents each event with a generic struct, which contains a
payload and metadata required for scheduling by an eventdev. The
``rte_event`` struct is a 16-byte C structure, defined in
``lib/eventdev/rte_eventdev.h``.

Event Metadata
~~~~~~~~~~~~~~

The rte_event structure contains the following metadata fields, which the
application fills in to have the event scheduled as required:

* ``flow_id`` - The targeted flow identifier for the enq/deq operation.
* ``event_type`` - The source of this event, e.g. RTE_EVENT_TYPE_ETHDEV or CPU.
* ``sub_event_type`` - Distinguishes events inside the application that have
  the same event_type (see above).
* ``op`` - This field takes one of the RTE_EVENT_OP_* values, and tells the
  eventdev about the status of the event - valid values are NEW, FORWARD or
  RELEASE.
* ``sched_type`` - Represents the type of scheduling that should be performed
  on this event - valid values are RTE_SCHED_TYPE_ORDERED, ATOMIC and
  PARALLEL.
* ``queue_id`` - The identifier for the event queue that the event is sent to.
* ``priority`` - The priority of this event, see the RTE_EVENT_DEV_PRIORITY_*
  values.

Event Payload
~~~~~~~~~~~~~

The rte_event struct contains a union for payload, allowing flexibility in what
the actual event being scheduled is. The payload is a union of the following:

* ``uint64_t u64``
* ``void *event_ptr``
* ``struct rte_mbuf *mbuf``
* ``struct rte_event_vector *vec``

These four items in a union occupy the same 64 bits at the end of the rte_event
structure. The application can utilize the 64 bits directly by accessing the
u64 variable, while the event_ptr, mbuf and vec are provided as convenience
variables. For example, the mbuf pointer in the union can be used to schedule a
DPDK packet.
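
As an illustration, the sketch below shows how the same 64 bits of payload can
carry either a packet or an application-defined token. Here ``pkt`` is assumed
to be an already-received ``struct rte_mbuf *``:

.. code-block:: c

    struct rte_event ev = {0};

    /* Carry a DPDK packet: the mbuf pointer occupies the payload union. */
    ev.event_type = RTE_EVENT_TYPE_ETHDEV;
    ev.mbuf = pkt;

    /* Alternatively, carry a raw 64-bit application-defined value. */
    ev.event_type = RTE_EVENT_TYPE_CPU;
    ev.u64 = 0xfeedULL;
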
Event Vector
~~~~~~~~~~~~

The rte_event_vector struct contains a vector of elements defined by the event
type specified in the ``rte_event``. The event_vector structure contains the
following data:

* ``nb_elem`` - The number of elements held within the vector.

Similar to ``rte_event``, the payload of the event vector is also a union,
allowing flexibility in what the actual vector is.

* ``struct rte_mbuf *mbufs[0]`` - An array of mbufs.
* ``void *ptrs[0]`` - An array of pointers.
* ``uint64_t u64s[0]`` - An array of uint64_t elements.

The size of the event vector is related to the total number of elements it is
configured to hold; this is achieved by making ``rte_event_vector`` a
variable-length structure.
A helper function is provided to create a mempool that holds event vectors. It
takes the name of the pool, the total number of ``rte_event_vector`` structures
required, the cache size, the number of elements in each ``rte_event_vector``
and the socket ID.

.. code-block:: c

    rte_event_vector_pool_create("vector_pool", nb_event_vectors, cache_sz,
                                 nb_elements_per_vector, socket_id);

The function ``rte_event_vector_pool_create`` creates a mempool with the best
platform mempool ops.
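
A hedged sketch of consuming a received event vector follows: when the event
type carries the vector flag (here the Rx adapter's
``RTE_EVENT_TYPE_ETH_RX_ADAPTER_VECTOR``), the payload is an
``rte_event_vector`` whose elements can be walked via ``nb_elem``.
``process_pkt()`` is an assumed application helper:

.. code-block:: c

    if (ev.event_type == RTE_EVENT_TYPE_ETH_RX_ADAPTER_VECTOR) {
        struct rte_event_vector *vec = ev.vec;
        uint16_t i;

        /* The mbufs[] flexible array holds vec->nb_elem packets. */
        for (i = 0; i < vec->nb_elem; i++)
            process_pkt(vec->mbufs[i]);
    }
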
Queues
~~~~~~

An event queue is a queue containing events that are scheduled by the event
device. An event queue contains events of different flows associated with
scheduling types, such as atomic, ordered, or parallel.

Queue All Types Capable
^^^^^^^^^^^^^^^^^^^^^^^

If the RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES capability bit is set in the event
device, then events of any type may be sent to any queue. Otherwise, a queue
only supports events of the type that it was created with.

Queue All Types Incapable
^^^^^^^^^^^^^^^^^^^^^^^^^

In this case, each stage has a specified scheduling type. The application
configures each queue for a specific type of scheduling, and just enqueues all
events to the eventdev. An example of a PMD of this type is the eventdev
software PMD.

The Eventdev API supports the following scheduling types per queue:

* Atomic
* Ordered
* Parallel

Atomic, Ordered and Parallel are load-balanced scheduling types: the output
of the queue can be spread out over multiple CPU cores.

Atomic scheduling on a queue ensures that a single flow is not present on two
different CPU cores at the same time. Ordered allows sending all flows to any
core, but the scheduler must ensure that on egress the packets are returned to
ingress order on downstream queue enqueue. Parallel allows sending all flows
to all CPU cores, without any re-ordering guarantees.

Single Link Flag
^^^^^^^^^^^^^^^^

There is a SINGLE_LINK flag which allows an application to indicate that only
one port will be connected to a queue. Queues configured with the single-link
flag follow a FIFO-like structure, maintaining ordering, but can only be
linked to a single port (see below for port and queue linking details).


Ports
~~~~~

Ports are the points of contact between worker cores and the eventdev. The
general use case will see one CPU core using one port to enqueue and dequeue
events from an eventdev. Ports are linked to queues in order to retrieve events
from those queues (more details in `Linking Queues and Ports`_ below).


API Walk-through
----------------

This section will introduce the reader to the eventdev API, showing how to
create and configure an eventdev and use it for a two-stage atomic pipeline
with one core each for RX and TX. RX and TX cores are shown here for
illustration; refer to the Eventdev Adapter documentation for further details.
The diagram below shows the final state of the application after this
walk-through:

.. _figure_eventdev-usage1:

.. figure:: ../img/eventdev_usage.*

   Sample eventdev usage, with RX, two atomic stages and a single-link to TX.


A high level overview of the setup steps is:

* rte_event_dev_configure()
* rte_event_queue_setup()
* rte_event_port_setup()
* rte_event_port_link()
* rte_event_dev_start()


Init and Config
~~~~~~~~~~~~~~~

The eventdev library uses vdev options to add devices to the DPDK application.
The ``--vdev`` EAL option allows adding eventdev instances to your DPDK
application, using the name of the eventdev PMD as an argument.

For example, to create an instance of the software eventdev scheduler, the
following vdev arguments should be provided to the application EAL command
line:

.. code-block:: console

    ./dpdk_application --vdev="event_sw0"

In the following code, we configure an eventdev instance with 3 queues and 6
ports. The 3 queues consist of 2 Atomic and 1 Single-Link, while the 6 ports
consist of 4 workers, 1 RX and 1 TX.

.. code-block:: c

    const struct rte_event_dev_config config = {
        .nb_event_queues = 3,
        .nb_event_ports = 6,
        .nb_events_limit = 4096,
        .nb_event_queue_flows = 1024,
        .nb_event_port_dequeue_depth = 128,
        .nb_event_port_enqueue_depth = 128,
    };
    int err = rte_event_dev_configure(dev_id, &config);
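
The requested resource counts must not exceed the limits advertised by the
device. A minimal sketch of such a check, using ``rte_event_dev_info_get()``
on the ``config`` values above before the configure call:

.. code-block:: c

    struct rte_event_dev_info info;

    /* Query the device limits and validate the requested configuration. */
    rte_event_dev_info_get(dev_id, &info);
    if (config.nb_event_queues > info.max_event_queues ||
        config.nb_event_ports > info.max_event_ports)
        rte_panic("Requested queues/ports exceed device limits\n");
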
The remainder of this walk-through assumes that dev_id is 0.

Setting up Queues
~~~~~~~~~~~~~~~~~

Once the eventdev itself is configured, the next step is to configure queues.
This is done by setting the appropriate values in a queue_conf structure, and
calling the setup function. Repeat this step for each queue, starting from
0 and ending at ``nb_event_queues - 1`` from the event_dev config above.

.. code-block:: c

    struct rte_event_queue_conf atomic_conf = {
        .schedule_type = RTE_SCHED_TYPE_ATOMIC,
        .priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
        .nb_atomic_flows = 1024,
        .nb_atomic_order_sequences = 1024,
    };
    struct rte_event_queue_conf single_link_conf = {
        .event_queue_cfg = RTE_EVENT_QUEUE_CFG_SINGLE_LINK,
    };
    int dev_id = 0;
    int atomic_q_1 = 0;
    int atomic_q_2 = 1;
    int single_link_q = 2;
    int err;

    err = rte_event_queue_setup(dev_id, atomic_q_1, &atomic_conf);
    err = rte_event_queue_setup(dev_id, atomic_q_2, &atomic_conf);
    err = rte_event_queue_setup(dev_id, single_link_q, &single_link_conf);

As shown above, queue IDs are as follows:

 * id 0, atomic queue #1
 * id 1, atomic queue #2
 * id 2, single-link queue

These queues are used for the remainder of this walk-through.

Setting up Ports
~~~~~~~~~~~~~~~~

Once queues are set up successfully, create the ports as required.

.. code-block:: c

    struct rte_event_port_conf rx_conf = {
        .dequeue_depth = 128,
        .enqueue_depth = 128,
        .new_event_threshold = 1024,
    };
    struct rte_event_port_conf worker_conf = {
        .dequeue_depth = 16,
        .enqueue_depth = 64,
        .new_event_threshold = 4096,
    };
    struct rte_event_port_conf tx_conf = {
        .dequeue_depth = 128,
        .enqueue_depth = 128,
        .new_event_threshold = 4096,
    };
    int dev_id = 0;
    int rx_port_id = 0;
    int tx_port_id = 5;
    int worker_port_id;
    int err;

    err = rte_event_port_setup(dev_id, rx_port_id, &rx_conf);

    for (worker_port_id = 1; worker_port_id <= 4; worker_port_id++)
        err = rte_event_port_setup(dev_id, worker_port_id, &worker_conf);

    err = rte_event_port_setup(dev_id, tx_port_id, &tx_conf);

As shown above:

 * port 0: RX core
 * ports 1,2,3,4: Workers
 * port 5: TX core

These ports are used for the remainder of this walk-through.

Linking Queues and Ports
~~~~~~~~~~~~~~~~~~~~~~~~

The final step is to "wire up" the ports to the queues. After this, the
eventdev is capable of scheduling events, and when cores request work to do,
the correct events are provided to that core. Note that the RX core takes input
from e.g. a NIC, so it is not linked to any eventdev queues.

Linking all workers to the atomic queues, and the TX core to the single-link
queue, can be achieved like this:

.. code-block:: c

    uint8_t rx_port_id = 0;
    uint8_t tx_port_id = 5;
    uint8_t atomic_qs[] = {0, 1};
    uint8_t single_link_q = 2;
    uint8_t priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
    int worker_port_id;
    int links_made;

    for (worker_port_id = 1; worker_port_id <= 4; worker_port_id++)
        links_made = rte_event_port_link(dev_id, worker_port_id, atomic_qs, NULL, 2);

    links_made = rte_event_port_link(dev_id, tx_port_id, &single_link_q, &priority, 1);
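
Ports can also be unlinked at run time, e.g. before re-purposing a worker
core. A hedged sketch of the reverse operation, unlinking a worker port from
the atomic queues and waiting until the unlinks complete:

.. code-block:: c

    /* Request the unlink, then wait for any in-progress unlinks to finish
     * so no further events from these queues are scheduled to the port.
     */
    int nb_unlinked = rte_event_port_unlink(dev_id, worker_port_id, atomic_qs, 2);

    while (rte_event_port_unlinks_in_progress(dev_id, worker_port_id) > 0)
        rte_pause();
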
Linking Queues to Ports with link profiles
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An application can use link profiles, if supported by the underlying event
device, to set up multiple link profiles per port and change them at run time
depending on heuristic data. Using link profiles can reduce the overhead of
linking/unlinking and waiting for unlinks in progress in the fast path, and
gives applications the ability to switch between preset profiles on the fly.

An example use case could be as follows.

Config path:

.. code-block:: c

    uint8_t lq[4] = {4, 5, 6, 7};
    uint8_t hq[4] = {0, 1, 2, 3};
    struct rte_event_dev_info info;

    rte_event_dev_info_get(0, &info);
    if (info.max_profiles_per_port < 2)
        return -ENOTSUP;

    rte_event_port_profile_links_set(0, 0, hq, NULL, 4, 0);
    rte_event_port_profile_links_set(0, 0, lq, NULL, 4, 1);

Worker path:

.. code-block:: c

    struct rte_event ev;
    uint8_t profile_id_to_switch;
    uint16_t deq;

    while (1) {
        deq = rte_event_dequeue_burst(0, 0, &ev, 1, 0);
        if (deq == 0) {
            profile_id_to_switch = app_find_profile_id_to_switch();
            rte_event_port_profile_switch(0, 0, profile_id_to_switch);
            continue;
        }

        /* Process the event received. */
    }

Starting the EventDev
~~~~~~~~~~~~~~~~~~~~~

A single function call tells the eventdev instance to start processing events.
Note that all queues must be linked before the instance is started: if any
queue is not linked to a port, enqueuing to that queue will cause the
application to back-pressure and eventually stall due to no space in the
eventdev.

.. code-block:: c

    int err = rte_event_dev_start(dev_id);

.. Note::

    EventDev needs to be started before starting the event producers such
    as event_eth_rx_adapter, event_timer_adapter, event_crypto_adapter and
    event_dma_adapter.

Ingress of New Events
~~~~~~~~~~~~~~~~~~~~~

Now that the eventdev is set up and ready to receive events, the RX core must
enqueue some events into the system for it to schedule. The events to be
scheduled are ordinary DPDK packets, received from an eth_rx_burst() as normal.
The following code shows how those packets can be enqueued into the eventdev:

.. code-block:: c

    struct rte_mbuf *mbufs[BATCH_SIZE];
    struct rte_event ev[BATCH_SIZE];
    uint16_t i;

    const uint16_t nb_rx = rte_eth_rx_burst(eth_port, 0, mbufs, BATCH_SIZE);

    for (i = 0; i < nb_rx; i++) {
        ev[i].flow_id = mbufs[i]->hash.rss;
        ev[i].op = RTE_EVENT_OP_NEW;
        ev[i].sched_type = RTE_SCHED_TYPE_ATOMIC;
        ev[i].queue_id = atomic_q_1;
        ev[i].event_type = RTE_EVENT_TYPE_ETHDEV;
        ev[i].sub_event_type = 0;
        ev[i].priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
        ev[i].mbuf = mbufs[i];
    }

    const uint16_t nb_tx = rte_event_enqueue_burst(dev_id, rx_port_id, ev, nb_rx);
    if (nb_tx != nb_rx) {
        /* Free any mbufs the eventdev did not accept. */
        for (i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(mbufs[i]);
    }

Forwarding of Events
~~~~~~~~~~~~~~~~~~~~

Now that the RX core has injected events, there is work to be done by the
workers. Note that each worker will dequeue as many events as it can in a
burst, process each one individually, and then burst the packets back into
the eventdev.

The worker can look up the event's source from ``event.queue_id``, which
should indicate to the worker what workload needs to be performed on the
event. Once done, the worker can update the ``event.queue_id`` to a new value,
to send the event to the next stage in the pipeline.

.. code-block:: c

    int timeout = 0;
    struct rte_event events[BATCH_SIZE];
    uint16_t nb_rx = rte_event_dequeue_burst(dev_id, worker_port_id, events, BATCH_SIZE, timeout);

    for (i = 0; i < nb_rx; i++) {
        /* process mbuf using events[i].queue_id as pipeline stage */
        struct rte_mbuf *mbuf = events[i].mbuf;
        /* Send event to next stage in pipeline */
        events[i].queue_id++;
        events[i].op = RTE_EVENT_OP_FORWARD;
    }

    uint16_t nb_tx = rte_event_enqueue_burst(dev_id, worker_port_id, events, nb_rx);


Egress of Events
~~~~~~~~~~~~~~~~

Finally, when the packet is ready for egress or needs to be dropped, we need
to inform the eventdev that the packet is no longer being handled by the
application. This can be done by calling dequeue() or dequeue_burst(), which
indicates that the previous burst of packets is no longer in use by the
application.

An event driven worker thread has the following typical workflow on the
fast path:

.. code-block:: c

    while (1) {
        rte_event_dequeue_burst(...);
        (event processing)
        rte_event_enqueue_burst(...);
    }
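
If a worker drops a packet mid-pipeline rather than forwarding it, the
scheduling context held for that event can also be released explicitly. A
hedged sketch, assuming ``ev`` was just dequeued on ``worker_port_id``:

.. code-block:: c

    /* Free the packet, then return the scheduling context to the eventdev
     * with RTE_EVENT_OP_RELEASE instead of forwarding the event.
     */
    rte_pktmbuf_free(ev.mbuf);
    ev.op = RTE_EVENT_OP_RELEASE;
    rte_event_enqueue_burst(dev_id, worker_port_id, &ev, 1);
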
Quiescing Event Ports
~~~~~~~~~~~~~~~~~~~~~

To migrate an event port to another lcore, or while tearing down a worker core
that uses an event port, ``rte_event_port_quiesce()`` can be invoked to make
sure that all the data associated with the event port is released from the
worker core; this might also include any prefetched events.

A flush callback can be passed to the function to handle any outstanding
events.

.. code-block:: c

    rte_event_port_quiesce(dev_id, port_id, release_cb, NULL);

.. Note::

    Invocation of this API does not affect the existing port configuration.

Stopping the EventDev
~~~~~~~~~~~~~~~~~~~~~

A single function call tells the eventdev instance to stop processing events.
A flush callback can be registered to free any inflight events using the
``rte_event_dev_stop_flush_callback_register()`` function.

.. code-block:: c

    int err = rte_event_dev_stop(dev_id);

.. Note::

    The event producers such as ``event_eth_rx_adapter``,
    ``event_timer_adapter``, ``event_crypto_adapter`` and
    ``event_dma_adapter`` need to be stopped before stopping
    the event device.

Summary
-------

The eventdev library allows an application to easily schedule events as it
requires, either using a run-to-completion or pipeline processing model. The
queues and ports abstract the logical functionality of an eventdev, providing
the application with a generic method to schedule events. With the flexible
PMD infrastructure, applications benefit from improvements in existing
eventdevs and the addition of new ones without modification.