..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2017 Intel Corporation.
    Copyright(c) 2018 Arm Limited.

Event Device Library
====================

The DPDK Event device library is an abstraction that provides the application
with features to schedule events. This is achieved using the PMD architecture
similar to the ethdev or cryptodev APIs, which may already be familiar to the
reader.

The eventdev framework introduces the event driven programming model. In a
polling model, lcores poll ethdev ports and associated Rx queues directly
to look for a packet. By contrast, in an event driven model, lcores call the
scheduler that selects packets for them based on programmer-specified criteria.
The Eventdev library adds support for an event driven programming model, which
offers applications automatic multicore scaling, dynamic load balancing,
pipelining, packet ingress order maintenance and synchronization services to
simplify application packet processing.

By introducing an event driven programming model, DPDK can support both polling
and event driven programming models for packet processing, and applications are
free to choose whatever model (or combination of the two) best suits their
needs.

A step-by-step walk-through of the eventdev design is available in the `API
Walk-through`_ section later in this document.

Event struct
------------

The eventdev API represents each event with a generic struct, which contains a
payload and metadata required for scheduling by an eventdev. The
``rte_event`` struct is a 16-byte C structure, defined in
``lib/eventdev/rte_eventdev.h``.

Event Metadata
~~~~~~~~~~~~~~

The rte_event structure contains the following metadata fields, which the
application fills in to have the event scheduled as required (a short sketch
follows the list):

* ``flow_id`` - The targeted flow identifier for the enq/deq operation.
* ``event_type`` - The source of this event, e.g. RTE_EVENT_TYPE_ETHDEV or
  RTE_EVENT_TYPE_CPU.
* ``sub_event_type`` - Distinguishes events inside the application that have
  the same event_type (see above).
* ``op`` - This field takes one of the RTE_EVENT_OP_* values, and tells the
  eventdev about the status of the event - valid values are NEW, FORWARD or
  RELEASE.
* ``sched_type`` - Represents the type of scheduling that should be performed
  on this event, valid values are RTE_SCHED_TYPE_ORDERED, ATOMIC and
  PARALLEL.
* ``queue_id`` - The identifier for the event queue that the event is sent to.
* ``priority`` - The priority of this event, see the RTE_EVENT_DEV_PRIORITY_*
  values.

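A minimal sketch of filling in this metadata for a new event (``flow`` is an
application-chosen placeholder value, e.g. an RSS hash):

.. code-block:: c

        struct rte_event ev = {
                .op = RTE_EVENT_OP_NEW,
                .queue_id = 0,
                .sched_type = RTE_SCHED_TYPE_ATOMIC,
                .flow_id = flow,
                .event_type = RTE_EVENT_TYPE_CPU,
                .sub_event_type = 0,
                .priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
        };
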
Event Payload
~~~~~~~~~~~~~

The rte_event struct contains a union for payload, allowing flexibility in what
the actual event being scheduled is. The payload is a union of the following:

* ``uint64_t u64``
* ``void *event_ptr``
* ``struct rte_mbuf *mbuf``
* ``struct rte_event_vector *vec``

These four items in a union occupy the same 64 bits at the end of the rte_event
structure. The application can utilize the 64 bits directly by accessing the
u64 variable, while event_ptr, mbuf and vec are provided as convenience
variables. For example, the mbuf pointer in the union can be used to schedule a
DPDK packet.
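
As an illustration, a producer can stash an application object or a raw 64-bit
value in the payload, and the consumer reads back the same union member
(``struct app_msg`` and ``get_app_msg()`` are hypothetical application code):

.. code-block:: c

        /* Producer side: carry an application object in the payload. */
        struct app_msg *msg = get_app_msg();
        ev.event_ptr = msg;

        /* Alternatively, carry a raw 64-bit value. */
        ev.u64 = 0xfeedULL;

        /* Consumer side: read back the union member that was written. */
        struct app_msg *rx_msg = ev.event_ptr;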

Event Vector
~~~~~~~~~~~~

The rte_event_vector struct contains a vector of elements defined by the event
type specified in the ``rte_event``. The event_vector structure contains the
following data:

* ``nb_elem`` - The number of elements held within the vector.

Similar to ``rte_event``, the payload of an event vector is also a union,
allowing flexibility in what the actual vector is.

* ``struct rte_mbuf *mbufs[0]`` - An array of mbufs.
* ``void *ptrs[0]`` - An array of pointers.
* ``uint64_t u64s[0]`` - An array of uint64_t elements.

The size of the event vector is related to the total number of elements it is
configured to hold; this is achieved by making ``rte_event_vector`` a variable
length structure.
A helper function is provided to create a mempool that holds event vectors. It
takes the name of the pool, the total number of ``rte_event_vector`` elements
required, the cache size, the number of elements in each ``rte_event_vector``
and the socket id.

.. code-block:: c

        rte_event_vector_pool_create("vector_pool", nb_event_vectors, cache_sz,
                                     nb_elements_per_vector, socket_id);

The function ``rte_event_vector_pool_create`` creates a mempool with the best
platform mempool ops.
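
A sketch of consuming a vector event on the worker side, assuming the producer
set the event type to ``RTE_EVENT_TYPE_ETHDEV_VECTOR`` so that the elements
are mbufs:

.. code-block:: c

        struct rte_event_vector *vec = ev.vec;
        uint16_t i;

        for (i = 0; i < vec->nb_elem; i++) {
                struct rte_mbuf *m = vec->mbufs[i];
                /* process each mbuf in the vector */
        }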

Queues
~~~~~~

An event queue is a queue containing events that are scheduled by the event
device. An event queue contains events of different flows associated with
scheduling types, such as atomic, ordered, or parallel.

Queue All Types Capable
^^^^^^^^^^^^^^^^^^^^^^^

If the RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES capability bit is set in the event
device, then events of any type may be sent to any queue. Otherwise, a queue
only supports events of the scheduling type that it was created with.
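
A minimal sketch of checking this capability at init time:

.. code-block:: c

        struct rte_event_dev_info info;

        rte_event_dev_info_get(dev_id, &info);
        if (info.event_dev_cap & RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES) {
                /* events of any sched_type may go to any queue */
        }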

Queue All Types Incapable
^^^^^^^^^^^^^^^^^^^^^^^^^

In this case, each stage has a specified scheduling type. The application
configures each queue for a specific type of scheduling, and just enqueues all
events to the eventdev. An example of a PMD of this type is the eventdev
software PMD.

The Eventdev API supports the following scheduling types per queue:

*   Atomic
*   Ordered
*   Parallel

Atomic, Ordered and Parallel are load-balanced scheduling types: the output
of the queue can be spread out over multiple CPU cores.

Atomic scheduling on a queue ensures that a single flow is not present on two
different CPU cores at the same time. Ordered allows sending all flows to any
core, but the scheduler must ensure that on egress the packets are restored to
ingress order before they are enqueued to a downstream queue. Parallel allows
sending all flows to all CPU cores, without any re-ordering guarantees.
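
The scheduling type is chosen per queue at setup time. For example, a queue
intended for ordered scheduling might be configured as in this sketch (the
walk-through below shows the atomic equivalent):

.. code-block:: c

        struct rte_event_queue_conf ordered_conf = {
                .schedule_type = RTE_SCHED_TYPE_ORDERED,
                .priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
                .nb_atomic_order_sequences = 1024,
        };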

Single Link Flag
^^^^^^^^^^^^^^^^

There is a SINGLE_LINK flag which allows an application to indicate that only
one port will be connected to a queue. Queues configured with the single-link
flag follow a FIFO-like structure, maintaining ordering, but can only be
linked to a single port (see below for port and queue linking details).


Ports
~~~~~

Ports are the points of contact between worker cores and the eventdev. The
general use case will see one CPU core using one port to enqueue and dequeue
events from an eventdev. Ports are linked to queues in order to retrieve events
from those queues (more details in `Linking Queues and Ports`_ below).


API Walk-through
----------------

This section will introduce the reader to the eventdev API, showing how to
create and configure an eventdev and use it for a two-stage atomic pipeline
with one core each for RX and TX. RX and TX cores are shown here for
illustration; refer to the Eventdev Adapter documentation for further details.
The diagram below shows the final state of the application after this
walk-through:

.. _figure_eventdev-usage1:

.. figure:: ../img/eventdev_usage.*

   Sample eventdev usage, with RX, two atomic stages and a single-link to TX.


A high level overview of the setup steps is:

* rte_event_dev_configure()
* rte_event_queue_setup()
* rte_event_port_setup()
* rte_event_port_link()
* rte_event_dev_start()


Init and Config
~~~~~~~~~~~~~~~

The eventdev library uses vdev options to add devices to the DPDK application.
The ``--vdev`` EAL option allows adding eventdev instances to your DPDK
application, using the name of the eventdev PMD as an argument.

For example, to create an instance of the software eventdev scheduler, the
following vdev arguments should be provided to the application EAL command line:

.. code-block:: console

   ./dpdk_application --vdev="event_sw0"

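Before choosing configuration values, an application would typically query the
device's limits with ``rte_event_dev_info_get()``; a minimal sketch:

.. code-block:: c

        struct rte_event_dev_info info;

        rte_event_dev_info_get(dev_id, &info);
        if (info.max_event_queues < 3 || info.max_event_ports < 6)
                rte_panic("eventdev %u cannot support this setup\n", dev_id);
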
In the following code, we configure an eventdev instance with 3 queues
and 6 ports. The 3 queues consist of 2 Atomic and 1 Single-Link,
while the 6 ports consist of 4 workers, 1 RX and 1 TX.

.. code-block:: c

        const struct rte_event_dev_config config = {
                .nb_event_queues = 3,
                .nb_event_ports = 6,
                .nb_events_limit = 4096,
                .nb_event_queue_flows = 1024,
                .nb_event_port_dequeue_depth = 128,
                .nb_event_port_enqueue_depth = 128,
        };
        int err = rte_event_dev_configure(dev_id, &config);

The remainder of this walk-through assumes that dev_id is 0.

Setting up Queues
~~~~~~~~~~~~~~~~~

Once the eventdev itself is configured, the next step is to configure queues.
This is done by setting the appropriate values in a queue_conf structure, and
calling the setup function. Repeat this step for each queue, starting from
0 and ending at ``nb_event_queues - 1`` from the event_dev config above.

.. code-block:: c

        struct rte_event_queue_conf atomic_conf = {
                .schedule_type = RTE_SCHED_TYPE_ATOMIC,
                .priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
                .nb_atomic_flows = 1024,
                .nb_atomic_order_sequences = 1024,
        };
        struct rte_event_queue_conf single_link_conf = {
                .event_queue_cfg = RTE_EVENT_QUEUE_CFG_SINGLE_LINK,
        };
        int dev_id = 0;
        int atomic_q_1 = 0;
        int atomic_q_2 = 1;
        int single_link_q = 2;
        int err = rte_event_queue_setup(dev_id, atomic_q_1, &atomic_conf);
        err = rte_event_queue_setup(dev_id, atomic_q_2, &atomic_conf);
        err = rte_event_queue_setup(dev_id, single_link_q, &single_link_conf);

As shown above, queue IDs are as follows:

 * id 0, atomic queue #1
 * id 1, atomic queue #2
 * id 2, single-link queue

These queues are used for the remainder of this walk-through.

Setting up Ports
~~~~~~~~~~~~~~~~

Once queues are set up successfully, create the ports as required.

.. code-block:: c

        struct rte_event_port_conf rx_conf = {
                .dequeue_depth = 128,
                .enqueue_depth = 128,
                .new_event_threshold = 1024,
        };
        struct rte_event_port_conf worker_conf = {
                .dequeue_depth = 16,
                .enqueue_depth = 64,
                .new_event_threshold = 4096,
        };
        struct rte_event_port_conf tx_conf = {
                .dequeue_depth = 128,
                .enqueue_depth = 128,
                .new_event_threshold = 4096,
        };
        int dev_id = 0;
        int rx_port_id = 0;
        int tx_port_id = 5;
        int worker_port_id;
        int err = rte_event_port_setup(dev_id, rx_port_id, &rx_conf);

        for (worker_port_id = 1; worker_port_id <= 4; worker_port_id++) {
                err = rte_event_port_setup(dev_id, worker_port_id, &worker_conf);
        }

        err = rte_event_port_setup(dev_id, tx_port_id, &tx_conf);

As shown above:

 * port 0: RX core
 * ports 1,2,3,4: Workers
 * port 5: TX core

These ports are used for the remainder of this walk-through.

Linking Queues and Ports
~~~~~~~~~~~~~~~~~~~~~~~~

The final step is to "wire up" the ports to the queues. After this, the
eventdev is capable of scheduling events, and when cores request work to do,
the correct events are provided to that core. Note that the RX core takes input
from e.g. a NIC, so it is not linked to any eventdev queues.

Linking all workers to the atomic queues, and the TX core to the single-link
queue, can be achieved like this:

.. code-block:: c

        uint8_t rx_port_id = 0;
        uint8_t tx_port_id = 5;
        uint8_t atomic_qs[] = {0, 1};
        uint8_t single_link_q = 2;
        uint8_t priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
        int worker_port_id;

        for (worker_port_id = 1; worker_port_id <= 4; worker_port_id++) {
                int links_made = rte_event_port_link(dev_id, worker_port_id, atomic_qs, NULL, 2);
        }
        int links_made = rte_event_port_link(dev_id, tx_port_id, &single_link_q, &priority, 1);

Linking Queues to Ports with link profiles
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An application can use link profiles, if supported by the underlying event
device, to set up multiple link profiles per port and change them at run time
depending upon heuristic data. Using link profiles can reduce the overhead of
linking/unlinking and of waiting for unlinks in progress in the fast path, and
gives applications the ability to switch between preset profiles on the fly.

An example use case could be as follows.

Config path:

.. code-block:: c

   uint8_t lq[4] = {4, 5, 6, 7};
   uint8_t hq[4] = {0, 1, 2, 3};
   struct rte_event_dev_info info;

   rte_event_dev_info_get(0, &info);
   if (info.max_profiles_per_port < 2)
       return -ENOTSUP;

   rte_event_port_profile_links_set(0, 0, hq, NULL, 4, 0);
   rte_event_port_profile_links_set(0, 0, lq, NULL, 4, 1);

Worker path:

.. code-block:: c

   struct rte_event ev;
   uint8_t profile_id_to_switch;
   uint16_t deq;

   while (1) {
       deq = rte_event_dequeue_burst(0, 0, &ev, 1, 0);
       if (deq == 0) {
           profile_id_to_switch = app_find_profile_id_to_switch();
           rte_event_port_profile_switch(0, 0, profile_id_to_switch);
           continue;
       }

       /* Process the event received. */
   }

Starting the EventDev
~~~~~~~~~~~~~~~~~~~~~

A single function call tells the eventdev instance to start processing
events. Note that all queues must be linked before the instance is started:
if any queue is left unlinked, events enqueued to it are never dequeued, so
the application will backpressure and eventually stall due to no space in the
eventdev.

.. code-block:: c

        int err = rte_event_dev_start(dev_id);

.. Note::

         EventDev needs to be started before starting the event producers such
         as event_eth_rx_adapter, event_timer_adapter, event_crypto_adapter and
         event_dma_adapter.

Ingress of New Events
~~~~~~~~~~~~~~~~~~~~~

Now that the eventdev is set up, and ready to receive events, the RX core must
enqueue some events into the system for it to schedule. The events to be
scheduled are ordinary DPDK packets, received from rte_eth_rx_burst() as usual.
The following code shows how those packets can be enqueued into the eventdev:

.. code-block:: c

        struct rte_event ev[BATCH_SIZE];
        struct rte_mbuf *mbufs[BATCH_SIZE];
        uint16_t i;

        const uint16_t nb_rx = rte_eth_rx_burst(eth_port, 0, mbufs, BATCH_SIZE);

        for (i = 0; i < nb_rx; i++) {
                ev[i].flow_id = mbufs[i]->hash.rss;
                ev[i].op = RTE_EVENT_OP_NEW;
                ev[i].sched_type = RTE_SCHED_TYPE_ATOMIC;
                ev[i].queue_id = atomic_q_1;
                ev[i].event_type = RTE_EVENT_TYPE_ETHDEV;
                ev[i].sub_event_type = 0;
                ev[i].priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
                ev[i].mbuf = mbufs[i];
        }

        const int nb_tx = rte_event_enqueue_burst(dev_id, rx_port_id, ev, nb_rx);
        if (nb_tx != nb_rx) {
                for (i = nb_tx; i < nb_rx; i++)
                        rte_pktmbuf_free(mbufs[i]);
        }

Forwarding of Events
~~~~~~~~~~~~~~~~~~~~

Now that the RX core has injected events, there is work to be done by the
workers. Note that each worker will dequeue as many events as it can in a burst,
process each one individually, and then burst the packets back into the
eventdev.

The worker can look up the event's source from ``event.queue_id``, which should
indicate to the worker what workload needs to be performed on the event.
Once done, the worker can update the ``event.queue_id`` to a new value, to send
the event to the next stage in the pipeline.

.. code-block:: c

        int timeout = 0;
        struct rte_event events[BATCH_SIZE];
        uint16_t i;
        uint16_t nb_rx = rte_event_dequeue_burst(dev_id, worker_port_id, events, BATCH_SIZE, timeout);

        for (i = 0; i < nb_rx; i++) {
                /* process mbuf using events[i].queue_id as pipeline stage */
                struct rte_mbuf *mbuf = events[i].mbuf;
                /* Send event to next stage in pipeline */
                events[i].queue_id++;
        }

        uint16_t nb_tx = rte_event_enqueue_burst(dev_id, worker_port_id, events, nb_rx);


Egress of Events
~~~~~~~~~~~~~~~~

Finally, when the packet is ready for egress or needs to be dropped, we need
to inform the eventdev that the packet is no longer being handled by the
application. This can be done by the next call to dequeue() or dequeue_burst(),
which indicates that the previous burst of packets is no longer in use by the
application.

An event driven worker thread has the following typical workflow on the fast
path:

.. code-block:: c

       struct rte_event events[BATCH_SIZE];
       uint16_t nb;

       while (1) {
               nb = rte_event_dequeue_burst(dev_id, port_id, events, BATCH_SIZE, 0);
               /* event processing */
               rte_event_enqueue_burst(dev_id, port_id, events, nb);
       }

Quiescing Event Ports
~~~~~~~~~~~~~~~~~~~~~

To migrate an event port to another lcore, or when tearing down a worker core
that uses an event port, ``rte_event_port_quiesce()`` can be invoked to make
sure that all the data associated with the event port is released from the
worker core; this may also include any prefetched events.

A flush callback can be passed to the function to handle any outstanding events.

.. code-block:: c

        rte_event_port_quiesce(dev_id, port_id, release_cb, NULL);

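The ``release_cb`` passed above could look like this minimal sketch, which
assumes any flushed events carry mbufs (the callback matches the
``rte_eventdev_port_flush_t`` signature):

.. code-block:: c

        static void
        release_cb(uint8_t dev_id, struct rte_event ev, void *arg)
        {
                RTE_SET_USED(dev_id);
                RTE_SET_USED(arg);
                /* Drop any event still held by the port at quiesce time. */
                if (ev.event_type == RTE_EVENT_TYPE_ETHDEV)
                        rte_pktmbuf_free(ev.mbuf);
        }
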
.. Note::

        Invocation of this API does not affect the existing port configuration.

Stopping the EventDev
~~~~~~~~~~~~~~~~~~~~~

A single function call tells the eventdev instance to stop processing events.
A flush callback can be registered to free any inflight events,
using the ``rte_event_dev_stop_flush_callback_register()`` function.

.. code-block:: c

        int err = rte_event_dev_stop(dev_id);

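A minimal sketch of such a flush callback and its registration, assuming any
in-flight events carry mbufs:

.. code-block:: c

        static void
        stop_flush_cb(uint8_t dev_id, struct rte_event ev, void *arg)
        {
                RTE_SET_USED(dev_id);
                RTE_SET_USED(arg);
                if (ev.event_type == RTE_EVENT_TYPE_ETHDEV)
                        rte_pktmbuf_free(ev.mbuf);
        }

        /* Register once, before rte_event_dev_stop() is called. */
        rte_event_dev_stop_flush_callback_register(dev_id, stop_flush_cb, NULL);
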
.. Note::

        The event producers such as ``event_eth_rx_adapter``,
        ``event_timer_adapter``, ``event_crypto_adapter`` and
        ``event_dma_adapter`` need to be stopped before stopping
        the event device.

Summary
-------

The eventdev library allows an application to easily schedule events as it
requires, using either a run-to-completion or a pipeline processing model.  The
queues and ports abstract the logical functionality of an eventdev, providing
the application with a generic method to schedule events.  With the flexible
PMD infrastructure, applications benefit from improvements in existing
eventdevs and from the addition of new ones, without modification.