..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2019-2021 Intel Corporation.

.. include:: <isonum.txt>

Packet copying using DMAdev library
===================================

Overview
--------

This sample demonstrates the basic components of a DPDK forwarding
application and shows how to use the DMAdev API to build a packet
copy application.

While forwarding, the MAC addresses are modified as follows:

* The source MAC address is replaced by the TX port MAC address

* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID

This application can be used to compare the performance of software packet
copy with copy done using a DMA device for different packet sizes.
The example prints statistics each second. The statistics show
received/sent packets and packets dropped or failed to copy.

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``dma`` sub-directory.


Running the Application
-----------------------

In order to run the hardware copy application, the copying device
needs to be bound to a user-space IO driver.

Refer to the "DMAdev library" chapter in the *DPDK Programmer's Guide* for
information on using the library.

The application accepts a number of command line options:

.. code-block:: console

    ./<build_dir>/examples/dpdk-dma [EAL options] -- [-p MASK] [-q NQ] [-s RS] [-c <sw|hw>]
        [--[no-]mac-updating] [-b BS] [-f FS] [-i SI]

where,

*   p MASK: A hexadecimal bitmask of the ports to configure (default is all)

*   q NQ: Number of Rx queues used per port, equivalent to DMA channels
    per port (default is 1)

*   c CT: Packet copy type to perform: software (sw) or hardware using
    DMA (hw) (default is hw)

*   s RS: Size of the dmadev descriptor ring for hardware copy mode or of the
    rte_ring for software copy mode (default is 2048)

*   --[no-]mac-updating: Whether the MAC addresses of packets should be changed
    or not (default is mac-updating)

*   b BS: set the DMA batch size

*   f FS: set the max frame size

*   i SI: set the interval, in seconds, between statistics prints (default is 1)

The application can be launched in various configurations depending on the
provided parameters. The app can use up to 2 lcores: one of them receives
incoming traffic and makes a copy of each packet. The second lcore then
updates the MAC address and sends the copy. If one lcore per port is used,
both operations are done sequentially. For each configuration an additional
lcore is needed, since the main lcore does not handle traffic but is
responsible for configuration, statistics printing and safe shutdown of
all ports and devices.

The application can use a maximum of 8 ports.

To run the application in a Linux environment with 3 lcores (the main lcore,
plus two forwarding cores), a single port (port 0), software copying and MAC
updating, issue the command:

.. code-block:: console

    $ ./<build_dir>/examples/dpdk-dma -l 0-2 -n 2 -- -p 0x1 --mac-updating -c sw

To run the application in a Linux environment with 2 lcores (the main lcore,
plus one forwarding core), 2 ports (ports 0 and 1), hardware copying and no MAC
updating, issue the command:

.. code-block:: console

    $ ./<build_dir>/examples/dpdk-dma -l 0-1 -n 1 -- -p 0x3 --no-mac-updating -c hw

Refer to the *DPDK Getting Started Guide* for general information on
running applications and the Environment Abstraction Layer (EAL) options.

Explanation
-----------

The following sections provide an explanation of the main components of the
code.

All DPDK library functions used in the sample code are prefixed with
``rte_`` and are explained in detail in the *DPDK API Documentation*.


The Main Function
~~~~~~~~~~~~~~~~~

The ``main()`` function performs the initialization and calls the execution
threads for each lcore.

The first task is to initialize the Environment Abstraction Layer (EAL).
The ``argc`` and ``argv`` arguments are provided to the ``rte_eal_init()``
function. The value returned is the number of parsed arguments:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Init EAL. 8<
    :end-before: >8 End of init EAL.
    :dedent: 1


The ``main()`` function also allocates a mempool to hold the mbufs
(Message Buffers) used by the application:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Allocates mempool to hold the mbufs. 8<
    :end-before: >8 End of allocates mempool to hold the mbufs.
    :dedent: 1

Mbufs are the packet buffer structure used by DPDK. They are explained in
detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*.

The ``main()`` function also initializes the ports:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Initialize each port. 8<
    :end-before: >8 End of initializing each port.
    :dedent: 1

Each port is configured using the ``port_init()`` function. The Ethernet
ports are configured with local settings using the ``rte_eth_dev_configure()``
function and the ``port_conf`` struct. RSS is enabled so that
multiple Rx queues can be used for packet receiving and copying by
multiple DMA channels per port:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Configuring port to use RSS for multiple RX queues. 8<
    :end-before: >8 End of configuring port to use RSS for multiple RX queues.
    :dedent: 1

For this example the ports are set up with the number of Rx queues provided
with the -q option and 1 Tx queue using the ``rte_eth_rx_queue_setup()``
and ``rte_eth_tx_queue_setup()`` functions.

The Ethernet port is then started:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Start device. 8<
    :end-before: >8 End of starting device.
    :dedent: 1


Finally the Rx port is set in promiscuous mode:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: RX port is set in promiscuous mode. 8<
    :end-before: >8 End of RX port is set in promiscuous mode.
    :dedent: 1


After that, the application assigns the resources needed by each port.

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Assigning each port resources. 8<
    :end-before: >8 End of assigning each port resources.
    :dedent: 1

Ring structures are assigned for exchanging packets between lcores for both SW
and HW copy modes.

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Assign ring structures for packet exchanging. 8<
    :end-before: >8 End of assigning ring structures for packet exchanging.
    :dedent: 0


When using hardware copy, each Rx queue of the port is assigned a DMA device
(``assign_dmadevs()``) using DMAdev library API functions:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Using dmadev API functions. 8<
    :end-before: >8 End of using dmadev API functions.
    :dedent: 0


The hardware device is initialized by the ``rte_dma_configure()`` and
``rte_dma_vchan_setup()`` functions using the ``rte_dma_conf`` and
``rte_dma_vchan_conf`` structs. After configuration the device is started
using the ``rte_dma_start()`` function. Each of the above operations is done in
``configure_dmadev_queue()``.

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Configuration of device. 8<
    :end-before: >8 End of configuration of device.
    :dedent: 0

If initialization is successful, memory for hardware device
statistics is allocated.

Finally the ``main()`` function starts all packet handling lcores and starts
printing stats in a loop on the main lcore. The application can be
interrupted and closed using ``Ctrl-C``. The main lcore waits for
all worker lcores to finish, deallocates resources and exits.

The functions that launch the processing lcores are described below.

The Lcores Launching Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As described above, the ``main()`` function invokes the
``start_forwarding_cores()`` function in order to start processing for each
lcore:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Start processing for each lcore. 8<
    :end-before: >8 End of starting to processfor each lcore.
    :dedent: 0

The function launches Rx/Tx processing functions on configured lcores
using ``rte_eal_remote_launch()``. The configured ports, their number
and the number of assigned lcores are stored in the user-defined
``rxtx_transmission_config`` struct:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Configuring ports and number of assigned lcores in struct. 8<
    :end-before: >8 End of configuration of ports and number of assigned lcores.
    :dedent: 0

The structure is initialized in the ``main()`` function with the values
corresponding to the port and lcore configuration provided by the user.

The Lcores Processing Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For receiving packets on each port, the ``dma_rx_port()`` function is used.
The function receives packets on each configured Rx queue. Depending on the
mode the user chose, it either enqueues packets to DMA channels and
then invokes the copy process (hardware copy), or performs a software copy of
each packet using the ``pktmbuf_sw_copy()`` function and enqueues the copies to
an rte_ring:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Receive packets on one port and enqueue to dmadev or rte_ring. 8<
    :end-before: >8 End of receive packets on one port and enqueue to dmadev or rte_ring.
    :dedent: 0

The packets are received in burst mode using the ``rte_eth_rx_burst()``
function. When using hardware copy mode the packets are enqueued in the
copying device's buffer using ``dma_enqueue_packets()``, which calls
``rte_dma_copy()``. When all received packets are in the
buffer, the copy operations are started by calling ``rte_dma_submit()``.
The ``rte_dma_copy()`` function operates on the physical address of
the packet. The ``rte_mbuf`` structure contains only the physical address of
the start of the data buffer (``buf_iova``). Thus the ``rte_pktmbuf_iova()``
API is used to get the address of the start of the data within the mbuf.

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Receive packets on one port and enqueue to dmadev or rte_ring. 8<
    :end-before: >8 End of receive packets on one port and enqueue to dmadev or rte_ring.
    :dedent: 0


Once the copies have been completed (this includes gathering the completions in
HW copy mode), the copied packets are enqueued to the ``rx_to_tx_ring``, which
is used to pass the packets to the TX function.

All completed copies are processed by the ``dma_tx_port()`` function. This
function dequeues copied packets from the ``rx_to_tx_ring``. Then the MAC
address of each packet is changed if MAC updating is enabled. After that the
copies are sent in burst mode using ``rte_eth_tx_burst()``.


.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Transmit packets from dmadev/rte_ring for one port. 8<
    :end-before: >8 End of transmitting packets from dmadev.
    :dedent: 0

The Packet Copying Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In order to perform SW packet copy, there are user-defined functions to first
copy the packet metadata (``pktmbuf_metadata_copy()``) and then the packet data
(``pktmbuf_sw_copy()``):

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Perform packet copy there is a user-defined function. 8<
    :end-before: >8 End of perform packet copy there is a user-defined function.
    :dedent: 0

The metadata in this example is copied from the ``rx_descriptor_fields1``
marker of the ``rte_mbuf`` struct up to the ``buf_len`` member.

To understand why software packet copying is done as shown
above, please refer to the "Mbuf Library" section of the
*DPDK Programmer's Guide*.
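
As a rough illustration of the metadata-then-data copy described above, the
following sketch uses a deliberately simplified, hypothetical ``toy_mbuf``
struct rather than the real ``rte_mbuf``. The struct layout, the ``toy_``
function names and the bounds check are assumptions made for this example and
are not taken from ``dmafwd.c``. It shows how ``offsetof()`` can select a
contiguous range of fields (from the first metadata field up to, but not
including, ``buf_len``) so that the destination keeps its own buffer pointer
and buffer length:

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical, simplified stand-in for struct rte_mbuf (assumption). */
    struct toy_mbuf {
        uint8_t *buf_addr;    /* destination keeps its own buffer pointer */
        uint16_t data_off;    /* offset of the packet data in the buffer */
        uint32_t packet_type; /* first field of the copied metadata range */
        uint32_t pkt_len;
        uint16_t data_len;
        uint16_t vlan_tci;    /* last field of the copied metadata range */
        uint16_t buf_len;     /* first field after the range, not copied */
    };

    /* Copy data_off plus every field from packet_type up to buf_len. */
    static inline void
    toy_metadata_copy(const struct toy_mbuf *src, struct toy_mbuf *dst)
    {
        const size_t start = offsetof(struct toy_mbuf, packet_type);
        const size_t end = offsetof(struct toy_mbuf, buf_len);

        assert(src != NULL && dst != NULL);
        dst->data_off = src->data_off;
        memcpy((uint8_t *)dst + start, (const uint8_t *)src + start,
               end - start);
    }

    /* Copy the packet bytes; returns -1 if they do not fit in a buffer. */
    static inline int
    toy_data_copy(const struct toy_mbuf *src, struct toy_mbuf *dst)
    {
        uint32_t last;

        assert(src != NULL && dst != NULL);
        last = (uint32_t)src->data_off + src->data_len;
        if (last > src->buf_len || last > dst->buf_len)
            return -1;
        memcpy(dst->buf_addr + src->data_off, src->buf_addr + src->data_off,
               src->data_len);
        return 0;
    }

A caller would invoke ``toy_metadata_copy()`` first and ``toy_data_copy()``
second, mirroring the order described for the sample. Copying a field range
instead of the whole struct leaves the destination's ``buf_addr`` and
``buf_len`` untouched, which is the property the range-based copy is meant to
preserve.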