..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2019-2021 Intel Corporation.

.. include:: <isonum.txt>

Packet copying using DMAdev library
===================================

Overview
--------

This sample is intended as a demonstration of the basic components of a DPDK
forwarding application and an example of how to use the DMAdev API to make a packet
copy application.

Also, while forwarding, the MAC addresses are affected as follows:

* The source MAC address is replaced by the TX port MAC address

* The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID

This application can be used to compare the performance of software packet
copy with copy done using a DMA device for different sizes of packets.
The example prints out statistics each second. The statistics show
received/sent packets and packets dropped or failed to copy.

Compiling the Application
-------------------------

To compile the sample application, see :doc:`compiling`.

The application is located in the ``dma`` sub-directory.


Running the Application
-----------------------

In order to run the hardware copy application, the copying device
needs to be bound to a user-space IO driver.

Refer to the :doc:`../prog_guide/dmadev` for information on using the library.

The application requires a number of command line options:

.. code-block:: console

    ./<build_dir>/examples/dpdk-dma [EAL options] -- [-p MASK] [-q NQ] [-s RS] [-c <sw|hw>]
        [--[no-]mac-updating] [-b BS] [-f FS] [-i SI]

where,

* p MASK: A hexadecimal bitmask of the ports to configure (default is all)

* q NQ: Number of Rx queues used per port, equivalent to DMA channels
  per port (default is 1)

* c CT: Performed packet copy type: software (sw) or hardware using
  DMA (hw) (default is hw)

* s RS: Size of dmadev descriptor ring for hardware copy mode or rte_ring for
  software copy mode (default is 2048)

* --[no-]mac-updating: Whether MAC address of packets should be changed
  or not (default is mac-updating)

* b BS: Set the DMA batch size

* f FS: Set the max frame size

* i SI: Set the interval, in seconds, between statistics prints (default is 1)

The application can be launched in various configurations depending on the
provided parameters. The app can use up to 2 lcores: one of them receives
incoming traffic and makes a copy of each packet. The second lcore then
updates the MAC address and sends the copy. If one lcore per port is used,
both operations are done sequentially. For each configuration, an additional
lcore is needed since the main lcore does not handle traffic but is
responsible for configuration, statistics printing and safe shutdown of
all ports and devices.

The application can use a maximum of 8 ports.

To run the application in a Linux environment with 3 lcores (the main lcore,
plus two forwarding cores), a single port (port 0), software copying and MAC
updating, issue the command:

.. code-block:: console

    $ ./<build_dir>/examples/dpdk-dma -l 0-2 -n 2 -- -p 0x1 --mac-updating -c sw

To run the application in a Linux environment with 2 lcores (the main lcore,
plus one forwarding core), 2 ports (ports 0 and 1), hardware copying and no MAC
updating, issue the command:

.. code-block:: console

    $ ./<build_dir>/examples/dpdk-dma -l 0-1 -n 1 -- -p 0x3 --no-mac-updating -c hw

Refer to the *DPDK Getting Started Guide* for general information on
running applications and the Environment Abstraction Layer (EAL) options.

Explanation
-----------

The following sections provide an explanation of the main components of the
code.

All DPDK library functions used in the sample code are prefixed with
``rte_`` and are explained in detail in the *DPDK API Documentation*.


The Main Function
~~~~~~~~~~~~~~~~~

The ``main()`` function performs the initialization and calls the execution
threads for each lcore.

The first task is to initialize the Environment Abstraction Layer (EAL).
The ``argc`` and ``argv`` arguments are provided to the ``rte_eal_init()``
function. The value returned is the number of parsed arguments:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Init EAL. 8<
    :end-before: >8 End of init EAL.
    :dedent: 1


The ``main()`` function also allocates a mempool to hold the mbufs (Message Buffers)
used by the application:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Allocates mempool to hold the mbufs. 8<
    :end-before: >8 End of allocates mempool to hold the mbufs.
    :dedent: 1

Mbufs are the packet buffer structure used by DPDK. They are explained in
detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*.

The ``main()`` function also initializes the ports:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Initialize each port. 8<
    :end-before: >8 End of initializing each port.
    :dedent: 1

Each port is configured using the ``port_init()`` function. The Ethernet
ports are configured with local settings using the ``rte_eth_dev_configure()``
function and the ``port_conf`` struct. RSS is enabled so that
multiple Rx queues can be used for packet receiving and copying by
multiple DMA channels per port:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Configuring port to use RSS for multiple RX queues. 8<
    :end-before: >8 End of configuring port to use RSS for multiple RX queues.
    :dedent: 1

For this example, the ports are set up with the number of Rx queues provided
with the ``-q`` option and 1 Tx queue using the ``rte_eth_rx_queue_setup()``
and ``rte_eth_tx_queue_setup()`` functions.

The Ethernet port is then started:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Start device. 8<
    :end-before: >8 End of starting device.
    :dedent: 1


Finally, the Rx port is set in promiscuous mode:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: RX port is set in promiscuous mode. 8<
    :end-before: >8 End of RX port is set in promiscuous mode.
    :dedent: 1


After that, the application assigns the resources needed by each port.

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Assigning each port resources. 8<
    :end-before: >8 End of assigning each port resources.
    :dedent: 1

Ring structures are assigned for exchanging packets between lcores for both SW
and HW copy modes.

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Assign ring structures for packet exchanging. 8<
    :end-before: >8 End of assigning ring structures for packet exchanging.
    :dedent: 0


When using hardware copy, each Rx queue of the port is assigned a DMA device
(``assign_dmadevs()``) using DMAdev library API functions:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Using dmadev API functions. 8<
    :end-before: >8 End of using dmadev API functions.
    :dedent: 0


The initialization of the hardware device is done by the ``rte_dma_configure()`` and
``rte_dma_vchan_setup()`` functions using the ``rte_dma_conf`` and
``rte_dma_vchan_conf`` structs. After configuration, the device is started
using the ``rte_dma_start()`` function. Each of the above operations is done in
``configure_dmadev_queue()``.

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Configuration of device. 8<
    :end-before: >8 End of configuration of device.
    :dedent: 0

If initialization is successful, memory for hardware device
statistics is allocated.

Finally, the ``main()`` function starts all packet handling lcores and starts
printing stats in a loop on the main lcore. The application can be
interrupted and closed using ``Ctrl-C``. The main lcore waits for
all worker lcores to finish, deallocates resources and exits.

The functions that launch the processing lcores are described below.

The Lcores Launching Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As described above, the ``main()`` function invokes the ``start_forwarding_cores()``
function in order to start processing for each lcore:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Start processing for each lcore. 8<
    :end-before: >8 End of starting to process for each lcore.
    :dedent: 0

The function launches Rx/Tx processing functions on configured lcores
using ``rte_eal_remote_launch()``. The configured ports, their number
and the number of assigned lcores are stored in the user-defined
``rxtx_transmission_config`` struct:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Configuring ports and number of assigned lcores in struct. 8<
    :end-before: >8 End of configuration of ports and number of assigned lcores.
    :dedent: 0

The structure is initialized in the ``main()`` function with the values
corresponding to the ports and lcores configuration provided by the user.

The Lcores Processing Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For receiving packets on each port, the ``dma_rx_port()`` function is used.
The function receives packets on each configured Rx queue. Depending on the
mode the user chose, it either enqueues packets to DMA channels and
then invokes the copy process (hardware copy), or performs a software copy of each
packet using the ``pktmbuf_sw_copy()`` function and enqueues the copies to an rte_ring:

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Receive packets on one port and enqueue to dmadev or rte_ring. 8<
    :end-before: >8 End of receive packets on one port and enqueue to dmadev or rte_ring.
    :dedent: 0

The packets are received in burst mode using the ``rte_eth_rx_burst()``
function. When using hardware copy mode, the packets are enqueued in the
copying device's buffer using ``dma_enqueue_packets()``, which calls
``rte_dma_copy()``. When all received packets are in the
buffer, the copy operations are started by calling ``rte_dma_submit()``.
The ``rte_dma_copy()`` function operates on the physical address of
the packet. The ``rte_mbuf`` structure contains only the physical address of the
start of the data buffer (``buf_iova``). Thus, the ``rte_pktmbuf_iova()`` API is
used to get the address of the start of the data within the mbuf.

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Receive packets on one port and enqueue to dmadev or rte_ring. 8<
    :end-before: >8 End of receive packets on one port and enqueue to dmadev or rte_ring.
    :dedent: 0


Once the copies have been completed (this includes gathering the completions in
HW copy mode), the copied packets are enqueued to the ``rx_to_tx_ring``, which
is used to pass the packets to the Tx function.

All completed copies are processed by the ``dma_tx_port()`` function. This function
dequeues copied packets from the ``rx_to_tx_ring``. Then, the MAC address of each packet
is changed if MAC updating is enabled. After that, the copies are sent in burst mode
using ``rte_eth_tx_burst()``.


.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Transmit packets from dmadev/rte_ring for one port. 8<
    :end-before: >8 End of transmitting packets from dmadev.
    :dedent: 0

The Packet Copying Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In order to perform SW packet copy, there are user-defined functions to first copy
the packet metadata (``pktmbuf_metadata_copy()``) and then the packet data
(``pktmbuf_sw_copy()``):

.. literalinclude:: ../../../examples/dma/dmafwd.c
    :language: c
    :start-after: Perform packet copy there is a user-defined function. 8<
    :end-before: >8 End of perform packet copy there is a user-defined function.
    :dedent: 0

The metadata in this example is copied from the ``rx_descriptor_fields1`` marker of
the ``rte_mbuf`` struct up to the ``buf_len`` member.

In order to understand why software packet copying is done as shown
above, please refer to the :doc:`../prog_guide/mbuf_lib`.
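
As a minimal sketch of the metadata-then-data copy described above, the
following self-contained C fragment uses a hypothetical ``toy_mbuf`` structure
rather than the real ``rte_mbuf``. The structure, its field layout, and the
``toy_metadata_copy()`` and ``toy_sw_copy()`` helpers are assumptions made for
illustration only; they are not part of DPDK or of ``dmafwd.c``. The sketch
shows how ``offsetof()`` can select a contiguous range of fields so that
owner-specific members are left untouched:

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define TOY_BUF_SIZE 64

    /* Hypothetical stand-in for struct rte_mbuf; NOT the real DPDK layout. */
    struct toy_mbuf {
        void *pool;           /* owner-specific, must not be copied */
        uint16_t refcnt;      /* owner-specific, must not be copied */
        uint32_t packet_type; /* first field of the copied metadata range */
        uint32_t pkt_len;
        uint16_t data_len;
        uint16_t vlan_tci;    /* last field of the copied metadata range */
        uint16_t buf_len;     /* range ends here; this field is excluded */
        uint8_t data[TOY_BUF_SIZE];
    };

    /* Copy the bytes from packet_type up to, but not including, buf_len. */
    static inline void
    toy_metadata_copy(const struct toy_mbuf *src, struct toy_mbuf *dst)
    {
        const size_t start = offsetof(struct toy_mbuf, packet_type);
        const size_t end = offsetof(struct toy_mbuf, buf_len);

        memcpy((uint8_t *)dst + start, (const uint8_t *)src + start,
               end - start);
    }

    /* Copy the metadata first, then data_len bytes of packet data. */
    static inline void
    toy_sw_copy(const struct toy_mbuf *src, struct toy_mbuf *dst)
    {
        /* The destination buffer must be able to hold the source data. */
        assert(src->data_len <= dst->buf_len);

        toy_metadata_copy(src, dst);
        memcpy(dst->data, src->data, src->data_len);
    }

In this sketch the copied range ends just before ``buf_len``, so the
destination keeps its own pool pointer, reference count and buffer size while
taking the packet type and lengths from the source.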