.. SPDX-License-Identifier: BSD-3-Clause
   Copyright(c) 2019-2020 Intel Corporation.

AF_XDP Poll Mode Driver
=======================

AF_XDP is an address family that is optimized for high performance
packet processing. AF_XDP sockets enable an XDP program to redirect
packets to a memory buffer in userspace.

For full details on AF_XDP sockets, refer to the
`AF_XDP documentation in the Kernel
<https://www.kernel.org/doc/Documentation/networking/af_xdp.rst>`_.

This Linux-specific PMD creates the AF_XDP socket and binds it to a
specific netdev queue. It allows a DPDK application to send and receive
raw packets through the socket, bypassing the kernel network stack.
A single netdev queue is used by default; multiple queues can be
requested with the ``start_queue`` and ``queue_count`` options described
below.

The AF_XDP PMD enables the need_wakeup flag by default when it is
supported. The need_wakeup feature allows the application and the driver
to execute efficiently on the same core. It not only provides a large
performance gain in the single-core case, but also does not degrade
two-core performance, and actually improves it for Tx-heavy workloads.

Options
-------

The following options can be provided to set up an af_xdp port in DPDK:

* ``iface`` - name of the Kernel interface to attach to (required);
* ``start_queue`` - starting netdev queue id (optional, default 0);
* ``queue_count`` - total netdev queue number (optional, default 1);
* ``shared_umem`` - PMD will attempt to share UMEM with others (optional,
  default 0);
* ``xdp_prog`` - path to custom xdp program (optional, default none);
* ``busy_budget`` - busy polling budget (optional, default 64).

Prerequisites
-------------

This is a Linux-specific PMD, thus the following prerequisites apply:

* A Linux Kernel (version > v4.18) with XDP sockets configuration enabled;
* libbpf (within kernel version > v5.1-rc4) with latest af_xdp support
  installed. Users can install libbpf via ``make install_lib`` &&
  ``make install_headers`` in <kernel src tree>/tools/lib/bpf;
* A Kernel bound interface to attach to;
* The need_wakeup feature requires a kernel version later than v5.3-rc1;
* PMD zero copy requires a kernel version later than v5.4-rc1;
* shared_umem requires kernel version v5.10 or later and libbpf version
  v0.2.0 or later;
* A 32-bit OS requires kernel version v5.4 or later;
* Busy polling requires kernel version v5.11 or later.

Set up an af_xdp interface
--------------------------

The following example will set up an af_xdp interface in DPDK:

.. code-block:: console

   --vdev net_af_xdp,iface=ens786f1
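
The ``--vdev`` string above is an EAL argument and can be passed to any
DPDK application. As a minimal sketch, assuming ``dpdk-testpmd`` is
installed and the interface ``ens786f1`` exists (the core list is
illustrative):

.. code-block:: console

   sudo dpdk-testpmd -l 0-1 --vdev net_af_xdp,iface=ens786f1 -- -i

This starts testpmd in interactive mode with the af_xdp vdev as its only
port.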
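
To attach to more than one queue with the ``start_queue`` and
``queue_count`` options, the netdev must expose at least that many
queues. A hedged example using ``ethtool`` (the ``combined`` channel type
and the queue count depend on the underlying kernel driver):

.. code-block:: console

   ethtool -L ens786f1 combined 2

and the corresponding vdev argument:

.. code-block:: console

   --vdev net_af_xdp,iface=ens786f1,start_queue=0,queue_count=2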
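
The ``xdp_prog`` option loads a custom XDP program instead of the
default one. The object file path below is a placeholder for
illustration only:

.. code-block:: console

   --vdev net_af_xdp,iface=ens786f1,xdp_prog=/path/to/xdp_prog.o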

Limitations
-----------

- **MTU**

  The MTU of the AF_XDP PMD is limited due to the XDP requirement of one
  packet per page. In the PMD we report the maximum MTU for zero copy to be
  equal to the page size less the frame overhead introduced by AF_XDP
  (XDP HR = 256) and DPDK (frame headroom = 320). With a 4K page size this
  works out at 3520. However, in practice this value may be even smaller,
  due to differences between the supported RX buffer sizes of the
  underlying kernel netdev driver.

  For example, the largest RX buffer size supported by the underlying
  kernel driver which is less than the page size (4096B) may be 3072B. In
  this case, the maximum MTU value will be at most 3072, but likely even
  smaller than this once relevant headers are accounted for, e.g. Ethernet
  and VLAN.

  To determine the actual maximum MTU value of the interface you are using
  with the AF_XDP PMD, consult the documentation for the kernel driver.

  Note: the AF_XDP PMD will fail to initialise if an MTU that violates the
  driver's constraints described above is set prior to launching the
  application.

- **Shared UMEM**

  The sharing of UMEM is only supported for AF_XDP sockets with unique
  contexts. The context refers to the netdev,qid tuple.

  The following combination will fail:

  .. code-block:: console

     --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
     --vdev net_af_xdp1,iface=ens786f1,shared_umem=1 \

  Either of the following, however, is permitted, since either the netdev
  or the qid differs between the two vdevs:

  .. code-block:: console

     --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
     --vdev net_af_xdp1,iface=ens786f1,start_queue=1,shared_umem=1 \

  .. code-block:: console

     --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
     --vdev net_af_xdp1,iface=ens786f2,shared_umem=1 \

- **Preferred Busy Polling**

  The SO_PREFER_BUSY_POLL socket option was introduced in kernel v5.11. It
  can deliver a performance improvement for sockets with heavy traffic
  loads and can significantly improve single-core performance in this
  context.

  The feature is enabled by default in the AF_XDP PMD. To disable it, set
  the ``busy_budget`` vdevarg to zero:

  .. code-block:: console

     --vdev net_af_xdp0,iface=ens786f1,busy_budget=0

  The default ``busy_budget`` is 64 and it represents the number of packets
  the kernel will attempt to process in the netdev's NAPI context. You can
  change the value, for example to 256, like so:

  .. code-block:: console

     --vdev net_af_xdp0,iface=ens786f1,busy_budget=256

  It is also strongly recommended to set the following for optimal
  performance:

  .. code-block:: console

     echo 2 | sudo tee /sys/class/net/ens786f1/napi_defer_hard_irqs
     echo 200000 | sudo tee /sys/class/net/ens786f1/gro_flush_timeout

  The above defers interrupts for interface ens786f1 and schedules its
  NAPI context from a watchdog timer instead of from softirqs. More
  information on this feature can be found at [1].

  [1] https://lwn.net/Articles/837010/
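
  As an end-to-end sketch, the tuning above can be combined with the
  ``busy_budget`` vdevarg in a single launch sequence (the interface name,
  core list and budget value are all illustrative):

  .. code-block:: console

     echo 2 | sudo tee /sys/class/net/ens786f1/napi_defer_hard_irqs
     echo 200000 | sudo tee /sys/class/net/ens786f1/gro_flush_timeout
     sudo dpdk-testpmd -l 0-1 --vdev net_af_xdp0,iface=ens786f1,busy_budget=256 -- -i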