..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2019-2020 Intel Corporation.

AF_XDP Poll Mode Driver
==========================

AF_XDP is an address family that is optimized for high performance
packet processing. AF_XDP sockets enable an XDP program to
redirect packets to a memory buffer in userspace.

For the full details behind AF_XDP sockets, refer to the
`AF_XDP documentation in the Kernel
<https://www.kernel.org/doc/Documentation/networking/af_xdp.rst>`_.

This Linux-specific PMD creates an AF_XDP socket and binds it to a
specific netdev queue. It allows a DPDK application to send and receive raw
packets through the socket, bypassing the kernel network stack.
The current implementation supports only a single queue; multi-queue support
will be added later.

The AF_XDP PMD enables the need_wakeup flag by default if it is supported. The
need_wakeup feature allows the application and the driver to run efficiently
on the same core. It has a large positive performance impact in the
single-core case, does not degrade two-core performance, and actually
improves it for Tx-heavy workloads.

Options
-------

The following options can be provided to set up an af_xdp port in DPDK.

*   ``iface`` - name of the Kernel interface to attach to (required);
*   ``start_queue`` - starting netdev queue id (optional, default 0);
*   ``queue_count`` - total netdev queue number (optional, default 1);
*   ``shared_umem`` - PMD will attempt to share UMEM with others (optional,
    default 0);
*   ``xdp_prog`` - path to a custom XDP program (optional, default none);
*   ``busy_budget`` - busy polling budget (optional, default 64).
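
As an illustration of how these options combine into a single ``--vdev``
argument, the following Python sketch assembles the string from keyword
arguments. The helper name ``build_af_xdp_vdev`` and the choice to omit options
left at their default are assumptions made for this example and are not part of
DPDK; only the option names and default values come from the list above.

.. code-block:: python

    # Hypothetical helper, not part of DPDK: builds a --vdev argument string
    # from the options listed above.
    DEFAULTS = {
        "start_queue": 0,
        "queue_count": 1,
        "shared_umem": 0,
        "xdp_prog": None,
        "busy_budget": 64,
    }


    def build_af_xdp_vdev(iface, name="net_af_xdp", **options):
        """Return the vdev string, listing only options that differ from default."""
        if not iface:
            raise ValueError("iface is required")
        unknown = set(options) - set(DEFAULTS)
        if unknown:
            raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))
        parts = [name, "iface=" + iface]
        # Emit options in the documented order, skipping values left at default.
        for key, default in DEFAULTS.items():
            value = options.get(key, default)
            if value != default:
                parts.append("{}={}".format(key, value))
        return ",".join(parts)


    print("--vdev " + build_af_xdp_vdev("ens786f1", start_queue=1, shared_umem=1))

Omitting default values keeps the generated argument short; passing them
explicitly on the command line would be equally valid.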

Prerequisites
-------------

This is a Linux-specific PMD, thus the following prerequisites apply:

*  A Linux Kernel (version > v4.18) with XDP sockets configuration enabled;
*  libbpf (within kernel version > v5.1-rc4) with the latest af_xdp support
   installed. libbpf can be installed via
   ``make install_lib && make install_headers`` in
   <kernel src tree>/tools/lib/bpf;
*  A Kernel bound interface to attach to;
*  For the need_wakeup feature, a kernel version later than v5.3-rc1 is required;
*  For PMD zero copy, a kernel version later than v5.4-rc1 is required;
*  For shared_umem, kernel version v5.10 or later and libbpf version
   v0.2.0 or later are required;
*  For a 32-bit OS, kernel version v5.4 or later is required;
*  For busy polling, kernel version v5.11 or later is required.
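
The kernel version requirements above can be checked programmatically. The
Python sketch below maps a kernel release string to the features it should
support. The feature keys and function names are invented for this example,
and two simplifications are assumed: "later than vX.Y-rc1" is treated as
"X.Y or later", and "version > v4.18" is treated as "4.19 or later". libbpf
versions are not checked.

.. code-block:: python

    import re

    # Minimum kernel (major, minor) per feature, taken from the list above.
    # The rounding of "-rc1" and "> v4.18" bounds is an assumption of this sketch.
    MIN_KERNEL = {
        "af_xdp": (4, 19),
        "need_wakeup": (5, 3),
        "zero_copy": (5, 4),
        "32bit_os": (5, 4),
        "shared_umem": (5, 10),
        "busy_polling": (5, 11),
    }


    def parse_kernel_release(release):
        """Extract (major, minor) from a string such as '5.11.0-27-generic'."""
        match = re.match(r"(\d+)\.(\d+)", release)
        if match is None:
            raise ValueError("unrecognised kernel release: " + release)
        return int(match.group(1)), int(match.group(2))


    def supported_features(release):
        """Return the sorted feature names whose kernel minimum is satisfied."""
        version = parse_kernel_release(release)
        return sorted(name for name, minimum in MIN_KERNEL.items()
                      if version >= minimum)


    print(supported_features("5.10.0-8-amd64"))

On a running system the release string could be obtained from
``platform.release()`` in the Python standard library.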

Set up an af_xdp interface
-----------------------------

The following example will set up an af_xdp interface in DPDK:

.. code-block:: console

    --vdev net_af_xdp,iface=ens786f1

Limitations
-----------

- **MTU**

  The MTU of the AF_XDP PMD is limited due to the XDP requirement of one packet
  per page. In the PMD we report the maximum MTU for zero copy to be equal
  to the page size less the frame overhead introduced by AF_XDP (XDP HR = 256)
  and DPDK (frame headroom = 320). With a 4K page size this works out at 3520.
  However, in practice this value may be even smaller, because the supported
  RX buffer sizes differ between underlying kernel netdev drivers.

  For example, the largest RX buffer size supported by the underlying kernel driver
  which is less than the page size (4096B) may be 3072B. In this case, the maximum
  MTU value will be at most 3072, but likely even smaller than this, once relevant
  headers are accounted for, e.g. Ethernet and VLAN.

  To determine the actual maximum MTU value of the interface you are using with the
  AF_XDP PMD, consult the documentation for the kernel driver.

  Note: The AF_XDP PMD will fail to initialise if an MTU which violates the driver's
  conditions as above is set prior to launching the application.
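
  The zero-copy arithmetic above can be reproduced with a short Python sketch.
  The constant and function names are illustrative and are not DPDK identifiers;
  only the values 256 and 320 and the 4K page size come from the text. The
  result is an upper bound and does not account for any tighter limit imposed
  by the kernel driver.

  .. code-block:: python

    XDP_HEADROOM = 256    # frame overhead introduced by AF_XDP (XDP HR)
    DPDK_HEADROOM = 320   # frame headroom introduced by DPDK


    def max_zero_copy_mtu(page_size=4096):
        """Page size minus both headrooms, as described above."""
        return page_size - XDP_HEADROOM - DPDK_HEADROOM


    print(max_zero_copy_mtu())  # 3520 for a 4K page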

- **Shared UMEM**

  The sharing of UMEM is only supported for AF_XDP sockets with unique contexts.
  The context refers to the netdev,qid tuple.

  The following combination will fail:

  .. code-block:: console

    --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
    --vdev net_af_xdp1,iface=ens786f1,shared_umem=1

  Either of the following, however, is permitted since either the netdev or qid
  differs between the two vdevs:

  .. code-block:: console

    --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
    --vdev net_af_xdp1,iface=ens786f1,start_queue=1,shared_umem=1

  .. code-block:: console

    --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
    --vdev net_af_xdp1,iface=ens786f2,shared_umem=1
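
  The uniqueness rule can be illustrated with a small Python check over
  (netdev, qid) tuples. This is a sketch only: the function name is
  hypothetical, and it assumes each vdev uses a single queue identified by
  ``start_queue`` (default 0).

  .. code-block:: python

    def has_unique_contexts(vdevs):
        """Return True if no two vdevs share the same (netdev, qid) context.

        Each vdev is a dict with 'iface' and an optional 'start_queue'.
        """
        contexts = [(v["iface"], v.get("start_queue", 0)) for v in vdevs]
        return len(contexts) == len(set(contexts))


    # Same netdev and same queue: the combination that fails.
    print(has_unique_contexts([{"iface": "ens786f1"}, {"iface": "ens786f1"}]))
    # Same netdev, different queue: permitted.
    print(has_unique_contexts([{"iface": "ens786f1"},
                               {"iface": "ens786f1", "start_queue": 1}]))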

- **Preferred Busy Polling**

  The SO_PREFER_BUSY_POLL socket option was introduced in kernel v5.11. It can
  deliver a performance improvement for sockets with heavy traffic loads and
  can significantly improve single-core performance in this context.

  The feature is enabled by default in the AF_XDP PMD. To disable it, set the
  'busy_budget' vdevarg to zero:

  .. code-block:: console

    --vdev net_af_xdp0,iface=ens786f1,busy_budget=0

  The default 'busy_budget' is 64 and it represents the number of packets the
  kernel will attempt to process in the netdev's NAPI context. You can change
  the value, for example to 256, like so:

  .. code-block:: console

    --vdev net_af_xdp0,iface=ens786f1,busy_budget=256

  It is also strongly recommended to set the following for optimal performance:

  .. code-block:: console

    echo 2 | sudo tee /sys/class/net/ens786f1/napi_defer_hard_irqs
    echo 200000 | sudo tee /sys/class/net/ens786f1/gro_flush_timeout

  The above defers interrupts for interface ens786f1 and schedules its
  NAPI context from a watchdog timer instead of from softirqs. More information
  on this feature can be found at [1].

  [1] https://lwn.net/Articles/837010/