.. SPDX-License-Identifier: BSD-3-Clause
   Copyright(c) 2023 Intel Corporation.

Using the AF_XDP driver in Kubernetes
=====================================

Introduction
------------

Two infrastructure components are needed in order to provision a pod
that is using the AF_XDP PMD in Kubernetes:

1. AF_XDP Device Plugin (DP).
2. AF_XDP Container Network Interface (CNI) binary.

Both of these components are available through
the `AF_XDP Device Plugin for Kubernetes`_ repository.

The AF_XDP DP provisions and advertises networking interfaces to Kubernetes,
while the CNI configures and plumbs network interfaces for the Pod.

This document explains how to use the `AF_XDP Device Plugin for Kubernetes`_
with a DPDK application using the :doc:`../nics/af_xdp`.

.. _AF_XDP Device Plugin for Kubernetes: https://github.com/redhat-et/afxdp-plugins-for-kubernetes


Background
----------

The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program
onto the kernel netdev to be used by the PMD.
This operation requires root or escalated Linux privileges
and thus prevents the PMD from working in an unprivileged container.
The AF_XDP Device Plugin handles this situation
by managing the eBPF program(s) on behalf of the Pod, outside of the pod context.

At a technical level the AF_XDP Device Plugin opens a Unix Domain Socket (UDS)
and listens for a client to make requests over that socket.
A DPDK application acting as a client connects and initiates a configuration "handshake".
After some validation on the Device Plugin side,
the client receives a file descriptor which points to the XSKMAP
associated with the loaded eBPF program.
The XSKMAP is an eBPF map of AF_XDP sockets (XSK).
The client can then proceed with creating an AF_XDP socket
and inserting that socket into the XSKMAP pointed to by the descriptor.

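
The descriptor hand-over relies on a general Unix mechanism:
an open file descriptor can be passed to another process
as ancillary data (``SCM_RIGHTS``) on a UDS.
The following Python sketch (Python 3.9 or later, Unix only) illustrates only that mechanism.
It is not the AF_XDP PMD or Device Plugin code:
it assumes the descriptor travels as ``SCM_RIGHTS`` data,
uses a ``socketpair`` in place of the Device Plugin socket
and a pipe in place of the XSKMAP, and omits the handshake messages.
The helper names ``send_fd``, ``recv_fd`` and ``demo`` are hypothetical.

.. code-block:: python

   import os
   import socket


   def send_fd(sock, fd, payload=b"fd"):
       """Send one open file descriptor, with a small payload, over a Unix socket."""
       socket.send_fds(sock, [payload], [fd])


   def recv_fd(sock, bufsize=64):
       """Receive a payload and exactly one file descriptor from a Unix socket."""
       payload, fds, _flags, _addr = socket.recv_fds(sock, bufsize, 1)
       if len(fds) != 1:
           raise RuntimeError("expected exactly one file descriptor")
       return payload, fds[0]


   def demo():
       """Pass the read end of a pipe across a socketpair and read through the copy."""
       # Stand-ins (assumptions): socketpair for the plugin UDS, pipe for the XSKMAP.
       plugin_side, client_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
       read_end, write_end = os.pipe()
       try:
           send_fd(plugin_side, read_end)
           payload, received = recv_fd(client_side)
           os.write(write_end, b"hello")
           data = os.read(received, 5)
           os.close(received)
           return payload, data
       finally:
           os.close(read_end)
           os.close(write_end)
           plugin_side.close()
           client_side.close()

The received descriptor is a new number in the receiving process
that refers to the same open kernel object as the sender's descriptor.
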
The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor
from the CNI.
When this flag is set,
the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
should be used when creating the socket
to instruct libbpf not to load the default libbpf program on the netdev.
Instead the loading is handled by the AF_XDP Device Plugin.

The EAL vdev argument ``use_pinned_map`` is used to indicate to the AF_XDP PMD
that it should retrieve the XSKMAP fd from a pinned eBPF map.
This map is expected to be pinned by an external entity like the AF_XDP Device Plugin.
This enables unprivileged pods to create and use AF_XDP sockets.
When this flag is set, the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
is used by the AF_XDP PMD when creating the AF_XDP socket.

The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` or ``use_pinned_map``
arguments to explicitly tell the AF_XDP PMD where to find either:

1. The UDS to interact with the AF_XDP Device Plugin, or
2. The pinned XSKMAP to use when creating AF_XDP sockets.

If this argument is not passed alongside the ``use_cni`` or ``use_pinned_map`` arguments
then the AF_XDP PMD sets it internally to the default path
used by the `AF_XDP Device Plugin for Kubernetes`_.

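
The three arguments combine into a single ``--vdev`` string.
The following Python sketch shows one way to assemble it.
The helper ``build_af_xdp_vdev`` and the interface name ``eth0`` are hypothetical,
and rejecting ``use_cni`` together with ``use_pinned_map`` is an assumption
made for this sketch, not a documented rule.

.. code-block:: python

   def build_af_xdp_vdev(iface, use_cni=False, use_pinned_map=False,
                         dp_path=None, name="net_af_xdp0"):
       """Return a ``--vdev`` value for an AF_XDP port (illustrative helper)."""
       if use_cni and use_pinned_map:
           # Assumption: only one way of obtaining the XSKMAP is chosen per port.
           raise ValueError("choose either use_cni or use_pinned_map")
       if dp_path is not None and not (use_cni or use_pinned_map):
           # dp_path is described as used alongside use_cni or use_pinned_map.
           raise ValueError("dp_path requires use_cni or use_pinned_map")
       args = [name]
       if use_cni:
           args.append("use_cni=1")
       if use_pinned_map:
           args.append("use_pinned_map=1")
       args.append(f"iface={iface}")
       if dp_path is not None:
           args.append(f"dp_path={dp_path}")
       return ",".join(args)


   # Rely on the default path chosen by the PMD.
   print(build_af_xdp_vdev("eth0", use_cni=True))
   # Point the PMD at a pinned map explicitly.
   print(build_af_xdp_vdev("eth0", use_pinned_map=True,
                           dp_path="/tmp/afxdp_dp/eth0/xsks_map"))
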
.. note::

   DPDK AF_XDP PMD <= v23.11 will only work with
   the AF_XDP Device Plugin <= commit id `38317c2`_.

.. note::

   DPDK AF_XDP PMD > v23.11 will work with the latest version of the AF_XDP Device Plugin
   through a combination of the ``dp_path`` and/or the ``use_cni`` parameter.
   In these versions of the PMD, if a user doesn't explicitly set the ``dp_path`` parameter
   when using ``use_cni``, then that path is transparently configured in the AF_XDP PMD
   to the default `AF_XDP Device Plugin for Kubernetes`_ mount point path.
   The path can be overridden by explicitly setting the ``dp_path`` parameter.

.. note::

   DPDK AF_XDP PMD > v23.11 is backwards compatible
   with (older) versions of the AF_XDP DP <= commit id `38317c2`_
   by explicitly setting ``dp_path`` to ``/tmp/afxdp.sock``.

.. _38317c2: https://github.com/redhat-et/afxdp-plugins-for-kubernetes/commit/38317c256b5c7dfb39e013a0f76010c2ded03669

Prerequisites
-------------

Device Plugin and DPDK container prerequisites:

* Create a DPDK container image.

* Set up the device plugin and prepare the Pod Spec as described in
  the instructions for `AF_XDP Device Plugin for Kubernetes`_.

* The Docker image should contain the libbpf and libxdp libraries,
  which are dependencies for AF_XDP,
  and should include support for the ``ethtool`` command.

* The Pod should have the following capabilities enabled:
  ``CAP_NET_RAW`` for AF_XDP socket creation,
  ``IPC_LOCK`` for umem creation and
  ``CAP_BPF`` (for Kernel < 5.19), along with support for hugepages.

  .. note::

     For kernel versions < 5.19, all BPF syscalls required ``CAP_BPF``
     to access maps shared between the eBPF program and the userspace program.
     Kernels >= 5.19 only require ``CAP_BPF`` for map creation (``BPF_MAP_CREATE``)
     and loading programs (``BPF_PROG_LOAD``).

* Increase the locked memory limit so containers have enough memory for packet buffers.
  For example:

  .. code-block:: console

     cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf
     [Service]
     LimitMEMLOCK=infinity
     EOF

* The dpdk-testpmd application should have the AF_XDP feature enabled.

  For further information see the docs for the :doc:`../nics/af_xdp`.


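
The capability and locked-memory prerequisites can be checked with a short script.
The following Python sketch is illustrative only:
the helper names are hypothetical, and the capability names are written
without the ``CAP_`` prefix on the assumption that they will be placed
in a Pod ``securityContext``.

.. code-block:: python

   import re
   import resource


   def required_capabilities(kernel_release):
       """Return the Pod capabilities listed above for a kernel release string."""
       match = re.match(r"(\d+)\.(\d+)", kernel_release)
       if match is None:
           raise ValueError(f"unrecognised kernel release: {kernel_release!r}")
       version = (int(match.group(1)), int(match.group(2)))
       caps = ["NET_RAW", "IPC_LOCK"]
       if version < (5, 19):
           # Before 5.19, access to the shared maps also needs CAP_BPF.
           caps.append("BPF")
       return caps


   def memlock_is_unlimited():
       """Report whether the soft RLIMIT_MEMLOCK of this process is unlimited."""
       soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
       return soft == resource.RLIM_INFINITY

On the node, ``required_capabilities(platform.release())`` (after ``import platform``)
gives the list for the running kernel.
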
Example
-------

Build a DPDK container image (using Docker)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Create a Dockerfile (should be placed in top level DPDK directory):

   .. code-block:: dockerfile

      FROM fedora:38

      # Setup container to build DPDK applications
      RUN dnf -y upgrade && dnf -y install \
          libbsd-devel \
          numactl-libs \
          libbpf-devel \
          libbpf \
          meson \
          ninja-build \
          libxdp-devel \
          libxdp \
          numactl-devel \
          python3-pyelftools \
          python38 \
          iproute
      RUN dnf groupinstall -y 'Development Tools'

      # Create DPDK dir and copy over sources
      COPY ./ /dpdk
      WORKDIR /dpdk

      # Build DPDK
      RUN meson setup build
      RUN ninja -C build

2. Build a DPDK container image (using Docker),
   passing the current directory as the build context:

   .. code-block:: console

      # docker build -t dpdk -f Dockerfile .

Run dpdk-testpmd with the AF_XDP Device Plugin + CNI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Clone the AF_XDP Device plugin and CNI

  .. code-block:: console

     # git clone https://github.com/redhat-et/afxdp-plugins-for-kubernetes.git

  .. note::

     Ensure you have the AF_XDP Device Plugin + CNI prerequisites installed.

* Build the AF_XDP Device plugin and CNI

  .. code-block:: console

     # cd afxdp-plugins-for-kubernetes/
     # make image

* Make sure to modify the image used by the `daemonset.yml`_ file
  in the deployments directory with the following configuration:

  .. _daemonset.yml: https://github.com/redhat-et/afxdp-plugins-for-kubernetes/blob/main/deployments/daemonset.yml

  .. code-block:: yaml

     image: afxdp-device-plugin:latest

  .. note::

     This will select the AF_XDP DP image that was built locally.
     Detailed configuration options can be found in the AF_XDP Device Plugin `readme`_.

  .. _readme: https://github.com/redhat-et/afxdp-plugins-for-kubernetes#readme

* Deploy the AF_XDP Device Plugin and CNI

  .. code-block:: console

     # kubectl create -f deployments/daemonset.yml

* Create the Network Attachment definition

  .. code-block:: console

     # kubectl create -f nad.yaml

  Sample nad.yaml:

  .. code-block:: yaml

     apiVersion: "k8s.cni.cncf.io/v1"
     kind: NetworkAttachmentDefinition
     metadata:
       name: afxdp-network
       annotations:
         k8s.v1.cni.cncf.io/resourceName: afxdp/myPool
     spec:
       config: '{
           "cniVersion": "0.3.0",
           "type": "afxdp",
           "mode": "primary",
           "logFile": "afxdp-cni.log",
           "logLevel": "debug",
           "ethtoolCmds" : ["-N -device- rx-flow-hash udp4 fn",
                            "-N -device- flow-type udp4 dst-port 2152 action 22"
                         ],
           "ipam": {
             "type": "host-local",
             "subnet": "192.168.1.0/24",
             "rangeStart": "192.168.1.200",
             "rangeEnd": "192.168.1.220",
             "routes": [
               { "dst": "0.0.0.0/0" }
             ],
             "gateway": "192.168.1.1"
           }
         }'

  For further reference please use the example provided by the AF_XDP DP `nad.yaml`_.

  .. _nad.yaml: https://github.com/redhat-et/afxdp-plugins-for-kubernetes/blob/main/examples/network-attachment-definition.yaml

* Run the Pod

  .. code-block:: console

     # kubectl create -f pod.yaml

  Sample pod.yaml:

  .. code-block:: yaml

     apiVersion: v1
     kind: Pod
     metadata:
       name: dpdk
       annotations:
         k8s.v1.cni.cncf.io/networks: afxdp-network
     spec:
       containers:
       - name: testpmd
         image: dpdk:latest
         command: ["tail", "-f", "/dev/null"]
         securityContext:
           capabilities:
             add:
               - NET_RAW
               - IPC_LOCK
         resources:
           requests:
             afxdp/myPool: '1'
           limits:
             hugepages-1Gi: 2Gi
             cpu: 2
             memory: 256Mi
             afxdp/myPool: '1'
         volumeMounts:
         - name: hugepages
           mountPath: /dev/hugepages
       volumes:
       - name: hugepages
         emptyDir:
           medium: HugePages

  For further reference please see the `pod.yaml`_.

  .. _pod.yaml: https://github.com/redhat-et/afxdp-plugins-for-kubernetes/blob/main/examples/pod-spec.yaml

* Run DPDK with a command like the following:

  .. code-block:: console

     kubectl exec -i <Pod name> --container <container name> -- \
           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
           --no-mlockall --in-memory \
           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;

  Or, with an explicit path to the Device Plugin UDS:

  .. code-block:: console

     kubectl exec -i <Pod name> --container <container name> -- \
           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
           --vdev=net_af_xdp0,use_cni=1,iface=<interface name>,dp_path="/tmp/afxdp_dp/<interface name>/afxdp.sock" \
           --no-mlockall --in-memory \
           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;

  Or, using a pinned XSKMAP:

  .. code-block:: console

     kubectl exec -i <Pod name> --container <container name> -- \
           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
           --vdev=net_af_xdp0,use_pinned_map=1,iface=<interface name>,dp_path="/tmp/afxdp_dp/<interface name>/xsks_map" \
           --no-mlockall --in-memory \
           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;

.. note::

   If the ``dp_path`` parameter isn't explicitly set with ``use_cni`` or ``use_pinned_map``,
   the AF_XDP PMD will set the parameter value to the `AF_XDP Device Plugin for Kubernetes`_ default.
343