..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2023 Intel Corporation.

Using the AF_XDP driver in Kubernetes
=====================================

Introduction
------------

Two infrastructure components are needed in order to provision a pod
that is using the AF_XDP PMD in Kubernetes:

1. AF_XDP Device Plugin (DP).
2. AF_XDP Container Network Interface (CNI) binary.

Both of these components are available through
the `AF_XDP Device Plugin for Kubernetes`_ repository.

The AF_XDP DP provisions and advertises networking interfaces to Kubernetes,
while the CNI configures and plumbs network interfaces for the Pod.

This document explains how to use the `AF_XDP Device Plugin for Kubernetes`_
with a DPDK application using the :doc:`../nics/af_xdp`.

.. _AF_XDP Device Plugin for Kubernetes: https://github.com/redhat-et/afxdp-plugins-for-kubernetes


Background
----------

The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program
onto the kernel netdev to be used by the PMD.
This operation requires root or escalated Linux privileges
and thus prevents the PMD from working in an unprivileged container.
The AF_XDP Device Plugin handles this situation
by managing the eBPF program(s) on behalf of the Pod, outside of the pod context.

At a technical level the AF_XDP Device Plugin opens a Unix Domain Socket (UDS)
and listens for a client to make requests over that socket.
A DPDK application acting as a client connects and initiates a configuration "handshake".
After some validation on the Device Plugin side,
the client receives a file descriptor which points to the XSKMAP
associated with the loaded eBPF program.
The XSKMAP is an eBPF map of AF_XDP sockets (XSK).
The client can then proceed with creating an AF_XDP socket
and inserting that socket into the XSKMAP pointed to by the descriptor.

The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor
from the CNI.
When this flag is set,
the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
should be used when creating the socket
to instruct libbpf not to load the default libbpf program on the netdev.
Instead the loading is handled by the AF_XDP Device Plugin.

The EAL vdev argument ``use_pinned_map`` is used to indicate to the AF_XDP PMD
to retrieve the XSKMAP fd from a pinned eBPF map.
This map is expected to be pinned by an external entity like the AF_XDP Device Plugin.
This enables unprivileged pods to create and use AF_XDP sockets.
When this flag is set, the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
is used by the AF_XDP PMD when creating the AF_XDP socket.

The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` or ``use_pinned_map``
arguments to explicitly tell the AF_XDP PMD where to find either:

1. The UDS to interact with the AF_XDP Device Plugin. OR
2. The pinned xskmap to use when creating AF_XDP sockets.

If this argument is not passed alongside the ``use_cni`` or ``use_pinned_map`` arguments,
then the AF_XDP PMD configures it internally to the defaults
used by the `AF_XDP Device Plugin for Kubernetes`_.
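When the pinned map approach is used, it can be useful to confirm
that an XSKMAP has actually been pinned where the PMD expects to find it.
The commands below are only a sketch, assuming ``bpftool`` is available inside the Pod
and that the map is pinned under the ``/tmp/afxdp_dp/<interface name>/`` mount point
used in the examples later in this guide:

.. code-block:: console

   # ls /tmp/afxdp_dp/<interface name>/
   # bpftool map show pinned /tmp/afxdp_dp/<interface name>/xsks_map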
.. note::

   DPDK AF_XDP PMD <= v23.11 will only work with
   the AF_XDP Device Plugin <= commit id `38317c2`_.

.. note::

   DPDK AF_XDP PMD > v23.11 will work with the latest version of the AF_XDP Device Plugin
   through a combination of the ``dp_path`` and/or the ``use_cni`` parameter.
   In these versions of the PMD, if a user doesn't explicitly set the ``dp_path`` parameter
   when using ``use_cni``, then that path is transparently configured in the AF_XDP PMD
   to the default `AF_XDP Device Plugin for Kubernetes`_ mount point path.
   The path can be overridden by explicitly setting the ``dp_path`` parameter.

.. note::

   DPDK AF_XDP PMD > v23.11 is backwards compatible
   with (older) versions of the AF_XDP DP <= commit id `38317c2`_
   by explicitly setting ``dp_path`` to ``/tmp/afxdp.sock``.

.. _38317c2: https://github.com/redhat-et/afxdp-plugins-for-kubernetes/commit/38317c256b5c7dfb39e013a0f76010c2ded03669

Prerequisites
-------------

Device Plugin and DPDK container prerequisites:

* Create a DPDK container image.

* Set up the device plugin and prepare the Pod Spec as described in
  the instructions for the `AF_XDP Device Plugin for Kubernetes`_.

* The Docker image should contain the libbpf and libxdp libraries,
  which are dependencies for AF_XDP,
  and should include support for the ``ethtool`` command.

* The Pod should have enabled the capabilities
  ``CAP_NET_RAW`` for AF_XDP socket creation,
  ``IPC_LOCK`` for umem creation and
  ``CAP_BPF`` (for Kernel < 5.19), along with support for hugepages.

  .. note::

     For Kernel versions < 5.19, all BPF syscalls require ``CAP_BPF``
     to access maps shared between the eBPF program and the userspace program.
     Kernels >= 5.19 only require ``CAP_BPF`` for map creation (BPF_MAP_CREATE)
     and loading programs (BPF_PROG_LOAD).

* Increase the locked memory limit so containers have enough memory for packet buffers.
  For example:

  .. code-block:: console

     cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/limits.conf
     [Service]
     LimitMEMLOCK=infinity
     EOF

* The dpdk-testpmd application should have the AF_XDP feature enabled.

  For further information see the docs for the :doc:`../nics/af_xdp`.


Example
-------

Build a DPDK container image (using Docker)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Create a Dockerfile (should be placed in the top level DPDK directory):

   .. code-block:: console

      FROM fedora:38

      # Setup container to build DPDK applications
      RUN dnf -y upgrade && dnf -y install \
          libbsd-devel \
          numactl-libs \
          libbpf-devel \
          libbpf \
          meson \
          ninja-build \
          libxdp-devel \
          libxdp \
          numactl-devel \
          python3-pyelftools \
          python38 \
          iproute
      RUN dnf groupinstall -y 'Development Tools'

      # Create DPDK dir and copy over sources
      COPY ./ /dpdk
      WORKDIR /dpdk

      # Build DPDK
      RUN meson setup build
      RUN ninja -C build

2. Build a DPDK container image (using Docker)

   .. code-block:: console

      # docker build -t dpdk -f Dockerfile .
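   Optionally, sanity-check the freshly built image before deploying it,
   for example by confirming that ``dpdk-testpmd`` is linked against libbpf and libxdp.
   This is only a sketch which assumes the image was tagged ``dpdk`` as above
   and that the binaries were built under ``/dpdk/build/app`` inside the image:

   .. code-block:: console

      # docker run --rm dpdk:latest ldd /dpdk/build/app/dpdk-testpmd | grep -E 'libxdp|libbpf'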
Run dpdk-testpmd with the AF_XDP Device Plugin + CNI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Clone the AF_XDP Device Plugin and CNI

  .. code-block:: console

     # git clone https://github.com/redhat-et/afxdp-plugins-for-kubernetes.git

  .. note::

     Ensure you have the AF_XDP Device Plugin + CNI prerequisites installed.

* Build the AF_XDP Device Plugin and CNI

  .. code-block:: console

     # cd afxdp-plugins-for-kubernetes/
     # make image

* Make sure to modify the image used by the `daemonset.yml`_ file
  in the deployments directory with the following configuration:

  .. _daemonset.yml: https://github.com/redhat-et/afxdp-plugins-for-kubernetes/blob/main/deployments/daemonset.yml

  .. code-block:: yaml

     image: afxdp-device-plugin:latest

  .. note::

     This will select the AF_XDP DP image that was built locally.
     Detailed configuration options can be found in the AF_XDP Device Plugin `readme`_.

  .. _readme: https://github.com/redhat-et/afxdp-plugins-for-kubernetes#readme

* Deploy the AF_XDP Device Plugin and CNI

  .. code-block:: console

     # kubectl create -f deployments/daemonset.yml

* Create the Network Attachment Definition

  .. code-block:: console

     # kubectl create -f nad.yaml

  Sample nad.yaml:

  .. code-block:: yaml

     apiVersion: "k8s.cni.cncf.io/v1"
     kind: NetworkAttachmentDefinition
     metadata:
       name: afxdp-network
       annotations:
         k8s.v1.cni.cncf.io/resourceName: afxdp/myPool
     spec:
       config: '{
         "cniVersion": "0.3.0",
         "type": "afxdp",
         "mode": "primary",
         "logFile": "afxdp-cni.log",
         "logLevel": "debug",
         "ethtoolCmds" : ["-N -device- rx-flow-hash udp4 fn",
                          "-N -device- flow-type udp4 dst-port 2152 action 22"
                         ],
         "ipam": {
           "type": "host-local",
           "subnet": "192.168.1.0/24",
           "rangeStart": "192.168.1.200",
           "rangeEnd": "192.168.1.220",
           "routes": [
             { "dst": "0.0.0.0/0" }
           ],
           "gateway": "192.168.1.1"
         }
       }'

  For further reference please use the example provided by the AF_XDP DP `nad.yaml`_.

  .. _nad.yaml: https://github.com/redhat-et/afxdp-plugins-for-kubernetes/blob/main/examples/network-attachment-definition.yaml

* Run the Pod

  .. code-block:: console

     # kubectl create -f pod.yaml

  Sample pod.yaml:

  .. code-block:: yaml

     apiVersion: v1
     kind: Pod
     metadata:
       name: dpdk
       annotations:
         k8s.v1.cni.cncf.io/networks: afxdp-network
     spec:
       containers:
       - name: testpmd
         image: dpdk:latest
         command: ["tail", "-f", "/dev/null"]
         securityContext:
           capabilities:
             add:
               - NET_RAW
               - IPC_LOCK
         resources:
           requests:
             afxdp/myPool: '1'
           limits:
             hugepages-1Gi: 2Gi
             cpu: 2
             memory: 256Mi
             afxdp/myPool: '1'
         volumeMounts:
         - name: hugepages
           mountPath: /dev/hugepages
       volumes:
       - name: hugepages
         emptyDir:
           medium: HugePages

  For further reference please see the `pod.yaml`_ example.

  .. _pod.yaml: https://github.com/redhat-et/afxdp-plugins-for-kubernetes/blob/main/examples/pod-spec.yaml

* Run DPDK with a command like the following:

  .. code-block:: console

     kubectl exec -i <Pod name> --container <container name> -- \
           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
           --no-mlockall --in-memory \
           -- -i -a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;

  Or

  .. code-block:: console

     kubectl exec -i <Pod name> --container <container name> -- \
           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
           --vdev=net_af_xdp0,use_cni=1,iface=<interface name>,dp_path="/tmp/afxdp_dp/<interface name>/afxdp.sock" \
           --no-mlockall --in-memory \
           -- -i -a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;

  Or

  .. code-block:: console

     kubectl exec -i <Pod name> --container <container name> -- \
           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
           --vdev=net_af_xdp0,use_pinned_map=1,iface=<interface name>,dp_path="/tmp/afxdp_dp/<interface name>/xsks_map" \
           --no-mlockall --in-memory \
           -- -i -a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
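  If testpmd fails to start, it can help to first confirm
  that the Device Plugin has actually mounted its resources into the Pod.
  The check below is only a sketch,
  assuming the default ``/tmp/afxdp_dp/<interface name>/`` mount point
  used in the commands above:

  .. code-block:: console

     kubectl exec -i <Pod name> --container <container name> -- \
           ls -la /tmp/afxdp_dp/<interface name>/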
.. note::

   If the ``dp_path`` parameter isn't explicitly set with ``use_cni`` or ``use_pinned_map``,
   the AF_XDP PMD will set the parameter values
   to the `AF_XDP Device Plugin for Kubernetes`_ defaults.
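Once testpmd is running in interactive mode,
the ``net_af_xdp`` vdev can be queried from the testpmd prompt
to confirm that it initialised correctly.
This is only an illustrative check using standard testpmd commands:

.. code-block:: console

   testpmd> show port info all
   testpmd> show port stats all

The output should list the AF_XDP port along with its MAC address and link status.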