.. SPDX-License-Identifier: BSD-3-Clause
   Copyright (c) 2015-2020 Amazon.com, Inc. or its affiliates.
   All rights reserved.

ENA Poll Mode Driver
====================

The ENA PMD is a DPDK poll-mode driver for the Amazon Elastic
Network Adapter (ENA) family.

Supported ENA adapters
----------------------

The ENA PMD currently supports the following ENA adapters:

* ``1d0f:ec20`` - ENA VF
* ``1d0f:ec21`` - ENA VF RSERV0

Supported features
------------------

* MTU configuration
* Jumbo frames up to 9K
* IPv4/TCP/UDP checksum offload
* TSO offload
* Multiple receive and transmit queues
* RSS hash
* RSS indirection table configuration
* Low Latency Queue for Tx
* Basic and extended statistics
* LSC event notification
* Watchdog (requires handling of timers in the application)
* Device reset upon failure
* Rx interrupts

Overview
--------

The ENA driver exposes a lightweight management interface with a
minimal set of memory-mapped registers and an extendable command set
through an Admin Queue.

The driver supports a wide range of ENA adapters, is link-speed
independent (i.e., the same driver is used for 10GbE, 25GbE, 40GbE,
etc.), and negotiates and supports an extendable feature set.

ENA adapters allow high-speed and low-overhead Ethernet traffic
processing by providing a dedicated Tx/Rx queue pair per CPU core.

The ENA driver supports industry-standard TCP/IP offload features such
as checksum offload and TCP transmit segmentation offload (TSO).

Receive-side scaling (RSS) is supported for multi-core scaling.

Some of the ENA devices support a working mode called Low-latency
Queue (LLQ), which saves several more microseconds of latency.


Management Interface
--------------------

The ENA management interface is exposed by means of:

* Device Registers
* Admin Queue (AQ) and Admin Completion Queue (ACQ)

The ENA device's memory-mapped PCIe register space (MMIO registers)
is accessed only during driver initialization and is not involved
in further normal device operation.

The AQ is used for submitting management commands, and the
results/responses are reported asynchronously through the ACQ.

ENA introduces a very small set of management commands with room for
vendor-specific extensions. Most of the management operations are
framed in a generic Get/Set feature command.

The following admin queue commands are supported:

* Create I/O submission queue
* Create I/O completion queue
* Destroy I/O submission queue
* Destroy I/O completion queue
* Get feature
* Set feature
* Get statistics

Refer to ``ena_admin_defs.h`` for the list of supported Get/Set Feature
properties.

Data Path Interface
-------------------

I/O operations are based on Tx and Rx Submission Queues (Tx SQ and Rx
SQ, respectively). Each SQ has a completion queue (CQ) associated
with it.

The SQs and CQs are implemented as descriptor rings in contiguous
physical memory.

Refer to ``ena_eth_io_defs.h`` for the detailed structure of the descriptors.

The driver supports multi-queue for both Tx and Rx.

Configuration
-------------

Runtime Configuration
^^^^^^^^^^^^^^^^^^^^^

   * **llq_policy** (default 1)

     Controls whether to use the device-recommended header policy or override it:

     0 - Disable LLQ (use with extreme caution, as it leads to a huge performance
     degradation on AWS instances built with Nitro v4 onwards).

     1 - Accept the device-recommended LLQ policy (default).

     2 - Enforce the normal LLQ policy.

     3 - Enforce the large LLQ policy.


   * **miss_txc_to** (default 5)

     Number of seconds after which a Tx packet is considered missing.
     If the number of missing packets exceeds a dynamically calculated threshold,
     the driver triggers a device reset, which should be handled by the
     application. Checking for missing Tx completions happens in the driver's
     timer service. Setting this parameter to 0 disables this feature. The maximum
     allowed value is 60 seconds.

   * **control_path_poll_interval** (default 0)

     Enable polling-based functionality of the admin queues,
     eliminating the need for interrupts in the control path:

     0 - Disable (the admin queue will work in interrupt mode).

     [1..1000] - Number of milliseconds to wait between periodic inspections of the admin queues.

     **A non-zero value for this devarg is mandatory for control path functionality
     when binding ports to the uio_pci_generic kernel module, which lacks interrupt support.**


ENA Configuration Parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   * **Number of Queues**

     This is the number of queues requested upon initialization. The actual
     number of receive and transmit queues created is the minimum of
     the maximal number supported by the device and the number of queues requested.

   * **Size of Queues**

     This is the requested size of the receive/transmit queues. The actual size
     is the minimum of the requested size and the maximal receive/transmit
     queue size supported by the device.

Building DPDK
-------------

See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
instructions on how to build DPDK.

By default the ENA PMD library will be built into the DPDK library.

For configuring and using the UIO and VFIO frameworks, please also refer to :ref:`the
documentation that comes with DPDK suite <linux_gsg>`.


Supported Operating Systems
---------------------------

Any Linux distribution fulfilling the conditions described in the ``System Requirements``
section of :ref:`the DPDK documentation <linux_gsg>`; see also the *DPDK Release Notes*.

Prerequisites
-------------

#. Prepare the system as recommended by the DPDK suite. This includes environment
   variables, hugepages configuration, tool-chains and configuration.

#. ENA PMD can operate with the ``vfio-pci`` (*), ``igb_uio``, or ``uio_pci_generic`` driver.

   (*) ENAv2 hardware supports Low Latency Queue v2 (LLQv2). This feature
   reduces the latency of the packets by pushing the header directly through
   the PCI to the device, before the DMA is even triggered. For this to work
   properly, the kernel PCI driver must support write-combining (WC).
   In DPDK ``igb_uio`` it must be enabled by loading the module with the
   ``wc_activate=1`` flag (example below). However, the mainline kernel's vfio-pci
   driver doesn't have WC support yet (planned to be added).
   If vfio-pci is used, the user should follow the `AWS ENA PMD documentation
   <https://github.com/amzn/amzn-drivers/tree/master/userspace/dpdk/README.md>`_.

#. For ``igb_uio``:
   Insert the ``igb_uio`` kernel module using the command ``modprobe uio; insmod igb_uio.ko wc_activate=1``

#. For ``vfio-pci``:
   Insert the ``vfio-pci`` kernel module using the command ``modprobe vfio-pci``.
   Please make sure that ``IOMMU`` is enabled in your system,
   or use the ``vfio`` driver in ``noiommu`` mode::

     echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode

   To use ``noiommu`` mode, ``vfio-pci`` must be built with the flag
   ``CONFIG_VFIO_NOIOMMU``.

#. For ``uio_pci_generic``:
   Insert the ``uio_pci_generic`` kernel module using the command ``modprobe uio_pci_generic``.
   Make sure that the IOMMU is disabled or is in passthrough mode.
   For example: ``modprobe uio_pci_generic intel_iommu=off``.

   Note that when launching the application,
   the ``control_path_poll_interval`` devarg must be used with a non-zero value (1000 is recommended)
   as ``uio_pci_generic`` lacks interrupt support.
   The control path (admin queues) of the ENA requires poll mode
   to process command completions and asynchronous notifications from the device.
   For example: ``dpdk-app -a "00:06.0,control_path_poll_interval=1000"``.

#. Bind the intended ENA device to the ``vfio-pci``, ``igb_uio``, or ``uio_pci_generic`` module.

At this point the system should be ready to run DPDK applications. Once the
application runs to completion, the ENA can be detached from the attached module if
necessary.

**Rx interrupts support**

ENA PMD supports Rx interrupts, which can be used to wake up lcores waiting for input.
Please note that this does not work with ``igb_uio`` or ``uio_pci_generic``,
so ``vfio-pci`` must be used for this feature.

ENA handles admin interrupts and AENQ notifications on a separate interrupt.
It is possible that there won't be enough event file descriptors to
handle both admin and Rx interrupts. In that situation the Rx interrupt request
will fail.

**Note about usage on \*.metal instances**

On AWS, the metal instances support IOMMU for both arm64 and x86_64 hosts.
Note that ``uio_pci_generic`` lacks IOMMU support and cannot be used for metal instances.

* x86_64 (e.g. c5.metal, i3.metal):
  IOMMU should be disabled by default. In that situation, ``igb_uio`` can
  be used as it is, but ``vfio-pci`` should be working in no-IOMMU mode (please
  see above).

  When IOMMU is enabled, ``igb_uio`` cannot be used as it does not support this
  feature, while ``vfio-pci`` should work without any changes.
  To enable IOMMU on those hosts, please update ``GRUB_CMDLINE_LINUX`` in the file
  ``/etc/default/grub`` with the extra boot arguments below::

    iommu=1 intel_iommu=on

  Then, make the changes live by executing as root::

    # grub2-mkconfig > /boot/grub2/grub.cfg

  Finally, a reboot should result in IOMMU being enabled.

* arm64 (a1.metal):
  IOMMU should be enabled by default. Unfortunately, ``vfio-pci`` does not
  support SMMU, which is the implementation of IOMMU for the arm64 architecture, and
  ``igb_uio`` does not support IOMMU at all, so to use DPDK with ENA on those
  hosts, one must disable IOMMU. This can be done by updating
  ``GRUB_CMDLINE_LINUX`` in the file ``/etc/default/grub`` with the extra boot
  argument::

    iommu.passthrough=1

  Then, make the changes live by executing as root::

    # grub2-mkconfig > /boot/grub2/grub.cfg

  Finally, a reboot should result in IOMMU being disabled.
  Without IOMMU, ``igb_uio`` can be used as it is, but ``vfio-pci`` should be
  working in no-IOMMU mode (please see above).

Usage example
-------------

Follow the instructions available in the document
:ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` to launch
**testpmd** with Amazon ENA devices managed by librte_net_ena.

Example output:

.. code-block:: console

   [...]
   EAL: PCI device 0000:00:06.0 on NUMA socket -1
   EAL: Device 0000:00:06.0 is not NUMA-aware, defaulting socket to 0
   EAL: probe driver: 1d0f:ec20 net_ena

   Interactive-mode selected
   testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0
   testpmd: preferred mempool ops selected: ring_mp_mc
   Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.
   Configuring Port 0 (socket 0)
   Port 0: 00:00:00:11:00:01
   Checking link statuses...

   Done
   testpmd>
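
The runtime options listed under Runtime Configuration are passed as device
arguments on the EAL command line, appended to the PCI address given with ``-a``.
The following is only a sketch under stated assumptions: the ENA device is assumed
to sit at PCI address ``00:06.0`` (as in the output above), the application is
assumed to be ``dpdk-testpmd``, and the values shown are simply the documented
defaults written out explicitly:

.. code-block:: console

   dpdk-testpmd -a "00:06.0,llq_policy=1,miss_txc_to=5" -- -i

Since both values are the defaults, omitting them should give the same behavior.
Spelling them out is mainly useful as a starting point when one of them needs to
be changed.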