..  SPDX-License-Identifier: BSD-3-Clause
    Copyright (c) 2015-2020 Amazon.com, Inc. or its affiliates.
    All rights reserved.

ENA Poll Mode Driver
====================

The ENA PMD is a DPDK poll-mode driver for the Amazon Elastic
Network Adapter (ENA) family.

Supported ENA adapters
----------------------

The ENA PMD currently supports the following ENA adapters:

* ``1d0f:ec20`` - ENA VF
* ``1d0f:ec21`` - ENA VF RSERV0

Supported features
------------------

* MTU configuration
* Jumbo frames up to 9K
* IPv4/TCP/UDP checksum offload
* TSO offload
* Multiple receive and transmit queues
* RSS hash
* RSS indirection table configuration
* Low Latency Queue for Tx
* Basic and extended statistics
* LSC event notification
* Watchdog (requires handling of timers in the application)
* Device reset upon failure
* Rx interrupts

Overview
--------

The ENA driver exposes a lightweight management interface with a
minimal set of memory-mapped registers and an extendable command set
through an Admin Queue.

The driver supports a wide range of ENA adapters, is link-speed
independent (i.e., the same driver is used for 10GbE, 25GbE, 40GbE,
etc.), and negotiates and supports an extendable feature set.

ENA adapters allow high-speed, low-overhead Ethernet traffic
processing by providing a dedicated Tx/Rx queue pair per CPU core.

The ENA driver supports industry-standard TCP/IP offload features such
as checksum offload and TCP transmit segmentation offload (TSO).

Receive-side scaling (RSS) is supported for multi-core scaling.

Some ENA devices support an operating mode called Low-latency
Queue (LLQ), which saves several more microseconds.

Management Interface
--------------------

The ENA management interface is exposed by means of:

* Device Registers
* Admin Queue (AQ) and Admin Completion Queue (ACQ)

The ENA device's memory-mapped PCIe register space (MMIO registers)
is accessed only during driver initialization and is not involved
in further normal device operation.

The AQ is used for submitting management commands, and the
results/responses are reported asynchronously through the ACQ.

ENA introduces a very small set of management commands with room for
vendor-specific extensions. Most of the management operations are
framed in a generic Get/Set feature command.

The following admin queue commands are supported:

* Create I/O submission queue
* Create I/O completion queue
* Destroy I/O submission queue
* Destroy I/O completion queue
* Get feature
* Set feature
* Get statistics

Refer to ``ena_admin_defs.h`` for the list of supported Get/Set Feature
properties.

Data Path Interface
-------------------

I/O operations are based on Tx and Rx Submission Queues (Tx SQ and Rx
SQ, respectively). Each SQ has a completion queue (CQ) associated
with it.

The SQs and CQs are implemented as descriptor rings in contiguous
physical memory.

Refer to ``ena_eth_io_defs.h`` for the detailed structure of the descriptor.

The driver supports multi-queue for both Tx and Rx.
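
As a rough illustration of how a submission queue and its completion queue
cooperate, the following Python sketch models one queue pair as a fixed-size
ring. It is a toy model under stated assumptions and does not reflect the ENA
descriptor layout: the ``QueuePair`` class and its methods are hypothetical,
descriptors are plain Python objects, and the device side is simulated by
``device_process``.

.. code-block:: python

   class QueuePair:
       """Toy model of one submission queue (SQ) and its completion queue (CQ)."""

       def __init__(self, depth):
           self.depth = depth        # number of descriptor slots in the SQ ring
           self.sq = [None] * depth  # SQ ring storage
           self.cq = []              # completions posted by the simulated device
           self.tail = 0             # next SQ slot the driver fills
           self.dev = 0              # next SQ slot the simulated device consumes
           self.head = 0             # oldest SQ slot not yet reclaimed

       def submit(self, buf):
           """Driver side: post one descriptor; return False if the ring is full."""
           if self.tail - self.head == self.depth:
               return False
           self.sq[self.tail % self.depth] = buf
           self.tail += 1
           return True

       def device_process(self):
           """Simulated device: consume pending descriptors, post completions."""
           while self.dev != self.tail:
               self.cq.append(self.sq[self.dev % self.depth])
               self.dev += 1

       def poll_completions(self):
           """Driver side: collect completions and reclaim their SQ slots."""
           done, self.cq = self.cq, []
           self.head += len(done)
           return done

With a depth of 2, a third ``submit`` call is rejected until
``poll_completions`` has reclaimed the slots of completed descriptors, which
is why a poll-mode application has to keep polling the completion side.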

Configuration
-------------

Runtime Configuration
^^^^^^^^^^^^^^^^^^^^^

   * **llq_policy** (default 1)

     Controls whether to use the device-recommended header policy or to override it:

     0 - Disable LLQ (Use with extreme caution as it leads to a huge performance
     degradation on AWS instances built with Nitro v4 onwards).

     1 - Accept device recommended LLQ policy (Default).

     2 - Enforce normal LLQ policy.

     3 - Enforce large LLQ policy.

   * **miss_txc_to** (default 5)

     Number of seconds after which a Tx packet is considered missing.
     If the number of missing packets exceeds a dynamically calculated threshold,
     the driver triggers a device reset, which should be handled by the
     application. The check for missing Tx completions runs in the driver's
     timer service. Setting this parameter to 0 disables this feature. The maximum
     allowed value is 60 seconds.

   * **control_path_poll_interval** (default 0)

     Enable polling-based functionality of the admin queues,
     eliminating the need for interrupts in the control-path:

     0 - Disable (Admin queue will work in interrupt mode).

     [1..1000] - Number of milliseconds to wait between periodic inspections of the admin queues.

     **A non-zero value for this devarg is mandatory for control path functionality
     when binding ports to the uio_pci_generic kernel module, which lacks interrupt support.**
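
A minimal sketch of how these devargs could be range-checked and assembled
into the value passed to the EAL ``-a`` option. The helper name
``build_ena_devargs`` is hypothetical and not part of DPDK; the ranges are the
ones listed in this section, and the polling devarg is spelled as in the
launch example in the Prerequisites section.

.. code-block:: python

   # Allowed ranges, taken from the parameter descriptions in this guide.
   DEVARG_RANGES = {
       "llq_policy": (0, 3),
       "miss_txc_to": (0, 60),
       "control_path_poll_interval": (0, 1000),
   }


   def build_ena_devargs(pci_addr, **devargs):
       """Return the string for the EAL ``-a`` option (hypothetical helper)."""
       parts = [pci_addr]
       for name, value in devargs.items():
           if name not in DEVARG_RANGES:
               raise ValueError(f"unknown ENA devarg: {name}")
           low, high = DEVARG_RANGES[name]
           if not low <= value <= high:
               raise ValueError(f"{name}={value} is outside [{low}, {high}]")
           parts.append(f"{name}={value}")
       # Devargs follow the PCI address as comma-separated key=value pairs.
       return ",".join(parts)

For example, ``build_ena_devargs("00:06.0", control_path_poll_interval=1000)``
returns ``00:06.0,control_path_poll_interval=1000``, while a value such as
``miss_txc_to=61`` is rejected before the application is launched.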

ENA Configuration Parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   * **Number of Queues**

     This is the requested number of queues upon initialization. However, the actual
     number of receive and transmit queues to be created will be the minimum of
     the maximum number supported by the device and the number of queues requested.

   * **Size of Queues**

     This is the requested size of the receive/transmit queues. The actual size
     will be the minimum of the requested size and the maximum receive/transmit
     queue size supported by the device.
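
The clamping rule for both parameters can be sketched as follows. The function
name and the numbers in the usage note are illustrative assumptions; in a real
application the device limits would come from the capabilities the driver
reports, for instance through ``rte_eth_dev_info_get()``.

.. code-block:: python

   def effective_queue_config(requested_queues, requested_size,
                              device_max_queues, device_max_size):
       """Clamp the requested queue count and size to the device limits."""
       queues = min(requested_queues, device_max_queues)
       size = min(requested_size, device_max_size)
       return queues, size

Requesting 16 queues of 4096 descriptors on a device that is assumed to allow
at most 8 queues of 1024 descriptors yields ``(8, 1024)``; a request within
the limits is returned unchanged.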

Building DPDK
-------------

See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
instructions on how to build DPDK.

By default the ENA PMD library will be built into the DPDK library.

For configuring and using UIO and VFIO frameworks, please also refer to :ref:`the
documentation that comes with DPDK suite <linux_gsg>`.

Supported Operating Systems
---------------------------

Any Linux distribution fulfilling the conditions described in the ``System Requirements``
section of :ref:`the DPDK documentation <linux_gsg>`. See also the *DPDK Release Notes*.

Prerequisites
-------------

#. Prepare the system as recommended by the DPDK suite. This includes environment
   variables, hugepages configuration, tool-chains and configuration.

#. The ENA PMD can operate with the ``vfio-pci`` (*), ``igb_uio``, or ``uio_pci_generic`` driver.

   (*) ENAv2 hardware supports Low Latency Queue v2 (LLQv2). This feature
   reduces packet latency by pushing the header directly through PCI to the
   device, before the DMA is even triggered. For LLQv2 to work properly, the
   kernel PCI driver must support write-combining (WC).
   In the DPDK ``igb_uio`` module, WC must be enabled by loading the module with
   the ``wc_activate=1`` flag (example below). However, the mainline kernel's
   ``vfio-pci`` driver doesn't have WC support yet (planned to be added).
   If ``vfio-pci`` is used, the user should follow the `AWS ENA PMD documentation
   <https://github.com/amzn/amzn-drivers/tree/master/userspace/dpdk/README.md>`_.

#. For ``igb_uio``:
   Insert the ``igb_uio`` kernel module using the command ``modprobe uio; insmod igb_uio.ko wc_activate=1``.

#. For ``vfio-pci``:
   Insert the ``vfio-pci`` kernel module using the command ``modprobe vfio-pci``.
   Please make sure that the IOMMU is enabled in your system,
   or use the ``vfio`` driver in ``noiommu`` mode::

     echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode

   To use ``noiommu`` mode, the ``vfio-pci`` module must be built with the
   ``CONFIG_VFIO_NOIOMMU`` flag.

#. For ``uio_pci_generic``:
   Insert the ``uio_pci_generic`` kernel module using the command ``modprobe uio_pci_generic``.
   Make sure that the IOMMU is disabled or is in passthrough mode.
   For example, on Intel x86_64 hosts the IOMMU can be disabled with the
   ``intel_iommu=off`` kernel boot argument.

   Note that when launching the application,
   the ``control_path_poll_interval`` devarg must be used with a non-zero value (1000 is recommended)
   as ``uio_pci_generic`` lacks interrupt support.
   The control path (admin queues) of the ENA requires poll mode
   to process command completions and asynchronous notifications from the device.
   For example: ``dpdk-app -a "00:06.0,control_path_poll_interval=1000"``.

#. Bind the intended ENA device to the ``vfio-pci``, ``igb_uio``, or ``uio_pci_generic`` module.

At this point the system should be ready to run DPDK applications. Once the
application runs to completion, the ENA device can be detached from the bound
module if necessary.

**Rx interrupts support**

The ENA PMD supports Rx interrupts, which can be used to wake up lcores waiting for input.
Please note that this feature does not work with ``igb_uio`` or ``uio_pci_generic``,
so ``vfio-pci`` must be used to enable it.

The ENA handles admin interrupts and AENQ notifications on a separate interrupt.
It is possible that there won't be enough event file descriptors to
handle both admin and Rx interrupts. In that situation, the Rx interrupt request
will fail.

**Note about usage on \*.metal instances**

On AWS, the metal instances support IOMMU on both arm64 and x86_64 hosts.
Note that ``uio_pci_generic`` lacks IOMMU support and cannot be used on metal instances.

* x86_64 (e.g. c5.metal, i3.metal):
   IOMMU should be disabled by default. In that situation, ``igb_uio`` can
   be used as is, but ``vfio-pci`` must operate in no-IOMMU mode (please
   see above).

   When IOMMU is enabled, ``igb_uio`` cannot be used because it does not support
   IOMMU, while ``vfio-pci`` should work without any changes.
   To enable IOMMU on those hosts, please update ``GRUB_CMDLINE_LINUX`` in the file
   ``/etc/default/grub`` with the extra boot arguments below::

    iommu=1 intel_iommu=on

   Then, make the changes live by executing as root::

    # grub2-mkconfig > /boot/grub2/grub.cfg

   Finally, a reboot should result in IOMMU being enabled.

* arm64 (a1.metal):
   IOMMU should be enabled by default. Unfortunately, ``vfio-pci`` does not
   support SMMU, which is the implementation of IOMMU for the arm64 architecture,
   and ``igb_uio`` does not support IOMMU at all, so to use DPDK with ENA on those
   hosts, one must disable IOMMU. This can be done by updating
   ``GRUB_CMDLINE_LINUX`` in the file ``/etc/default/grub`` with the extra boot
   argument::

    iommu.passthrough=1

   Then, make the changes live by executing as root::

    # grub2-mkconfig > /boot/grub2/grub.cfg

   Finally, a reboot should result in IOMMU being disabled.
   Without IOMMU, ``igb_uio`` can be used as is, but ``vfio-pci`` must
   operate in no-IOMMU mode (please see above).
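
The rules in this note can be summarized as a small decision function. This is
only a sketch that encodes the statements above; the function name and the
returned labels are hypothetical, and it does not probe the actual host.

.. code-block:: python

   def usable_kernel_modules(arch, iommu_enabled):
       """Kernel modules usable with the ENA PMD on a *.metal instance.

       uio_pci_generic is never returned because, per the note above, it
       cannot be used on metal instances.
       """
       if arch not in ("x86_64", "arm64"):
           raise ValueError(f"unsupported architecture: {arch}")
       if not iommu_enabled:
           # igb_uio works as is; vfio-pci needs no-IOMMU mode.
           return ["igb_uio", "vfio-pci (no-IOMMU mode)"]
       if arch == "x86_64":
           # igb_uio does not support IOMMU; vfio-pci works unchanged.
           return ["vfio-pci"]
       # arm64 with IOMMU (SMMU) enabled: neither module is usable, so the
       # IOMMU has to be disabled first.
       return []

An empty result for ``usable_kernel_modules("arm64", True)`` corresponds to the
arm64 case above, where IOMMU must be disabled before DPDK can be used with ENA.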

Usage example
-------------

Follow instructions available in the document
:ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` to launch
**testpmd** with Amazon ENA devices managed by librte_net_ena.

Example output:

.. code-block:: console

   [...]
   EAL: PCI device 0000:00:06.0 on NUMA socket -1
   EAL: Device 0000:00:06.0 is not NUMA-aware, defaulting socket to 0
   EAL:   probe driver: 1d0f:ec20 net_ena

   Interactive-mode selected
   testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0
   testpmd: preferred mempool ops selected: ring_mp_mc
   Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.
   Configuring Port 0 (socket 0)
   Port 0: 00:00:00:11:00:01
   Checking link statuses...

   Done
   testpmd>
304