..  SPDX-License-Identifier: BSD-3-Clause
    Copyright (c) 2015-2020 Amazon.com, Inc. or its affiliates.
    All rights reserved.

ENA Poll Mode Driver
====================

The ENA PMD is a DPDK poll-mode driver for the Amazon Elastic
Network Adapter (ENA) family.

Supported ENA adapters
----------------------

The ENA PMD currently supports the following ENA adapters:

* ``1d0f:ec20`` - ENA VF
* ``1d0f:ec21`` - ENA VF RSERV0
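
The ``vendor:device`` pairs listed above can be checked against a PCI device
as in the following minimal sketch. It is not the driver's actual probe code;
``struct pci_id`` and ``is_listed_ena_device()`` are hypothetical names used
only for illustration.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical type: one PCI vendor/device pair. */
   struct pci_id {
       uint16_t vendor;
       uint16_t device;
   };

   /* The two adapters listed in this guide. */
   static const struct pci_id ena_ids[] = {
       { 0x1d0f, 0xec20 }, /* ENA VF */
       { 0x1d0f, 0xec21 }, /* ENA VF RSERV0 */
   };

   /* Returns true if the vendor/device pair is one of the listed adapters. */
   static bool is_listed_ena_device(uint16_t vendor, uint16_t device)
   {
       size_t i;

       for (i = 0; i < sizeof(ena_ids) / sizeof(ena_ids[0]); i++) {
           if (ena_ids[i].vendor == vendor && ena_ids[i].device == device)
               return true;
       }
       return false;
   }

With this helper, ``is_listed_ena_device(0x1d0f, 0xec20)`` is true, while a
device from another vendor with the same device ID is rejected.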

Supported features
------------------

* MTU configuration
* Jumbo frames up to 9K
* IPv4/TCP/UDP checksum offload
* TSO offload
* Multiple receive and transmit queues
* RSS hash
* RSS indirection table configuration
* Low Latency Queue for Tx
* Basic and extended statistics
* LSC event notification
* Watchdog (requires handling of timers in the application)
* Device reset upon failure
* Rx interrupts

Overview
--------

The ENA driver exposes a lightweight management interface with a
minimal set of memory mapped registers and an extendable command set
through an Admin Queue.

The driver supports a wide range of ENA adapters, is link-speed
independent (i.e., the same driver is used for 10GbE, 25GbE, 40GbE,
etc.), and it negotiates and supports an extendable feature set.

ENA adapters allow high speed and low overhead Ethernet traffic
processing by providing a dedicated Tx/Rx queue pair per CPU core.

The ENA driver supports industry standard TCP/IP offload features such
as checksum offload and TCP transmit segmentation offload (TSO).

Receive-side scaling (RSS) is supported for multi-core scaling.

Some of the ENA devices support a working mode called Low-latency
Queue (LLQ), which saves several more microseconds.

Management Interface
--------------------

The ENA management interface is exposed by means of:

* Device Registers
* Admin Queue (AQ) and Admin Completion Queue (ACQ)

The ENA device's memory-mapped PCIe register space (MMIO registers)
is accessed only during driver initialization and is not involved
in further normal device operation.

The AQ is used for submitting management commands, and the
results/responses are reported asynchronously through the ACQ.

ENA introduces a very small set of management commands with room for
vendor-specific extensions. Most of the management operations are
framed in a generic Get/Set feature command.

The following admin queue commands are supported:

* Create I/O submission queue
* Create I/O completion queue
* Destroy I/O submission queue
* Destroy I/O completion queue
* Get feature
* Set feature
* Get statistics

Refer to ``ena_admin_defs.h`` for the list of supported Get/Set Feature
properties.

Data Path Interface
-------------------

I/O operations are based on Tx and Rx Submission Queues (Tx SQ and Rx
SQ, respectively). Each SQ has a completion queue (CQ) associated
with it.

The SQs and CQs are implemented as descriptor rings in contiguous
physical memory.

Refer to ``ena_eth_io_defs.h`` for the detailed structure of the descriptors.

The driver supports multi-queue for both Tx and Rx.

Configuration
-------------

Runtime Configuration
^^^^^^^^^^^^^^^^^^^^^

   * **llq_policy** (default 1)

     Controls whether to use the device-recommended header policy or to override it:

     0 - Disable LLQ (use with extreme caution, as it leads to a huge performance
     degradation on AWS instances built with Nitro v4 onwards).

     1 - Accept the device-recommended LLQ policy (default).

     2 - Enforce the normal LLQ policy.

     3 - Enforce the large LLQ policy.

   * **miss_txc_to** (default 5)

     Number of seconds after which a Tx packet is considered missing.
     If the number of missing packets exceeds a dynamically calculated threshold,
     the driver triggers a device reset, which should be handled by the
     application. Checking for missing Tx completions happens in the driver's
     timer service. Setting this parameter to 0 disables this feature. The maximum
     allowed value is 60 seconds.

   * **control_path_poll_interval** (default 0)

     Enables polling-based operation of the admin queues,
     eliminating the need for interrupts in the control path:

     0 - Disable (the admin queue works in interrupt mode).

     [1..1000] - Number of milliseconds to wait between periodic inspections of the admin queues.

     **A non-zero value for this devarg is mandatory for control path functionality
     when binding ports to the uio_pci_generic kernel module, which lacks interrupt support.**
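
The value ranges above can be checked before launching an application, as in
the following minimal sketch. It only mirrors the documented ranges and is not
the driver's own parsing code. ``struct ena_devargs``, ``ena_devargs_valid()``
and ``ena_devargs_format()`` are hypothetical names, and the key names follow
the ``-a`` launch example given in this guide.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdio.h>

   /* Hypothetical container for the three runtime parameters. */
   struct ena_devargs {
       unsigned int llq_policy;                 /* 0..3 */
       unsigned int miss_txc_to;                /* seconds, 0..60 (0 disables) */
       unsigned int control_path_poll_interval; /* ms, 0..1000 (0 = interrupts) */
   };

   /* Checks each value against the range documented in this guide. */
   static bool ena_devargs_valid(const struct ena_devargs *args)
   {
       return args->llq_policy <= 3 &&
              args->miss_txc_to <= 60 &&
              args->control_path_poll_interval <= 1000;
   }

   /*
    * Builds a device argument string of the form
    * "<pci_addr>,llq_policy=..,miss_txc_to=..,control_path_poll_interval=..".
    * Returns the string length, or -1 on invalid values or truncation.
    */
   static int ena_devargs_format(char *buf, size_t len, const char *pci_addr,
                                 const struct ena_devargs *args)
   {
       int n;

       if (!ena_devargs_valid(args))
           return -1;

       n = snprintf(buf, len,
                    "%s,llq_policy=%u,miss_txc_to=%u,control_path_poll_interval=%u",
                    pci_addr, args->llq_policy, args->miss_txc_to,
                    args->control_path_poll_interval);
       if (n < 0 || (size_t)n >= len)
           return -1;
       return n;
   }

The resulting string is intended to be passed to the EAL ``-a`` option.
Rejecting out-of-range values up front is a design choice of this sketch; how
the driver itself reacts to such values is not described here.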


ENA Configuration Parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   * **Number of Queues**

     This is the number of queues requested upon initialization. The actual
     number of receive and transmit queues created is the minimum of
     the maximum number supported by the device and the number of queues requested.

   * **Size of Queues**

     This is the requested size of the receive/transmit queues. The actual size
     is the minimum of the requested size and the maximum receive/transmit
     queue size supported by the device.
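
Both parameters follow the same rule: the effective value is the smaller of the
request and the device limit. A minimal sketch of that rule is shown below;
``clamp_to_device_limit()`` is a hypothetical helper and not part of the
driver API.

.. code-block:: c

   #include <stdint.h>

   /*
    * Hypothetical helper: returns the value that takes effect when
    * `requested` queues (or descriptors per queue) are asked for and the
    * device supports at most `device_max`.
    */
   static inline uint16_t clamp_to_device_limit(uint16_t requested,
                                                uint16_t device_max)
   {
       return requested < device_max ? requested : device_max;
   }

For instance, requesting 64 queues on a device that supports at most 32
yields 32, while requesting 8 yields 8.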

Building DPDK
-------------

See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
instructions on how to build DPDK.

By default, the ENA PMD library is built into the DPDK library.

For configuring and using the UIO and VFIO frameworks, also refer to :ref:`the
documentation that comes with the DPDK suite <linux_gsg>`.

Supported Operating Systems
---------------------------

Any Linux distribution fulfilling the conditions described in the ``System Requirements``
section of :ref:`the DPDK documentation <linux_gsg>`. See also the *DPDK Release Notes*.

Prerequisites
-------------

#. Prepare the system as recommended by the DPDK suite. This includes environment
   variables, hugepages configuration, tool-chains and configuration.

#. The ENA PMD can operate with the ``vfio-pci`` (*), ``igb_uio``, or ``uio_pci_generic`` driver.

   (*) ENAv2 hardware supports Low Latency Queue v2 (LLQv2). This feature
   reduces packet latency by pushing the header directly through
   the PCI to the device, before the DMA is even triggered. For this to work
   properly, the kernel PCI driver must support write-combining (WC).
   In DPDK ``igb_uio``, it must be enabled by loading the module with the
   ``wc_activate=1`` flag (example below). However, the mainline kernel's ``vfio-pci``
   driver does not have WC support yet (it is planned to be added).
   If ``vfio-pci`` is used, the user should follow the `AWS ENA PMD documentation
   <https://github.com/amzn/amzn-drivers/tree/master/userspace/dpdk/README.md>`_.

#. For ``igb_uio``:
   Insert the ``igb_uio`` kernel module using the command ``modprobe uio; insmod igb_uio.ko wc_activate=1``.

#. For ``vfio-pci``:
   Insert the ``vfio-pci`` kernel module using the command ``modprobe vfio-pci``.
   Make sure that the ``IOMMU`` is enabled in your system,
   or use the ``vfio`` driver in ``noiommu`` mode::

     echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode

   To use ``noiommu`` mode, ``vfio-pci`` must be built with the
   ``CONFIG_VFIO_NOIOMMU`` flag.

#. For ``uio_pci_generic``:
   Insert the ``uio_pci_generic`` kernel module using the command ``modprobe uio_pci_generic``.
   Make sure that the IOMMU is disabled or is in passthrough mode,
   for example by booting an Intel host with the ``intel_iommu=off`` kernel parameter.

   Note that when launching the application,
   the ``control_path_poll_interval`` devarg must be used with a non-zero value (1000 is recommended)
   as ``uio_pci_generic`` lacks interrupt support.
   The control path (admin queues) of the ENA requires poll mode
   to process command completions and asynchronous notifications from the device.
   For example: ``dpdk-app -a "00:06.0,control_path_poll_interval=1000"``.

#. Bind the intended ENA device to the ``vfio-pci``, ``igb_uio``, or ``uio_pci_generic`` module.

At this point the system should be ready to run DPDK applications. Once the
application runs to completion, the ENA device can be detached from the attached module if
necessary.

**Rx interrupts support**

The ENA PMD supports Rx interrupts, which can be used to wake up lcores waiting for input.
Note that this does not work with ``igb_uio`` or ``uio_pci_generic``,
so ``vfio-pci`` must be used for this feature.

ENA handles admin interrupts and AENQ notifications on a separate interrupt.
It is possible that there will not be enough event file descriptors to
handle both admin and Rx interrupts. In that situation, the Rx interrupt request
will fail.

**Note about usage on \*.metal instances**

On AWS, the metal instances support IOMMU for both arm64 and x86_64 hosts.
Note that ``uio_pci_generic`` lacks IOMMU support and cannot be used on metal instances.

* x86_64 (e.g. c5.metal, i3.metal):
   IOMMU should be disabled by default. In that situation, ``igb_uio`` can
   be used as is, but ``vfio-pci`` should work in no-IOMMU mode (see
   above).

   When IOMMU is enabled, ``igb_uio`` cannot be used because it does not support this
   feature, while ``vfio-pci`` should work without any changes.
   To enable IOMMU on those hosts, update ``GRUB_CMDLINE_LINUX`` in the file
   ``/etc/default/grub`` with the following extra boot arguments::

    iommu=1 intel_iommu=on

   Then, apply the changes by executing as root::

    # grub2-mkconfig > /boot/grub2/grub.cfg

   Finally, a reboot should result in IOMMU being enabled.

* arm64 (a1.metal):
   IOMMU should be enabled by default. Unfortunately, ``vfio-pci`` does not
   support SMMU, which is the implementation of IOMMU for the arm64 architecture, and
   ``igb_uio`` does not support IOMMU at all, so to use DPDK with ENA on those
   hosts, one must disable IOMMU. This can be done by updating
   ``GRUB_CMDLINE_LINUX`` in the file ``/etc/default/grub`` with the extra boot
   argument::

    iommu.passthrough=1

   Then, apply the changes by executing as root::

    # grub2-mkconfig > /boot/grub2/grub.cfg

   Finally, a reboot should result in IOMMU being disabled.
   Without IOMMU, ``igb_uio`` can be used as is, but ``vfio-pci`` should
   work in no-IOMMU mode (see above).

Usage example
-------------

Follow instructions available in the document
:ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` to launch
**testpmd** with Amazon ENA devices managed by librte_net_ena.

Example output:

.. code-block:: console

   [...]
   EAL: PCI device 0000:00:06.0 on NUMA socket -1
   EAL: Device 0000:00:06.0 is not NUMA-aware, defaulting socket to 0
   EAL:   probe driver: 1d0f:ec20 net_ena

   Interactive-mode selected
   testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0
   testpmd: preferred mempool ops selected: ring_mp_mc
   Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.
   Configuring Port 0 (socket 0)
   Port 0: 00:00:00:11:00:01
   Checking link statuses...

   Done
   testpmd>