..  SPDX-License-Identifier: BSD-3-Clause
    Copyright (c) 2017, Cisco Systems, Inc.
    All rights reserved.

ENIC Poll Mode Driver
=====================

ENIC PMD is the DPDK poll-mode driver for the Cisco Systems, Inc. VIC Ethernet
NICs. These adapters are also referred to as vNICs below. If you are running
or would like to run DPDK software applications on Cisco UCS servers using
Cisco VIC adapters, the following documentation is relevant.

Supported Cisco VIC adapters
----------------------------

ENIC PMD supports all recent generations of Cisco VIC adapters including:

- VIC 1200 series
- VIC 1300 series
- VIC 1400/14000 series
- VIC 15000 series

Supported features
------------------

- Unicast, multicast and broadcast transmission and reception
- Receive queue polling
- Port Hardware Statistics
- Hardware VLAN acceleration
- IP checksum offload
- Receive side VLAN stripping
- Multiple receive and transmit queues
- Promiscuous mode
- Setting RX VLAN (supported via UCSM/CIMC only)
- VLAN filtering (supported via UCSM/CIMC only)
- Execution of application by unprivileged system users
- IPv4, IPv6 and TCP RSS hashing
- UDP RSS hashing
  (1400 series and later adapters)
- Scattered Rx
- MTU update
- SR-IOV virtual function
- Flow API
- Overlay offload

  - Rx/Tx checksum offloads for VXLAN, NVGRE, GENEVE
  - TSO for VXLAN and GENEVE packets
  - Inner RSS

How to obtain ENIC PMD integrated DPDK
--------------------------------------

ENIC PMD support is integrated into the DPDK suite. dpdk-<version>.tar.gz
should be downloaded from https://core.dpdk.org/download/


Configuration information
-------------------------

- **vNIC Configuration Parameters**

  - **Number of Queues**

    The maximum numbers of receive queues (RQs), work queues (WQs) and
    completion queues (CQs) are configurable on a per-vNIC basis
    through the Cisco UCS Manager (CIMC or UCSM).

    These values should be configured as follows:

    - The number of WQs should be greater than or equal to the value of the
      expected nb_tx_q parameter in the call to
      rte_eth_dev_configure().

    - The number of RQs configured in the vNIC should be greater than or
      equal to *twice* the value of the expected nb_rx_q parameter in
      the call to rte_eth_dev_configure().
      With the addition of Rx
      scatter, a pair of RQs on the vNIC is needed for each receive
      queue used by DPDK, even if Rx scatter is not being used.
      Having a vNIC with only 1 RQ is not a valid configuration, and
      will fail with an error message.

    - The number of CQs should be set so that there is one CQ for each
      WQ, and one CQ for each pair of RQs.

    For example: If the application requires 3 Rx queues, and 3 Tx
    queues, the vNIC should be configured to have at least 3 WQs, 6
    RQs (3 pairs), and 6 CQs (3 for use by WQs + 3 for use by the 3
    pairs of RQs).

  - **Size of Queues**

    Likewise, the numbers of receive and transmit descriptors are configurable on
    a per-vNIC basis via the UCS Manager and should be greater than or equal to
    the nb_rx_desc and nb_tx_desc parameters expected to be used in the calls
    to rte_eth_rx_queue_setup() and rte_eth_tx_queue_setup() respectively.
    An application requesting more than the set size will be limited to that
    size.

    Unless there is a lack of resources due to creating many vNICs, it
    is recommended that the WQ and RQ sizes be set to the maximum. This
    gives the application the greatest amount of flexibility in its
    queue configuration.

    - *Note*: Since the introduction of Rx scatter, for performance
      reasons, this PMD uses two RQs on the vNIC per receive queue in
      DPDK. One RQ holds descriptors for the start of a packet, and the
      second RQ holds the descriptors for the rest of the fragments of
      a packet. This means that the nb_rx_desc parameter to
      rte_eth_rx_queue_setup() can be greater than 4096. The exact
      amount will depend on the size of the mbufs being used for
      receives, and the MTU size.

      For example: If the mbuf size is 2048, and the MTU is 9000, then
      receiving a full size packet will take 5 descriptors, 1 from the
      start-of-packet queue, and 4 from the second queue. Assuming
      that the RQ size was set to the maximum of 4096, then the
      application can specify up to 1024 + 4096 as the nb_rx_desc
      parameter to rte_eth_rx_queue_setup().

  - **Interrupts**

    At least one interrupt per vNIC interface should be configured in the UCS
    manager regardless of the number of receive/transmit queues. The ENIC PMD
    uses this interrupt to get information about link status and errors
    in the fast path.

    In addition to the interrupt for link status and errors, when using Rx queue
    interrupts, increase the number of configured interrupts so that there is at
    least one interrupt for each Rx queue.
    For example, if the app uses 3 Rx
    queues and wants to use per-queue interrupts, configure 4 (3 + 1) interrupts.

  - **Receive Side Scaling**

    In order to fully utilize RSS in DPDK, enable all RSS related settings in
    CIMC or UCSM. These include the following items listed under
    Receive Side Scaling:
    TCP, IPv4, TCP-IPv4, IPv6, TCP-IPv6, IPv6 Extension, TCP-IPv6 Extension.


SR-IOV Virtual Function
-----------------------

VIC 1400 and later series adapters support SR-IOV.
It can be enabled via both UCSM and CIMC.
Please refer to the following guides to enable SR-IOV virtual functions (VFs).

- CIMC: `Managing vNICs <https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/sw/gui/config/guide/4_3/b_cisco_ucs_c-series_gui_configuration_guide_43/b_Cisco_UCS_C-series_GUI_Configuration_Guide_41_chapter_01011.html#d77871e5874a1635>`_

- UCSM: `Configuring SRIOV HPN Connection Policies <https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/ucs-manager/GUI-User-Guides/Network-Mgmt/4-3/b_UCSM_Network_Mgmt_Guide_4_3/b_UCSM_Network_Mgmt_Guide_chapter_01010.html#d21438e9555a1635>`_

Note that the previous SR-IOV implementation that is tied to VM-FEX
(Cisco Virtual Machine Fabric Extender) has been discontinued,
and ENIC PMD no longer supports it.
The current SR-IOV implementation does not require the Fabric Interconnect (FI),
as layer 2 switching is done within the VIC adapter.

Once SR-IOV is enabled, reboot the host OS and follow OS specific steps to create VFs
and assign them to virtual machines (VMs) or containers as necessary.
The VIC physical function (PF) drivers for ESXi and Linux support SR-IOV.
The following shows simplified steps for Linux.

.. code-block:: console

    # echo 4 > /sys/class/net/<pf-interface>/device/sriov_numvfs

    # lspci | grep Cisco | grep Ethernet
    12:00.0 Ethernet controller: Cisco Systems Inc VIC Ethernet NIC (rev a2)
    12:00.1 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
    12:00.2 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
    12:00.3 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
    12:00.4 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)

Writing 4 to ``sriov_numvfs`` creates 4 VFs.
``lspci`` shows the VFs and their PCI locations.
Interfaces with device ID ``02b7`` are the VFs.
The following libvirt XML snippet assigns the VF at ``12:00.1`` to a VM.

.. code-block:: xml

    <interface type="hostdev" managed="yes">
      <mac address="fa:16:3e:46:39:c5"/>
      <driver name='vfio'/>
      <source>
        <address type="pci" domain="0x0000" bus="0x12" slot="0x00" function="0x1"/>
      </source>
      <vlan>
        <tag id="1000"/>
      </vlan>
    </interface>

When the VM instance is started, libvirt will bind the host VF to vfio-pci.
In the VM instance, the VF will now be visible.
In this example, the VF at ``07:00.0`` is seen on the VM instance
and is available for binding to DPDK.

.. code-block:: console

    # lspci | grep Cisco
    07:00.0 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)

There are two known limitations of the current SR-IOV implementation.

- Software Rx statistics

  VFs on old VIC models do not have hardware Rx counters. In this case,
  ENIC PMD counts packets/bytes and reports them as device statistics.

- Backward compatibility mode

  Old PF drivers on ESXi may lack full admin channel support.
  ENIC PMD detects such a PF driver during initialization
  and reverts to the compatibility mode.
  In this mode, ENIC PMD does not use the admin channel,
  and trust mode (e.g. enabling promiscuous mode on VF) is not supported.

.. note::

   Passthrough does not require SR-IOV.
   If SR-IOV is not desired, the user may create as many regular vNICs as necessary
   and assign them to VMs as passthrough devices.


.. _enic-generic-flow-api:

Generic Flow API support
------------------------

Generic Flow API (also called "rte_flow" API) is supported. More advanced
capabilities are available when "Advanced Filtering" is enabled on the adapter.
Advanced filtering was added to 1300 series VIC firmware starting with version
2.0.13 for C-series UCS servers and version 3.1.2 for UCSM managed blade
servers. Advanced filtering is available on 1400 series adapters and beyond.
To enable advanced filtering, the 'Advanced filter' radio button should be
selected via CIMC or UCSM followed by a reboot of the server.

- **1200 series VICs**

  5-tuple exact flow support for 1200 series adapters.
  This allows:

  - Attributes: ingress
  - Items: ipv4, ipv6, udp, tcp (must exactly match src/dst IP
    addresses and ports and all must be specified)
  - Actions: queue and void
  - Selectors: 'is'

- **1300 and later series VICs with advanced filters disabled**

  With advanced filters disabled, an IPv4 or IPv6 item must be specified
  in the pattern.

  - Attributes: ingress
  - Items: eth, vlan, ipv4, ipv6, udp, tcp, vxlan, inner eth, vlan, ipv4, ipv6, udp, tcp
  - Actions: queue and void
  - Selectors: 'is', 'spec' and 'mask'. 'last' is not supported
  - In total, up to 64 bytes of mask is allowed across all headers

- **1300 and later series VICs with advanced filters enabled**

  - Attributes: ingress
  - Items: eth, vlan, ipv4, ipv6, udp, tcp, vxlan, raw, inner eth, vlan, ipv4, ipv6, udp, tcp
  - Actions: queue, mark, drop, flag, rss, passthru, and void
  - Selectors: 'is', 'spec' and 'mask'.
    'last' is not supported
  - In total, up to 64 bytes of mask is allowed across all headers

- **1400 and later series VICs with Flow Manager API enabled**

  - Attributes: ingress, egress
  - Items: eth, vlan, ipv4, ipv6, sctp, udp, tcp, vxlan, raw, inner eth, vlan, ipv4, ipv6, sctp, udp, tcp
  - Ingress Actions: count, drop, flag, jump, mark, port_id, passthru, queue, rss, vxlan_decap, vxlan_encap, and void
  - Egress Actions: count, drop, jump, passthru, vxlan_encap, and void
  - Selectors: 'is', 'spec' and 'mask'. 'last' is not supported
  - In total, up to 64 bytes of mask is allowed across all headers

The VIC performs packet matching after applying VLAN strip. If VLAN
stripping is enabled, EtherType in the ETH item corresponds to the
stripped VLAN header's EtherType. Stripping does not affect the VLAN
item. TCI and EtherType in the VLAN item are matched against those in
the (stripped) VLAN header whether stripping is enabled or disabled.

More features may be added in future firmware and new versions of the VIC.
Please refer to the release notes.

.. _overlay_offload:

Overlay Offload
---------------

Recent hardware models support overlay offload. When enabled, the NIC performs
the following operations for VXLAN, NVGRE, and GENEVE packets.
In all cases,
inner and outer packets can be IPv4 or IPv6.

- TSO for VXLAN and GENEVE packets.

  Hardware supports NVGRE TSO, but DPDK currently has no NVGRE offload flags.

- Tx checksum offloads.

  The NIC fills in IPv4/UDP/TCP checksums for both inner and outer packets.

- Rx checksum offloads.

  The NIC validates IPv4/UDP/TCP checksums of both inner and outer packets.
  Good checksum flags (e.g. ``RTE_MBUF_F_RX_L4_CKSUM_GOOD``) indicate that the inner
  packet has the correct checksum, and if applicable, the outer packet also
  has the correct checksum. Bad checksum flags (e.g. ``RTE_MBUF_F_RX_L4_CKSUM_BAD``)
  indicate that the inner and/or outer packets have invalid checksum values.

- Inner Rx packet type classification

  PMD sets inner L3/L4 packet types (e.g. ``RTE_PTYPE_INNER_L4_TCP``), and
  ``RTE_PTYPE_TUNNEL_GRENAT`` to indicate that the packet is tunneled.
  PMD does not set L3/L4 packet types for outer packets.

- Inner RSS

  RSS hash calculation, therefore queue selection, is done on inner packets.

In order to enable overlay offload, enable VXLAN and/or Geneve on vNIC
via CIMC or UCSM followed by a reboot of the server.
When PMD successfully
enables overlay offload, it prints one of the following messages on the console.

.. code-block:: console

    Overlay offload is enabled (VxLAN)
    Overlay offload is enabled (Geneve)
    Overlay offload is enabled (VxLAN, Geneve)

By default, PMD enables overlay offload if hardware supports it. To disable
it, set ``devargs`` parameter ``disable-overlay=1``. For example::

    -a 12:00.0,disable-overlay=1

By default, the NIC uses 4789 and 6081 as the VXLAN and Geneve ports,
respectively. The user may change them through
``rte_eth_dev_udp_tunnel_port_{add,delete}``. However, as the current
NIC has a single VXLAN port number and a single Geneve port number,
the user cannot configure multiple port numbers for each tunnel type.

Geneve offload support has evolved over VIC models. On older models,
Geneve offload and advanced filters are mutually exclusive. This is
enforced by UCSM and CIMC, which only allow one of the two features
to be selected at one time. Newer VIC models do not have this restriction.

Ingress VLAN Rewrite
--------------------

VIC adapters can tag, untag, or modify the VLAN headers of ingress
packets. The ingress VLAN rewrite mode controls this behavior.
By default, it is set to pass-through, where the NIC does not modify the
VLAN header in any way so that the application can see the original
header. This mode is sufficient for many applications, but may not be
suitable for others. Such applications may change the mode by setting
``devargs`` parameter ``ig-vlan-rewrite`` to one of the following.

- ``pass``: Pass-through mode. The NIC does not modify the VLAN
  header. This is the default mode.

- ``priority``: Priority-tag default VLAN mode. If the ingress packet
  is tagged with the default VLAN, the NIC replaces its VLAN header
  with the priority tag (VLAN ID 0).

- ``trunk``: Default trunk mode. The NIC tags untagged ingress packets
  with the default VLAN. Tagged ingress packets are not modified. To
  the application, every packet appears as tagged.

- ``untag``: Untag default VLAN mode. If the ingress packet is tagged
  with the default VLAN, the NIC removes or untags its VLAN header so
  that the application sees an untagged packet. As a result, the
  default VLAN becomes `untagged`. This mode can be useful for
  applications such as OVS-DPDK performance benchmarks that utilize
  only the default VLAN and want to see only untagged packets.

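
The four modes above can be modelled in a few lines of C. This is only an
illustrative sketch of the behavior described in this section, not code from
the driver: the enum, ``struct vlan_view`` and ``ig_vlan_rewrite_apply()`` are
hypothetical names, and cases the text does not describe (for example, packets
tagged with a non-default VLAN in ``priority`` or ``untag`` mode) are assumed
to pass through unchanged.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical names mirroring the ig-vlan-rewrite devargs values. */
    enum ig_vlan_rewrite {
        IG_VLAN_PASS,      /* pass */
        IG_VLAN_PRIORITY,  /* priority */
        IG_VLAN_TRUNK,     /* trunk */
        IG_VLAN_UNTAG      /* untag */
    };

    /* What the wire carried, or what the application ends up seeing. */
    struct vlan_view {
        bool tagged;       /* true if a VLAN header is present */
        uint16_t vlan_id;  /* meaningful only when tagged is true */
    };

    /* Model of the documented rewrite applied to one ingress packet. */
    struct vlan_view
    ig_vlan_rewrite_apply(enum ig_vlan_rewrite mode, struct vlan_view in,
                          uint16_t default_vlan)
    {
        struct vlan_view out = in;
        bool on_default = in.tagged && in.vlan_id == default_vlan;

        switch (mode) {
        case IG_VLAN_PASS:
            /* Header is left untouched. */
            break;
        case IG_VLAN_PRIORITY:
            /* Default VLAN tag is replaced by the priority tag (ID 0). */
            if (on_default)
                out.vlan_id = 0;
            break;
        case IG_VLAN_TRUNK:
            /* Untagged packets are tagged with the default VLAN. */
            if (!in.tagged) {
                out.tagged = true;
                out.vlan_id = default_vlan;
            }
            break;
        case IG_VLAN_UNTAG:
            /* Default VLAN tag is removed. */
            if (on_default) {
                out.tagged = false;
                out.vlan_id = 0;
            }
            break;
        }
        return out;
    }

With a default VLAN of 100, for instance, this model reports an untagged
packet as tagged with VLAN 100 in ``trunk`` mode, and a packet tagged with
VLAN 100 as untagged in ``untag`` mode.
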


Vectorized Rx Handler
---------------------

ENIC PMD includes a version of the receive handler that is vectorized using
AVX2 SIMD instructions. It is meant for bulk, throughput oriented workloads
where reducing cycles/packet in PMD is a priority. In order to use the
vectorized handler, take the following steps.

- Use a recent version of gcc, icc, or clang and build 64-bit DPDK. If
  the compiler is known to support AVX2, DPDK build system
  automatically compiles the vectorized handler. Otherwise, the
  handler is not available.

- Set ``devargs`` parameter ``enable-avx2-rx=1`` to explicitly request that
  PMD consider the vectorized handler when selecting the receive handler.
  For example::

    -a 12:00.0,enable-avx2-rx=1

  As the current implementation is intended for field trials, by default, the
  vectorized handler is not considered (``enable-avx2-rx=0``).

- Run on a UCS M4 or later server with CPUs that support AVX2.

PMD selects the vectorized handler when the handler is compiled into
the driver, the user requests its use via ``enable-avx2-rx=1``, CPU
supports AVX2, and scatter Rx is not used.
To verify that the
vectorized handler is selected, enable debug logging
(``--log-level=pmd,debug``) and check the following message.

.. code-block:: console

    enic_use_vector_rx_handler use the non-scatter avx2 Rx handler

64B Completion Queue Entry
--------------------------

Recent VIC adapters support 64B completion queue entries, as well as
16B entries that are available on all adapter models. ENIC PMD enables
and uses 64B entries by default, if available. 64B entries generally
lower CPU cycles per Rx packet, as they avoid partial DMA writes and
reduce cache contention between DMA and polling CPU. The effect is
most pronounced when multiple Rx queues are used on Intel platforms
with Data Direct I/O Technology (DDIO).

If 64B entries are not available, PMD uses 16B entries. The user may
explicitly disable 64B entries and use 16B entries by setting
``devargs`` parameter ``cq64=0``. For example::

    -a 12:00.0,cq64=0

To verify the selected entry size, enable debug logging
(``--log-level=enic,debug``) and check the following messages.

.. code-block:: console

    PMD: rte_enic_pmd: Supported CQ entry sizes: 16 32
    PMD: rte_enic_pmd: Using 16B CQ entry size

.. _enic_limitations:

Limitations
-----------

- **VLAN 0 Priority Tagging**

  If a vNIC is configured in TRUNK mode by the UCS manager, the adapter will
  priority tag egress packets according to 802.1Q if they were not already
  VLAN tagged by software. If the adapter is connected to a properly configured
  switch, there will be no unexpected behavior.

  In test setups where an Ethernet port of a Cisco adapter in TRUNK mode is
  connected point-to-point to another adapter port or connected through a router
  instead of a switch, all ingress packets will be VLAN tagged. Programs such
  as l3fwd may not account for VLAN tags in packets and may misbehave. One
  solution is to enable VLAN stripping on ingress so the VLAN tag is removed
  from the packet and put into the mbuf->vlan_tci field. Here is an example
  of how to accomplish this:

  .. code-block:: c

      vlan_offload = rte_eth_dev_get_vlan_offload(port);
      vlan_offload |= RTE_ETH_VLAN_STRIP_OFFLOAD;
      rte_eth_dev_set_vlan_offload(port, vlan_offload);

  Another alternative is to modify the adapter's ingress VLAN rewrite mode so that
  packets with the default VLAN tag are stripped by the adapter and presented to
  DPDK as untagged packets. In this case mbuf->vlan_tci and the RTE_MBUF_F_RX_VLAN and
  RTE_MBUF_F_RX_VLAN_STRIPPED mbuf flags would not be set. This mode is enabled with the
  ``devargs`` parameter ``ig-vlan-rewrite=untag``. For example::

      -a 12:00.0,ig-vlan-rewrite=untag

- **SR-IOV**

  - KVM hypervisor support only. VMware has not been tested.
  - Requires VM-FEX, and so is only available on UCS managed servers connected
    to Fabric Interconnects. It is not on standalone C-Series servers.
  - VF devices are not usable directly from the host. They can only be used
    as assigned devices on VM instances.
  - Currently, unbind of the ENIC kernel mode driver 'enic.ko' on the VM
    instance may hang. As a workaround, enic.ko should be blocked or removed
    from the boot process.
  - pci_generic cannot be used as the uio module in the VM. igb_uio or
    vfio in non-IOMMU mode can be used.
  - The number of RQs in UCSM dynamic vNIC configurations must be at least 2.
  - The number of SR-IOV devices is limited to 256.
Components on target system 4803c389587SJohn Daley might limit this number to fewer than 256. 4813c389587SJohn Daley 4820543f9d2SJohn Daley- **Flow API** 4830543f9d2SJohn Daley 4840543f9d2SJohn Daley - The number of filters that can be specified with the Generic Flow API is 4850543f9d2SJohn Daley dependent on how many header fields are being masked. Use 'flow create' in 4860543f9d2SJohn Daley a loop to determine how many filters your VIC will support (not more than 4875d4f3ad6SHyong Youb Kim 1000 for 1300 series VICs). Filters are checked for matching in the order they 4880543f9d2SJohn Daley were added. Since there currently is no grouping or priority support, 4890543f9d2SJohn Daley 'catch-all' filters should be added last. 490e7347a8aSHyong Youb Kim - The supported range of IDs for the 'MARK' action is 0 - 0xFFFD. 4915af7af4dSHyong Youb Kim - RSS and PASSTHRU actions only support "receive normally". They are limited 4925af7af4dSHyong Youb Kim to supporting MARK + RSS and PASSTHRU + MARK to allow the application to mark 4935af7af4dSHyong Youb Kim packets and then receive them normally. These require 1400 series VIC adapters 4945af7af4dSHyong Youb Kim and latest firmware. 495477959e6SHyong Youb Kim - RAW items are limited to matching UDP tunnel headers like VXLAN. 496db79f2d5SJohn Daley - GTP, GTP-C and GTP-U header matching is enabled, however matching items within 497db79f2d5SJohn Daley the tunnel is not supported. 498510aecabSJohn Daley - For 1400 VICs, all flows using the RSS action on a port use same hash 499510aecabSJohn Daley configuration. The RETA is ignored. The queues used in the RSS group must be 500510aecabSJohn Daley sequential. There is a performance hit if the number of queues is not a power of 2. 501510aecabSJohn Daley Only level 0 (outer header) RSS is allowed. 

- **Statistics**

  - On 1300 and older series VIC adapters, ``rx_good_bytes`` (ibytes) always
    includes the VLAN header (4B) and CRC bytes (4B).
    1400 series VICs do not count CRC bytes, and count the VLAN header only
    when VLAN stripping is disabled.
  - On 1300 and older series VIC adapters, when the NIC drops a packet because
    the Rx queue has no free buffers, ``rx_good_bytes`` still increments by 4B
    if the packet is not VLAN tagged or VLAN stripping is disabled, or by 8B if
    the packet is VLAN tagged and stripping is enabled.
    1400 series VICs do not increment this byte counter when packets are
    dropped.

- **RSS Hashing**

  - Hardware enables and disables UDP and TCP RSS hashing together. The driver
    cannot control UDP and TCP hashing individually.

How to build the suite
----------------------

The build instructions for the DPDK suite should be followed. By default
the ENIC PMD library will be built into the DPDK library.

Refer to the document :ref:`compiling and testing a PMD for a NIC
<pmd_build_and_test>` for details.

For configuring and using the UIO and VFIO frameworks, please refer to the
documentation that comes with the DPDK suite.

Supported Operating Systems
---------------------------

Any Linux distribution fulfilling the conditions described in the Dependencies
section of the DPDK documentation.

Known bugs and unsupported features in this release
---------------------------------------------------

- Signature or flex byte based flow direction
- Drop feature of flow direction
- VLAN based flow direction
- Non-IPV4 flow direction
- Setting of extended VLAN
- MTU update only works if Scattered Rx mode is disabled
- Maximum receive packet length is ignored if Scattered Rx mode is used

Prerequisites
-------------

- Prepare the system as recommended by the DPDK suite. This includes
  environment variables, hugepages configuration, tool-chains and configuration.
- Insert the vfio-pci kernel module using the command ``modprobe vfio-pci`` if
  the user wants to use the VFIO framework.
- Insert the uio kernel module using the command ``modprobe uio`` if the user
  wants to use the UIO framework.
- The DPDK suite should be configured based on the user's decision to use the
  VFIO or UIO framework.
- If the vNIC device(s) to be used are bound to the kernel mode Ethernet
  driver, use ``ip`` to bring the interface down. The ``dpdk-devbind.py`` tool
  can then be used to unbind the device's bus id from the ENIC kernel mode
  driver.
- To have ENIC PMD use the VFIO framework, bind the intended vNIC to vfio-pci
  using ``dpdk-devbind.py``.
- To have ENIC PMD use the UIO framework, bind the intended vNIC to igb_uio
  using ``dpdk-devbind.py``.

At this point the system should be ready to run DPDK applications. Once the
application runs to completion, the vNIC can be detached from vfio-pci or
igb_uio if necessary.

Root privilege is required to bind and unbind vNICs to/from VFIO/UIO.
The VFIO framework allows an unprivileged user to run the applications.
For an unprivileged user to run the applications on DPDK and ENIC PMD,
it may be necessary to increase the maximum locked memory of the user.
The following command could be used to do this.

.. code-block:: console

   sudo sh -c "ulimit -l <value in Kilo Bytes>"

Note that ``ulimit`` changes the limit only for the shell in which it runs and
for processes started from that shell.

The value depends on the memory configuration of the application, DPDK and
PMD. Typically, the limit has to be raised to higher than 2GB, for example
2621440 (2.5GB expressed in kilobytes).

Additional Reference
--------------------

- https://www.cisco.com/c/en/us/products/servers-unified-computing/index.html
- https://www.cisco.com/c/en/us/products/interfaces-modules/unified-computing-system-adapters/index.html