..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2015 Intel Corporation.

How to get best performance with NICs on Intel platforms
========================================================

This document is a step-by-step guide for getting high performance from DPDK applications on Intel platforms.


Hardware and Memory Requirements
--------------------------------

For best performance use an Intel Xeon class server system such as Ivy Bridge, Haswell or newer.

Ensure that each memory channel has at least one memory DIMM inserted, and that each DIMM is at least 4GB.
**Note**: this has one of the most direct effects on performance.

You can check the memory configuration using ``dmidecode`` as follows::

   dmidecode -t memory | grep Locator

   Locator: DIMM_A1
   Bank Locator: NODE 1
   Locator: DIMM_A2
   Bank Locator: NODE 1
   Locator: DIMM_B1
   Bank Locator: NODE 1
   Locator: DIMM_B2
   Bank Locator: NODE 1
   ...
   Locator: DIMM_G1
   Bank Locator: NODE 2
   Locator: DIMM_G2
   Bank Locator: NODE 2
   Locator: DIMM_H1
   Bank Locator: NODE 2
   Locator: DIMM_H2
   Bank Locator: NODE 2

The sample output above shows a total of 8 channels, from ``A`` to ``H``, where each channel has 2 DIMM slots.
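
The channel layout read off the sample output above can also be tallied with a small script.
The following is a minimal sketch and not part of DPDK: ``summarize_dimm_slots`` is a hypothetical helper name,
and it assumes the ``DIMM_<channel><slot>`` locator naming shown in the sample, which differs between vendors.

.. code-block:: shell

   # Hypothetical helper: count DIMM slots per memory channel.
   # Reads `dmidecode -t memory` output on stdin.
   # Assumption: locators look like DIMM_A1, DIMM_A2, DIMM_B1, ...
   summarize_dimm_slots() {
       awk '
           # Match "Locator: DIMM_A1" but skip "Bank Locator: NODE 1".
           /^[ \t]*Locator:[ \t]*DIMM_[A-Z][0-9]+/ {
               channel = substr($2, 6, 1)   # letter after "DIMM_"
               slots[channel]++
           }
           END {
               for (channel in slots)
                   printf "channel %s: %d slot(s)\n", channel, slots[channel]
           }
       ' | sort
   }

On a live system this could be run as root with ``dmidecode -t memory | summarize_dimm_slots``.
If the locators on your platform use a different naming scheme, the pattern needs adjusting.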

You can also use ``dmidecode`` to determine the memory frequency::

   dmidecode -t memory | grep Speed

   Speed: 2133 MHz
   Configured Clock Speed: 2134 MHz
   Speed: Unknown
   Configured Clock Speed: Unknown
   Speed: 2133 MHz
   Configured Clock Speed: 2134 MHz
   Speed: Unknown
   ...
   Speed: 2133 MHz
   Configured Clock Speed: 2134 MHz
   Speed: Unknown
   Configured Clock Speed: Unknown
   Speed: 2133 MHz
   Configured Clock Speed: 2134 MHz
   Speed: Unknown
   Configured Clock Speed: Unknown

The output shows a speed of 2133 MHz (DDR4) for populated slots and ``Unknown`` for empty slots.
This is consistent with the previous output: each channel has two slots, of which one holds a DIMM.


Network Interface Card Requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a `DPDK supported <https://core.dpdk.org/supported/>`_ high end NIC such as the Intel XL710 40GbE.

Make sure each NIC has been flashed with the latest version of NVM/firmware.

Use PCIe Gen3 slots, such as Gen3 ``x8`` or Gen3 ``x16``, because PCIe Gen2 slots don't provide enough bandwidth
for 2 x 10GbE and above.
You can use ``lspci`` to check the speed of a PCI slot using something like the following::

   lspci -s 03:00.1 -vv | grep LnkSta

   LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- ...
   LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ ...

When inserting NICs into PCI slots, always check the slot caption, such as CPU0 or CPU1, which indicates the socket the slot is connected to.

Take care with NUMA.
If you are using 2 or more ports from different NICs, it is best to ensure that these NICs are on the same CPU socket.
An example of how to determine this is shown further below.


BIOS Settings
~~~~~~~~~~~~~

The following are some recommendations on BIOS settings. Different platforms will have different BIOS naming
so the following is mainly for reference:

#. Establish the steady state for the system. Consider reviewing the BIOS settings for the desired performance characteristic, e.g. optimize for performance or energy efficiency.

#. Match the BIOS settings to the needs of the application you are testing.

#. Typically, **Performance** as the CPU Power and Performance policy is a reasonable starting point.

#. Consider using Turbo Boost to increase the frequency on cores.

#. Disable all virtualization options when you test the physical function of the NIC, and turn on VT-d if you want to use VFIO.


Linux boot command line
~~~~~~~~~~~~~~~~~~~~~~~

The following are some recommendations on GRUB boot settings:

#. Use the default grub file as a starting point.

#. Reserve 1G huge pages via grub configurations.
   For example, to reserve 8 huge pages of 1G size::

      default_hugepagesz=1G hugepagesz=1G hugepages=8

#. Isolate CPU cores which will be used for DPDK. For example::

      isolcpus=2,3,4,5,6,7,8

#. To use VFIO, add the following grub parameters::

      iommu=pt intel_iommu=on


Configurations before running DPDK
----------------------------------

#. Reserve huge pages.
   See the earlier section on :ref:`linux_gsg_hugepages` for more details.

   .. code-block:: console

      # Get the hugepage size.
      awk '/Hugepagesize/ {print $2}' /proc/meminfo

      # Get the total number of huge pages.
      awk '/HugePages_Total/ {print $2}' /proc/meminfo

      # Unmount the hugepages.
      umount `awk '/hugetlbfs/ {print $2}' /proc/mounts`

      # Create the hugepage mount folder.
      mkdir -p /mnt/huge

      # Mount to the specific folder.
      mount -t hugetlbfs nodev /mnt/huge

#. Check the CPU layout using the DPDK ``cpu_layout`` utility:

   .. code-block:: console

      cd dpdk_folder

      usertools/cpu_layout.py

   Or run ``lscpu`` to check the cores on each socket.

#. Check your NIC id and related socket id:

   .. code-block:: console

      # List all the NICs with PCI address and device IDs.
      lspci -nn | grep Eth

   For example, suppose your output was as follows::

      82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      82:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]

   Check the NUMA node id of the PCI device:

   .. code-block:: console

      cat /sys/bus/pci/devices/0000\:xx\:00.x/numa_node

   Usually ``0x:00.x`` is on socket 0 and ``8x:00.x`` is on socket 1.
   **Note**: To get the best performance, ensure that the cores and NICs are in the same socket.
   In the example above ``85:00.0`` is on socket 1 and should be used by cores on socket 1 for the best performance.

#. Check which kernel drivers need to be loaded and whether there is a need to unbind the network ports from their kernel drivers.
   For more details about DPDK setup and Linux kernel requirements see :ref:`linux_gsg_compiling_dpdk` and :ref:`linux_gsg_linux_drivers`.
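
The NUMA check from the steps above can be wrapped in a small helper so that a NIC and its local cores are reported together.
This is a minimal sketch and not part of DPDK: ``nic_numa_info`` is a hypothetical function name,
and ``PCI_SYSFS_ROOT`` is an illustrative override added only so the function can be tried against a copy of the sysfs tree.
It assumes the kernel exposes ``numa_node`` and ``local_cpulist`` for the device.

.. code-block:: shell

   # Hypothetical helper: print the NUMA node and local CPU list of a PCI device.
   # Usage: nic_numa_info 0000:85:00.0
   # PCI_SYSFS_ROOT is an illustrative override, not a kernel or DPDK setting.
   nic_numa_info() {
       dev_dir="${PCI_SYSFS_ROOT:-/sys/bus/pci/devices}/$1"

       if [ ! -r "$dev_dir/numa_node" ]; then
           echo "no numa_node entry for $1" >&2
           return 1
       fi

       # numa_node is -1 when the platform reports no NUMA affinity.
       node=$(cat "$dev_dir/numa_node")

       # local_cpulist holds the CPUs on the same node as the device.
       cpus=$(cat "$dev_dir/local_cpulist" 2>/dev/null)

       echo "$1 numa_node=$node local_cpus=${cpus:-unknown}"
   }

Calling ``nic_numa_info 0000:85:00.0`` prints one line per device, which can be compared with the output of ``lscpu`` or ``usertools/cpu_layout.py`` when choosing cores for the application.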