How to get best performance with NICs on Intel platforms
=========================================================

This document is a step-by-step guide for getting high performance from DPDK applications on Intel platforms.


Hardware and Memory Requirements
--------------------------------

For best performance use an Intel Xeon class server system such as Ivy Bridge, Haswell or newer.

Ensure that each memory channel has at least one memory DIMM inserted, and that the memory size for each DIMM is at least 4GB.
**Note**: this has one of the most direct effects on performance.

You can check the memory configuration using ``dmidecode`` as follows::

      dmidecode -t memory | grep Locator

      Locator: DIMM_A1
      Bank Locator: NODE 1
      Locator: DIMM_A2
      Bank Locator: NODE 1
      Locator: DIMM_B1
      Bank Locator: NODE 1
      Locator: DIMM_B2
      Bank Locator: NODE 1
      ...
      Locator: DIMM_G1
      Bank Locator: NODE 2
      Locator: DIMM_G2
      Bank Locator: NODE 2
      Locator: DIMM_H1
      Bank Locator: NODE 2
      Locator: DIMM_H2
      Bank Locator: NODE 2

The sample output above shows a total of 8 channels, from ``A`` to ``H``, where each channel has 2 DIMM slots.

You can also use ``dmidecode`` to determine the memory frequency::

      dmidecode -t memory | grep Speed

      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      ...
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown

The output shows a speed of 2133 MHz (DDR4) for the populated DIMMs and ``Unknown`` for the empty slots.
This aligns with the previous output, which showed two DIMM slots per channel, with one DIMM populated in each channel.


Network Interface Card Requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a `DPDK supported <http://dpdk.org/doc/nics>`_ high end NIC such as the Intel XL710 40GbE.

Make sure each NIC has been flashed with the latest version of NVM/firmware.

Use PCIe Gen3 slots, such as Gen3 ``x8`` or Gen3 ``x16``, because PCIe Gen2 slots don't provide enough bandwidth
for 2 x 10GbE and above.
You can use ``lspci`` to check the speed of a PCI slot using something like the following::

      lspci -s 03:00.1 -vv | grep LnkSta

      LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- ...
      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ ...

When inserting NICs into PCI slots always check the caption, such as CPU0 or CPU1, to see which socket the slot is connected to.

Care should be taken with NUMA.
If you are using 2 or more ports from different NICs, it is best to ensure that these NICs are on the same CPU socket.
An example of how to determine this is shown further below.


BIOS Settings
~~~~~~~~~~~~~

The following are some recommendations on BIOS settings. Different platforms will have different BIOS naming
so the following is mainly for reference:

#. Before starting, consider resetting all BIOS settings to their default.

#. Disable all power saving options such as: Power performance tuning, CPU P-State, CPU C3 Report and CPU C6 Report.

#. Select **Performance** as the CPU Power and Performance policy.

#. Disable Turbo Boost to ensure the performance scaling increases with the number of cores.

#. Set memory frequency to the highest available number, NOT auto.

#. Disable all virtualization options when you test the physical function of the NIC, and turn on ``VT-d`` if you want to use VFIO.


Linux boot command line
~~~~~~~~~~~~~~~~~~~~~~~

The following are some recommendations on GRUB boot settings (a combined example is shown after the list):

#. Use the default grub file as a starting point.

#. Reserve 1G huge pages via grub configurations. For example, to reserve 8 huge pages of 1G size::

      default_hugepagesz=1G hugepagesz=1G hugepages=8

#. Isolate CPU cores which will be used for DPDK. For example::

      isolcpus=2,3,4,5,6,7,8

#. If you want to use VFIO, use the following additional grub parameters::

      iommu=pt intel_iommu=on
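
As a minimal sketch, assuming a GRUB2-based distribution, the parameters above could be combined as follows
(file paths and the update command vary between distributions):

.. code-block:: console

   # Append the parameters to the kernel command line in /etc/default/grub, e.g.:
   #   GRUB_CMDLINE_LINUX_DEFAULT="... default_hugepagesz=1G hugepagesz=1G hugepages=8 isolcpus=2,3,4,5,6,7,8 iommu=pt intel_iommu=on"

   # Regenerate the grub configuration and reboot so the changes take effect.
   update-grub                                   # Debian/Ubuntu
   # grub2-mkconfig -o /boot/grub2/grub.cfg      # RHEL/Fedora
   reboot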


Configurations before running DPDK
----------------------------------

1. Build the DPDK target and reserve huge pages.
   See the earlier section on :ref:`linux_gsg_hugepages` for more details.

   The following shell commands may help with building and configuration:

   .. code-block:: console

      # Build DPDK target.
      cd dpdk_folder
      make install T=x86_64-native-linuxapp-gcc -j

      # Get the hugepage size.
      awk '/Hugepagesize/ {print $2}' /proc/meminfo

      # Get the total huge page numbers.
      awk '/HugePages_Total/ {print $2} ' /proc/meminfo

      # Unmount the hugepages.
      umount `awk '/hugetlbfs/ {print $2}' /proc/mounts`

      # Create the hugepage mount folder.
      mkdir -p /mnt/huge

      # Mount to the specific folder.
      mount -t hugetlbfs nodev /mnt/huge

2. Check the CPU layout using the DPDK ``cpu_layout`` utility:

   .. code-block:: console

      cd dpdk_folder

      tools/cpu_layout.py

   Or run ``lscpu`` to check the cores on each socket.

3. Check your NIC id and related socket id:

   .. code-block:: console

      # List all the NICs with PCI address and device IDs.
      lspci -nn | grep Eth

   For example suppose your output was as follows::

      82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      82:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]

   Check the NUMA node id of the PCI device:

   .. code-block:: console

      cat /sys/bus/pci/devices/0000\:xx\:00.x/numa_node

   Usually ``0x:00.x`` is on socket 0 and ``8x:00.x`` is on socket 1.
   **Note**: To get the best performance, ensure that the cores and NICs are on the same socket.
   In the example above ``85:00.0`` is on socket 1 and should be used by cores on socket 1 for the best performance.
   A short sketch combining these checks is shown at the end of this section.

4. Bind the test ports to DPDK compatible drivers, such as igb_uio. For example, bind two ports to a DPDK compatible driver and check the status:

   .. code-block:: console

      # Bind ports 82:00.0 and 85:00.0 to dpdk driver
      ./dpdk_folder/tools/dpdk-devbind.py -b igb_uio 82:00.0 85:00.0

      # Check the port driver status
      ./dpdk_folder/tools/dpdk-devbind.py --status

   See ``dpdk-devbind.py --help`` for more details.


For more details about DPDK setup and Linux kernel requirements see :ref:`linux_gsg_compiling_dpdk`.
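
As a minimal sketch, the NUMA checks from step 3 can be combined in the shell. The PCI address ``0000:85:00.0`` below is
taken from the example output above and should be replaced with your own NIC's address:

.. code-block:: console

   # Read the NUMA node of the NIC and list the cores that belong to that node.
   node=$(cat /sys/bus/pci/devices/0000:85:00.0/numa_node)
   echo "NIC 0000:85:00.0 is on NUMA node ${node}"
   lscpu | grep "NUMA node${node} CPU(s)"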


Example of getting best performance for an Intel NIC
-----------------------------------------------------

The following is an example of running the DPDK ``l3fwd`` sample application to get high performance with an
Intel server platform and Intel XL710 NICs.
For specific 40G NIC configuration please refer to the i40e NIC guide.

The example scenario is to get best performance with two Intel XL710 40GbE ports.
See :numref:`figure_intel_perf_test_setup` for the performance test setup.

.. _figure_intel_perf_test_setup:

.. figure:: img/intel_perf_test_setup.*

   Performance Test Setup


1. Add two Intel XL710 NICs to the platform, and use one port per card to get best performance.
   The reason for using two NICs is to overcome a PCIe Gen3 limitation: a single Gen3 x8 slot cannot provide 80G of bandwidth
   for two 40G ports, but two separate PCIe Gen3 x8 slots can.
   Referring to the sample NIC output above, we can select ``82:00.0`` and ``85:00.0`` as the test ports::

      82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]

2. Connect the ports to the traffic generator. For high speed testing, it's best to use a hardware traffic generator.

3. Check the NUMA node (socket id) of the PCI devices and get the core numbers on that socket.
   In this case, ``82:00.0`` and ``85:00.0`` are both on socket 1, and the cores on socket 1 in the referenced platform
   are 18-35 and 54-71.
   **Note**: Don't use 2 logical cores on the same physical core (e.g. physical core 18 has 2 logical cores, core 18 and core 54);
   instead, use 2 logical cores from different physical cores (e.g. core 18 and core 19).

4. Bind these two ports to igb_uio.

5. For the XL710 40G ports we need at least two queue pairs per port to achieve the best performance, so two queues per port
   will be required, and each queue pair will need a dedicated CPU core for receiving/transmitting packets.

6. The DPDK sample application ``l3fwd`` will be used for performance testing, using two ports for bi-directional forwarding.
   Compile the ``l3fwd`` sample with the default lpm mode.

7. The command line for running ``l3fwd`` would be something like the following::

      ./l3fwd -c 0x3c0000 -n 4 -w 82:00.0 -w 85:00.0 \
              -- -p 0x3 --config '(0,0,18),(0,1,19),(1,0,20),(1,1,21)'

   This means that the application uses core 18 for port 0, queue pair 0 forwarding, core 19 for port 0, queue pair 1 forwarding,
   core 20 for port 1, queue pair 0 forwarding, and core 21 for port 1, queue pair 1 forwarding.
   A sketch of how to compute the ``-c`` coremask for these cores is shown at the end of this section.

8. Configure the traffic at a traffic generator.

   * Start creating a stream on packet generator.

   * Set the Ethernet II type to 0x0800.
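
As a side note, the ``-c 0x3c0000`` coremask used in step 7 simply sets one bit per core used (here cores 18-21).
A minimal sketch of computing such a mask in the shell:

.. code-block:: console

   # Bits 18-21 correspond to cores 18, 19, 20 and 21.
   printf '0x%x\n' $(( (1 << 18) | (1 << 19) | (1 << 20) | (1 << 21) ))
   0x3c0000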