..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2015 Intel Corporation.

How to get best performance with NICs on Intel platforms
========================================================

This document is a step-by-step guide for getting high performance from DPDK applications on Intel platforms.


Hardware and Memory Requirements
--------------------------------

For best performance use an Intel Xeon class server system such as Ivy Bridge, Haswell or newer.

Ensure that each memory channel has at least one memory DIMM inserted, and that the memory size for each is at least 4GB.
**Note**: this has one of the most direct effects on performance.

You can check the memory configuration using ``dmidecode`` as follows::

      dmidecode -t memory | grep Locator

      Locator: DIMM_A1
      Bank Locator: NODE 1
      Locator: DIMM_A2
      Bank Locator: NODE 1
      Locator: DIMM_B1
      Bank Locator: NODE 1
      Locator: DIMM_B2
      Bank Locator: NODE 1
      ...
      Locator: DIMM_G1
      Bank Locator: NODE 2
      Locator: DIMM_G2
      Bank Locator: NODE 2
      Locator: DIMM_H1
      Bank Locator: NODE 2
      Locator: DIMM_H2
      Bank Locator: NODE 2

The sample output above shows a total of 8 channels, from ``A`` to ``H``, where each channel has 2 DIMM slots.

You can also use ``dmidecode`` to determine the memory frequency::

      dmidecode -t memory | grep Speed

      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      ...
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown

The output shows a speed of 2133 MHz (DDR4) for populated slots and ``Unknown`` for empty slots.
This is consistent with the previous output: each channel has two DIMM slots, of which one is populated.

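As a rough cross-check of the two ``dmidecode`` outputs above, the populated and empty DIMM slots can be counted from the ``Speed`` lines.
The following is only a sketch: ``count_dimm_slots`` is a hypothetical helper name, and it assumes that an empty slot reports ``Speed: Unknown`` as in the sample output, which may differ between BIOS vendors and ``dmidecode`` versions.

.. code-block:: sh

   # Hypothetical helper: reads "dmidecode -t memory" output on stdin.
   # Assumption: an empty slot is reported as "Speed: Unknown".
   count_dimm_slots() {
       awk '
           # Match only the per-DIMM "Speed:" lines,
           # not the "Configured Clock Speed:" lines.
           $1 == "Speed:" {
               if ($2 == "Unknown")
                   empty++
               else
                   populated++
           }
           END { printf "populated=%d empty=%d\n", populated, empty }
       '
   }

It could be used as ``dmidecode -t memory | count_dimm_slots`` (as root).
If every channel holds exactly one DIMM, the ``populated`` count should equal the number of memory channels.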

Network Interface Card Requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a `DPDK supported <https://core.dpdk.org/supported/>`_ high end NIC such as the Intel XL710 40GbE.

Make sure each NIC has been flashed with the latest version of NVM/firmware.

Use PCIe Gen3 slots, such as Gen3 ``x8`` or Gen3 ``x16`` because PCIe Gen2 slots don't provide enough bandwidth
for 2 x 10GbE and above.
You can use ``lspci`` to check the speed of a PCI slot using something like the following::

      lspci -s 03:00.1 -vv | grep LnkSta

      LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- ...
      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ ...

When inserting NICs into PCI slots, always check the slot caption, such as CPU0 or CPU1, which indicates the socket the slot is connected to.

Care should be taken with NUMA.
If you are using 2 or more ports from different NICs, it is best to ensure that these NICs are on the same CPU socket.
An example of how to determine this is shown further below.

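The ``lspci`` link check described in this section can be scripted, for example to flag a NIC that negotiated a slower link than expected.
The following is only a sketch: ``pcie_link_summary`` is a hypothetical helper name, and it assumes the ``LnkSta:`` line has the ``Speed <value>, Width <value>`` layout shown in the sample output.
It only compares the link speed against the 8 GT/s rate of PCIe Gen3; the reported width still has to be judged against the needs of the NIC.

.. code-block:: sh

   # Hypothetical helper: reads "lspci -vv" output on stdin and
   # summarises the negotiated link reported on the "LnkSta:" line.
   pcie_link_summary() {
       awk '
           $1 == "LnkSta:" {
               for (i = 2; i < NF; i++) {
                   if ($i == "Speed") speed = $(i + 1)
                   if ($i == "Width") width = $(i + 1)
               }
               # Drop the trailing comma that follows each value.
               sub(/,$/, "", speed)
               sub(/,$/, "", width)
               # PCIe Gen3 signals at 8 GT/s per lane, Gen2 at 5 GT/s.
               gen3 = (speed + 0 >= 8) ? "yes" : "no"
               printf "speed=%s width=%s gen3_or_better=%s\n", speed, width, gen3
           }
       '
   }

It could be used as ``lspci -s 03:00.1 -vv | pcie_link_summary``, where ``03:00.1`` is the example address used above.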

BIOS Settings
~~~~~~~~~~~~~

The following are some recommendations on BIOS settings. Different platforms will have different BIOS naming
so the following is mainly for reference:

#. Establish the steady state for the system. Consider reviewing the BIOS settings for the desired performance characteristic, e.g. optimized for performance or for energy efficiency.

#. Match the BIOS settings to the needs of the application you are testing.

#. Typically, **Performance** as the CPU Power and Performance policy is a reasonable starting point.

#. Consider using Turbo Boost to increase the frequency on cores.

#. Disable all virtualization options when you test the physical function of the NIC, and turn on VT-d if you want to use VFIO.


Linux boot command line
~~~~~~~~~~~~~~~~~~~~~~~

The following are some recommendations on GRUB boot settings:

#. Use the default grub file as a starting point.

#. Reserve 1G huge pages via grub configurations. For example, to reserve 8 huge pages of 1G size::

      default_hugepagesz=1G hugepagesz=1G hugepages=8

#. Isolate CPU cores which will be used for DPDK. For example::

      isolcpus=2,3,4,5,6,7,8

#. If you want to use VFIO, use the following additional grub parameters::

      iommu=pt intel_iommu=on

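After rebooting, it can be useful to confirm that the parameters listed above actually reached the kernel.
The following is only a sketch: ``missing_boot_params`` is a hypothetical helper name, and the list of parameters it checks is taken from the huge page and core isolation examples in this section.
The VFIO parameters can be added to the list in the same way if they are needed.

.. code-block:: sh

   # Hypothetical helper: prints each expected boot parameter that is
   # missing from the kernel command line passed as the first argument.
   missing_boot_params() {
       cmdline=$1
       # Parameters ending in "=" are matched by name only, with any value.
       for param in default_hugepagesz=1G hugepagesz=1G hugepages= isolcpus=; do
           case " $cmdline " in
               *" $param"*) ;;                  # present
               *) printf '%s\n' "$param" ;;     # missing
           esac
       done
   }

It could be used as ``missing_boot_params "$(cat /proc/cmdline)"``; no output means that all of the listed parameters were found.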

Configurations before running DPDK
----------------------------------

#. Reserve huge pages.
   See the earlier section on :ref:`linux_gsg_hugepages` for more details.

   .. code-block:: console

      # Get the hugepage size.
      awk '/Hugepagesize/ {print $2}' /proc/meminfo

      # Get the total number of huge pages.
      awk '/HugePages_Total/ {print $2} ' /proc/meminfo

      # Unmount the hugepages.
      umount `awk '/hugetlbfs/ {print $2}' /proc/mounts`

      # Create the hugepage mount folder.
      mkdir -p /mnt/huge

      # Mount to the specific folder.
      mount -t hugetlbfs nodev /mnt/huge

#. Check the CPU layout using the DPDK ``cpu_layout`` utility:

   .. code-block:: console

      cd dpdk_folder

      usertools/cpu_layout.py

   Or run ``lscpu`` to check the cores on each socket.

#. Check your NIC id and related socket id:

   .. code-block:: console

      # List all the NICs with PCI address and device IDs.
      lspci -nn | grep Eth

   For example, suppose your output was as follows::

      82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      82:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]

   Check the NUMA node id of the PCI device:

   .. code-block:: console

      cat /sys/bus/pci/devices/0000\:xx\:00.x/numa_node

   Usually ``0x:00.x`` is on socket 0 and ``8x:00.x`` is on socket 1.
   **Note**: To get the best performance, ensure that the cores and NICs are on the same socket.
   In the example above ``85:00.0`` is on socket 1 and should be used by cores on socket 1 for the best performance.

#. Check which kernel drivers need to be loaded and whether there is a need to unbind the network ports from their kernel drivers.
   For more details about DPDK setup and Linux kernel requirements, see :ref:`linux_gsg_compiling_dpdk` and :ref:`linux_gsg_linux_drivers`.
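
The NUMA node lookup from the steps above can be wrapped in a small helper so that it is easy to repeat for every port.
The following is only a sketch: ``nic_numa_node`` is a hypothetical helper name, and its optional second argument (an alternative sysfs root) exists only so that the function can be tried against a test directory.
Note that the kernel reports ``-1`` in ``numa_node`` when it has no NUMA information for the device.

.. code-block:: sh

   # Hypothetical helper: prints the NUMA node of a PCI device.
   # $1: full PCI address, for example 0000:85:00.0
   # $2: sysfs mount point (optional, defaults to /sys)
   nic_numa_node() {
       pci_addr=$1
       sysfs_root=${2:-/sys}
       node_file="$sysfs_root/bus/pci/devices/$pci_addr/numa_node"
       if [ -r "$node_file" ]; then
           cat "$node_file"
       else
           echo "no numa_node entry for $pci_addr" >&2
           return 1
       fi
   }

It could be used as ``nic_numa_node 0000:85:00.0`` and the result compared with the socket layout reported by ``usertools/cpu_layout.py`` or ``lscpu``.
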