..  BSD LICENSE
    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

.. _Enabling_Additional_Functionality:

Enabling Additional Functionality
=================================

.. _High_Precision_Event_Timer:

High Precision Event Timer (HPET) Functionality
-----------------------------------------------

BIOS Support
~~~~~~~~~~~~

The High Precision Event Timer (HPET) must be enabled in the platform BIOS if the HPET is to be used.
Otherwise, the Time Stamp Counter (TSC) is used by default.
The BIOS is typically accessed by pressing F2 while the platform is starting up.
The user can then navigate to the HPET option. On the Crystal Forest platform BIOS, the path is:
**Advanced -> PCH-IO Configuration -> High Precision Timer ->** (Change from Disabled to Enabled if necessary).

On a system that has already booted, the following command can be issued to check if HPET is enabled:

.. code-block:: console

    # grep hpet /proc/timer_list

If no entries are returned, HPET must be enabled in the BIOS (as per the instructions above) and the system rebooted.

Linux Kernel Support
~~~~~~~~~~~~~~~~~~~~

The Intel® DPDK makes use of the platform HPET by mapping the timer counter into the process address space, and as such,
requires that the HPET_MMAP kernel configuration option be enabled.

.. warning::

    On Fedora*, and other common distributions such as Ubuntu*, the HPET_MMAP kernel option is not enabled by default.
    To recompile the Linux kernel with this option enabled, please consult the distribution's documentation for the relevant instructions.

Enabling HPET in the Intel® DPDK
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, HPET support is disabled in the Intel® DPDK build configuration files.
To use HPET, the CONFIG_RTE_LIBEAL_USE_HPET setting should be changed to “y”, which will enable the HPET settings at compile time.

For an application to use the rte_get_hpet_cycles() and rte_get_hpet_hz() API calls,
and optionally to make the HPET the default time source for the rte_timer library,
the new rte_eal_hpet_init() API call should be called at application initialization.
This API call will ensure that the HPET is accessible, returning an error to the application if it is not,
for example, if HPET_MMAP is not enabled in the kernel.
The application can then determine what action to take, if any, if the HPET is not available at runtime.

.. note::

    For applications that require timing APIs, but not the HPET timer specifically,
    it is recommended that the rte_get_timer_cycles() and rte_get_timer_hz() API calls be used instead of the HPET-specific APIs.
    These generic APIs can work with either TSC or HPET time sources, depending on what is requested by an application call to rte_eal_hpet_init(),
    if any, and on what is available on the system at runtime.

Running Intel® DPDK Applications Without Root Privileges
--------------------------------------------------------

Although applications using the Intel® DPDK use network ports and other hardware resources directly,
with a number of small permission adjustments it is possible to run these applications as a user other than “root”.
To do so, the ownership, or permissions, on the following Linux file system objects should be adjusted to ensure that
the Linux user account being used to run the Intel® DPDK application has access to them:

* All directories which serve as hugepage mount points, for example, /mnt/huge

* The userspace-io device files in /dev, for example, /dev/uio0, /dev/uio1, and so on

* If the HPET is to be used, /dev/hpet

.. note::

    On some Linux installations, /dev/hugepages is also a hugepage mount point created by default.

Power Management and Power Saving Functionality
-----------------------------------------------

Enhanced Intel SpeedStep® Technology must be enabled in the platform BIOS if the power management feature of Intel® DPDK is to be used.
Otherwise, the sysfs directory /sys/devices/system/cpu/cpu0/cpufreq will not exist, and the CPU frequency-based power management cannot be used.
Consult the relevant BIOS documentation to determine how these settings can be accessed.

For example, on some Intel reference platform BIOS variants, the path to Enhanced Intel SpeedStep® Technology is:

**Advanced->Processor Configuration->Enhanced Intel SpeedStep® Tech**

In addition, C3 and C6 should be enabled as well for power management. The paths to C3 and C6 on the same platform BIOS are:

**Advanced->Processor Configuration->Processor C3**

**Advanced->Processor Configuration->Processor C6**

Using Linux* Core Isolation to Reduce Context Switches
------------------------------------------------------

While the threads used by an Intel® DPDK application are pinned to logical cores on the system,
it is possible for the Linux scheduler to run other tasks on those cores also.
To help prevent additional workloads from running on those cores,
it is possible to use the isolcpus Linux* kernel parameter to isolate them from the general Linux scheduler.

For example, if Intel® DPDK applications are to run on logical cores 2, 4 and 6,
the following should be added to the kernel parameter list:

.. code-block:: console

    isolcpus=2,4,6

Loading the Intel® DPDK KNI Kernel Module
-----------------------------------------

To run the Intel® DPDK Kernel NIC Interface (KNI) sample application, an extra kernel module (the kni module) must be loaded into the running kernel.
The module is found in the kmod sub-directory of the Intel® DPDK target directory.
Similar to the loading of the igb_uio module, this module should be loaded using the insmod command as shown below
(assuming that the current directory is the Intel® DPDK target directory):

.. code-block:: console

    # insmod kmod/rte_kni.ko

.. note::

    See the “Kernel NIC Interface Sample Application” chapter in the *Intel® DPDK Sample Applications User Guide* for more details.
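
After running insmod for the KNI module, it can be useful to confirm that the module is actually present in the running kernel. The following is a minimal sketch of one way to do that by scanning /proc/modules. The helper name ``is_module_loaded`` is hypothetical and not part of the Intel® DPDK, and the sketch assumes that the module registers under the name rte_kni, matching the rte_kni.ko file name.

```python
def is_module_loaded(name, modules_text):
    """Return True if `name` is listed as a module in /proc/modules-style text.

    Hypothetical helper for illustration only; not part of the DPDK.
    """
    # Each line of /proc/modules starts with the module name, followed by
    # whitespace-separated fields such as size and reference count.
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of loaded modules (Linux only)."""
    with open(path) as f:
        return f.read()


# Example use on a Linux host (module name rte_kni is an assumption):
#     is_module_loaded("rte_kni", read_proc_modules())
```

Matching on the first field only avoids false positives from modules that merely list rte_kni as a dependency elsewhere on their line.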

Using Linux IOMMU Pass-Through to Run Intel® DPDK with Intel® VT-d
------------------------------------------------------------------

To enable Intel® VT-d in a Linux kernel, a number of kernel configuration options must be set. These include:

* IOMMU_SUPPORT

* IOMMU_API

* INTEL_IOMMU

In addition, to run the Intel® DPDK with Intel® VT-d, the iommu=pt kernel parameter must be used when using the igb_uio driver.
This results in pass-through of the DMAR (DMA Remapping) lookup in the host.
Also, if INTEL_IOMMU_DEFAULT_ON is not set in the kernel, the intel_iommu=on kernel parameter must be used too.
This ensures that the Intel IOMMU is initialized as expected.

Please note that while using iommu=pt is compulsory for the igb_uio driver, the vfio-pci driver can actually work with both iommu=pt and iommu=on.

High Performance of Small Packets on 40G NIC
--------------------------------------------

Enabling Extended Tag and Setting Max Read Request Size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The PCI configurations extended_tag and max_read_request_size have a big impact on the performance of small packets on a 40G NIC.
Enabling extended_tag and setting max_read_request_size to a small size such as 128 bytes helps considerably with small-packet performance.

* These can be done in some BIOS implementations.

* For other BIOS implementations, the PCI configurations can be changed by using the setpci command, or through special configurations in the DPDK config file common_linux.

  * Bits 7:5 at address 0xA8 of each PCI device are used for setting max_read_request_size,
    and bit 8 of 0xA8 of each PCI device is used for enabling/disabling extended_tag.
    lspci and setpci can be used to read the value at 0xA8 and then write it back after it has been changed.

  * In the config file common_linux, the following three configurations can be changed for the same purpose.

    CONFIG_RTE_PCI_CONFIG

    CONFIG_RTE_PCI_EXTENDED_TAG

    CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE

Use 16 Bytes RX Descriptor Size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The i40e PMD supports both 16 and 32 bytes RX descriptor sizes, and the 16 bytes size can help improve the performance of small packets.
The CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC setting in the config files can be changed to use 16 bytes RX descriptors.
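
For the extended_tag and max_read_request_size settings described earlier in this section, the read-modify-write step on the value at 0xA8 can be sketched as below. This is an illustrative calculation only: it takes the bit positions exactly as stated in the text (bits 7:5 and bit 8), and it assumes that the size field uses a power-of-two encoding in which 128 bytes is encoded as 0. Both should be checked against the device datasheet before writing anything back with setpci. The function names are hypothetical.

```python
MAX_READ_REQ_SHIFT = 5                         # bits 7:5, as stated in the text
MAX_READ_REQ_MASK = 0x7 << MAX_READ_REQ_SHIFT
EXTENDED_TAG_BIT = 1 << 8                      # bit 8, as stated in the text

# Assumed encoding of the size field: index in this list is the field value.
ALLOWED_SIZES = [128, 256, 512, 1024, 2048, 4096]


def encode_read_request_size(size_bytes):
    """Map a size in bytes to the assumed 3-bit field value (128 -> 0)."""
    if size_bytes not in ALLOWED_SIZES:
        raise ValueError("unsupported max read request size: %r" % size_bytes)
    return ALLOWED_SIZES.index(size_bytes)


def update_register(value, size_bytes=128, extended_tag=True):
    """Return the register value with the two settings applied.

    Only the size field and the extended tag bit are changed; all other
    bits of `value` are preserved.
    """
    value &= ~MAX_READ_REQ_MASK                # clear the size field
    value |= encode_read_request_size(size_bytes) << MAX_READ_REQ_SHIFT
    if extended_tag:
        value |= EXTENDED_TAG_BIT
    else:
        value &= ~EXTENDED_TAG_BIT
    return value


# Example: a value read back as 0x00f0 becomes 0x0110 with the defaults.
print(hex(update_register(0x00F0)))
```

Keeping the calculation separate from the actual register access makes it easy to review the new value before it is written to the device.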