.. SPDX-License-Identifier: BSD-3-Clause
   Copyright(c) 2010-2014 Intel Corporation.

.. include:: <isonum.txt>

.. _Enabling_Additional_Functionality:

Enabling Additional Functionality
=================================

.. _Running_Without_Root_Privileges:

Running DPDK Applications Without Root Privileges
-------------------------------------------------

The following sections describe generic requirements and configuration
for running DPDK applications as non-root.
There may be additional requirements documented for some drivers.

Hugepages
~~~~~~~~~

Hugepages must be reserved as root before running the application as non-root,
for example::

  sudo dpdk-hugepages.py --reserve 1G

If multi-process is not required, running with ``--in-memory``
bypasses the need to access the hugepage mount point and the files within it.
Otherwise, the hugepage directory must be made writable
by the unprivileged user.
A good way to manage multiple applications using hugepages
is to mount the filesystem with group permissions
and add a supplementary group to each application or container.

One option is to use the script provided by this project::

  export HUGEDIR=$HOME/huge-1G
  mkdir -p $HUGEDIR
  sudo dpdk-hugepages.py --mount --directory $HUGEDIR --user `id -u` --group `id -g`

In a production environment, the OS can manage mount points
(`systemd example <https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount>`_).

The ``hugetlb`` filesystem has additional options to guarantee or limit
the amount of memory that can be allocated using the mount point.
Refer to the `documentation <https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_.

.. note::

   Using ``vfio-pci`` kernel driver, if applicable, can eliminate the need
   for physical addresses and therefore eliminate the permission requirements
   described below.

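
Before launching an application as an unprivileged user, it can help to confirm
that hugepages are reserved and that the hugepage directory is writable.
The following is a minimal sketch, not part of DPDK:
the function names ``hugepages_total`` and ``hugedir_writable`` are hypothetical,
and the optional file argument exists only to make the helper easy to exercise.

```shell
# Hypothetical helpers for illustration; they are not shipped with DPDK.

# Print the HugePages_Total value from a meminfo-style file.
# Defaults to /proc/meminfo, which reports only the default hugepage size.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed if the given directory exists and is writable by the current user.
hugedir_writable() {
    [ -d "$1" ] && [ -w "$1" ]
}
```

For instance, ``hugedir_writable "$HUGEDIR" || echo "cannot write to $HUGEDIR" >&2``
reports a permission problem before EAL initialization does.
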

If the driver requires using physical addresses (PA),
the executable file must be granted additional capabilities:

* ``DAC_READ_SEARCH`` and ``SYS_ADMIN`` to read ``/proc/self/pagemap``
* ``IPC_LOCK`` to lock hugepages in memory

.. code-block:: console

   setcap cap_dac_read_search,cap_ipc_lock,cap_sys_admin+ep <executable>

If physical addresses are not accessible,
the following message will appear during EAL initialization::

  EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied

It is harmless if physical addresses are not needed.

Resource Limits
~~~~~~~~~~~~~~~

When running as a non-root user, the system may impose additional resource limits.
Specifically, the following resource limits may
need to be adjusted in order to ensure normal DPDK operation:

* RLIMIT_LOCKS (number of file locks that can be held by a process)

* RLIMIT_NOFILE (number of file descriptors that can be held open by a process)

* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have)

The above limits can usually be adjusted by editing
the ``/etc/security/limits.conf`` file and rebooting.

See the :ref:`Hugepage Mapping <hugepage_mapping>` section to learn how these limits affect EAL.

Device Control
~~~~~~~~~~~~~~

If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.

For the ``vfio-pci`` kernel driver, the permissions of the following
Linux file system objects should be adjusted:

* The VFIO device file, ``/dev/vfio/vfio``

* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
  devices intended to be used by DPDK, for example, ``/dev/vfio/50``

Power Management and Power Saving Functionality
-----------------------------------------------

Enhanced Intel SpeedStep\ |reg| Technology must be enabled in the platform BIOS if the power management feature of DPDK is to be used.
Otherwise, the sysfs directory ``/sys/devices/system/cpu/cpu0/cpufreq`` will not exist, and CPU frequency-based power management cannot be used.
Consult the relevant BIOS documentation to determine how these settings can be accessed.

For example, on some Intel reference platform BIOS variants, the path to Enhanced Intel SpeedStep\ |reg| Technology is::

   Advanced
   -> Processor Configuration
   -> Enhanced Intel SpeedStep® Tech

In addition, C3 and C6 should also be enabled for power management. The paths to C3 and C6 on the same platform BIOS are::

   Advanced
   -> Processor Configuration
   -> Processor C3

   Advanced
   -> Processor Configuration
   -> Processor C6

Using Linux Core Isolation to Reduce Context Switches
-----------------------------------------------------

While the threads used by a DPDK application are pinned to logical cores on the system,
it is possible for the Linux scheduler to run other tasks on those cores.
To help prevent additional workloads, timers, RCU processing and IRQs
from running on those cores, it is possible to use
the Linux kernel parameters ``isolcpus``, ``nohz_full`` and ``irqaffinity``
to isolate them from the general Linux scheduler tasks.

For example, if a given CPU has cores 0-7
and DPDK applications are to run on logical cores 2, 4 and 6,
the following should be added to the kernel parameter list:

.. code-block:: console

   isolcpus=2,4,6 nohz_full=2,4,6 irqaffinity=0,1,3,5,7

.. note::

   More detailed information about the above parameters can be found at
   `NO_HZ <https://www.kernel.org/doc/html/latest/timers/no_hz.html>`_,
   `IRQ <https://www.kernel.org/doc/html/latest/core-api/irq/>`_,
   and `kernel parameters
   <https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html>`_.

For more fine-grained control over resource management and performance tuning,
see Linux cgroups,
`cpusets <https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/cpusets.html>`_,
`cpuset man pages <https://man7.org/linux/man-pages/man7/cpuset.7.html>`_, and
`systemd CPU affinity <https://www.freedesktop.org/software/systemd/man/systemd.exec.html>`_.

Also see
`CPU isolation example <https://www.suse.com/c/cpu-isolation-practical-example-part-5/>`_
and `systemd core isolation example <https://www.rcannings.com/systemd-core-isolation/>`_.

.. _High_Precision_Event_Timer:

High Precision Event Timer (HPET) Functionality
-----------------------------------------------

DPDK can support the system HPET as a timer source rather than the system default timers,
such as the core Time-Stamp Counter (TSC) on x86 systems.
To enable HPET support in DPDK:

#. Ensure that HPET is enabled in BIOS settings.
#. Enable ``HPET_MMAP`` support in kernel configuration.
   Note that this may involve doing a kernel rebuild,
   as many common Linux distributions do *not* have this setting
   enabled by default in their kernel builds.
#. Enable DPDK support for HPET by using the build-time meson option ``use_hpet``,
   for example, ``meson configure -Duse_hpet=true``

For an application to use the ``rte_get_hpet_cycles()`` and ``rte_get_hpet_hz()`` API calls,
and optionally to make the HPET the default time source for the rte_timer library,
the ``rte_eal_hpet_init()`` API call should be called at application initialization.
This API call will ensure that the HPET is accessible,
returning an error to the application if it is not.

For applications that require timing APIs, but not the HPET timer specifically,
it is recommended that the ``rte_get_timer_cycles()`` and ``rte_get_timer_hz()``
API calls be used instead of the HPET-specific APIs.
These generic APIs can work with either TSC or HPET time sources,
depending on what is requested by an application call to ``rte_eal_hpet_init()``,
if any, and on what is available on the system at runtime.
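
The system-level prerequisites listed in this section can be checked from the shell
before relying on ``rte_eal_hpet_init()``.
The following is a minimal sketch under stated assumptions:
the function names are hypothetical, and the kernel configuration is assumed
to be available as a plain-text file (commonly ``/boot/config-$(uname -r)``,
although the location varies between distributions).

```shell
# Hypothetical helpers for illustration; they are not shipped with DPDK.

# Succeed if the given kernel config file sets CONFIG_HPET_MMAP=y.
# The path is an assumption, for example /boot/config-$(uname -r).
hpet_mmap_enabled() {
    grep -q '^CONFIG_HPET_MMAP=y$' "$1"
}

# Succeed if the HPET device node exists and is readable by the current user.
# Defaults to /dev/hpet when no path is given.
hpet_dev_accessible() {
    [ -e "${1:-/dev/hpet}" ] && [ -r "${1:-/dev/hpet}" ]
}
```

A failure of either check suggests that ``rte_eal_hpet_init()`` is likely to return an error,
in which case the generic timer APIs remain usable with the TSC.
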