..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

.. include:: <isonum.txt>

.. _Enabling_Additional_Functionality:

Enabling Additional Functionality
=================================

.. _Running_Without_Root_Privileges:

Running DPDK Applications Without Root Privileges
-------------------------------------------------

The following sections describe generic requirements and configuration
for running DPDK applications as non-root.
There may be additional requirements documented for some drivers.

Hugepages
~~~~~~~~~

Hugepages must be reserved as root before running the application as non-root,
for example::

  sudo dpdk-hugepages.py --reserve 1G

If multi-process is not required, running with ``--in-memory``
bypasses the need to access the hugepage mount point and the files within it.
Otherwise, the hugepage directory must be made writable
by the unprivileged user.
A good way to manage multiple applications using hugepages
is to mount the filesystem with group permissions
and add a supplementary group to each application or container.

One option is to use the script provided by this project::

  export HUGEDIR=$HOME/huge-1G
  mkdir -p $HUGEDIR
  sudo dpdk-hugepages.py --mount --directory $HUGEDIR --user `id -u` --group `id -g`

In a production environment, the OS can manage mount points
(`systemd example <https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount>`_).

The ``hugetlb`` filesystem has additional options to guarantee or limit
the amount of memory that can be allocated using the mount point.
Refer to the `documentation <https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_.
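
As an illustration of these options, the sketch below composes a mount option
string that uses ``size`` to cap, and ``min_size`` to guarantee, the memory
available through one mount point.
The helper name ``hugetlbfs_opts``, the sizes and the ``0770`` mode are
illustrative assumptions, not project defaults.
The command is only printed, so that it can be reviewed before being run as root.

.. code-block:: shell

   # Hypothetical helper: build a hugetlbfs mount option string.
   # size=     upper limit on memory that can be allocated through the mount
   # min_size= amount of memory guaranteed (reserved) for the mount
   # gid/mode  restrict access to the members of one group
   hugetlbfs_opts() {
       printf 'pagesize=%s,size=%s,min_size=%s,gid=%s,mode=0770\n' "$1" "$2" "$3" "$4"
   }

   # Print the resulting command instead of running it.
   echo "sudo mount -t hugetlbfs -o $(hugetlbfs_opts 1G 4G 2G "$(id -g)") nodev $HOME/huge-1G"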

.. note::

   Using ``vfio-pci`` kernel driver, if applicable, can eliminate the need
   for physical addresses and therefore eliminate the permission requirements
   described below.

If the driver requires using physical addresses (PA),
the executable file must be granted additional capabilities:

* ``DAC_READ_SEARCH`` and ``SYS_ADMIN`` to read ``/proc/self/pagemap``
* ``IPC_LOCK`` to lock hugepages in memory

.. code-block:: console

   setcap cap_dac_read_search,cap_ipc_lock,cap_sys_admin+ep <executable>

If physical addresses are not accessible,
the following message will appear during EAL initialization::

  EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied

This message is harmless if physical addresses are not needed.

Resource Limits
~~~~~~~~~~~~~~~

When running as a non-root user, the system may impose additional resource limits.
Specifically, the following resource limits may
need to be adjusted in order to ensure normal DPDK operation:

* RLIMIT_LOCKS (number of file locks that can be held by a process)

* RLIMIT_NOFILE (number of file descriptors that can be held open by a process)

* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have)

The above limits can usually be adjusted by editing
the ``/etc/security/limits.conf`` file and rebooting.

See the :ref:`Hugepage Mapping <hugepage_mapping>` section to learn how these limits affect EAL.
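
As a sketch, the soft limits in effect can be inspected from the shell
that will launch the application.
The example assumes bash, whose ``ulimit`` builtin reports these three limits
through the ``-x``, ``-n`` and ``-l`` options.
The ``limits.conf`` lines in the comments use a hypothetical user name,
``dpdkuser``, and illustrative values.

.. code-block:: shell

   #!/usr/bin/env bash
   # Illustrative /etc/security/limits.conf entries (hypothetical user "dpdkuser"):
   #   dpdkuser  -  locks    unlimited
   #   dpdkuser  -  nofile   65536
   #   dpdkuser  -  memlock  unlimited

   # Print the soft limits of the current shell for the three resources above.
   show_dpdk_limits() {
       echo "RLIMIT_LOCKS   (ulimit -x): $(ulimit -S -x)"
       echo "RLIMIT_NOFILE  (ulimit -n): $(ulimit -S -n)"
       echo "RLIMIT_MEMLOCK (ulimit -l, KiB): $(ulimit -S -l)"
   }

   show_dpdk_limits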

Device Control
~~~~~~~~~~~~~~

If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.

For the ``vfio-pci`` kernel driver, the permissions of the following
Linux file system objects should be adjusted:

* The VFIO device file, ``/dev/vfio/vfio``

* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
  devices intended to be used by DPDK, for example, ``/dev/vfio/50``
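
To find which entry under ``/dev/vfio`` belongs to a given device,
the IOMMU group number can be read from sysfs.
The following is a minimal sketch: the PCI address ``0000:01:00.0``,
the function name ``iommu_group_of`` and the ``SYSFS_ROOT`` variable
are hypothetical, and the latter exists only so the function can be
exercised without real hardware.

.. code-block:: shell

   #!/usr/bin/env bash
   # Print the IOMMU group number of a PCI device, read from its sysfs symlink.
   # SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
   iommu_group_of() {
       local link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/iommu_group"
       [ -L "$link" ] || return 1
       basename "$(readlink "$link")"
   }

   pci_addr="0000:01:00.0"   # hypothetical device address
   if group="$(iommu_group_of "$pci_addr")"; then
       # Show the two entries whose ownership or mode must allow the unprivileged user.
       ls -l /dev/vfio/vfio "/dev/vfio/${group}" || echo "VFIO device files not present" >&2
   else
       echo "No IOMMU group found for ${pci_addr}" >&2
   fi

Ownership or mode of the listed entries can then be changed as root,
for example with ``chown`` or ``chmod``.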

Power Management and Power Saving Functionality
-----------------------------------------------

Enhanced Intel SpeedStep\ |reg| Technology must be enabled in the platform BIOS if the power management feature of DPDK is to be used.
Otherwise, the sysfs directory ``/sys/devices/system/cpu/cpu0/cpufreq`` will not exist, and CPU frequency-based power management cannot be used.
Consult the relevant BIOS documentation to determine how these settings can be accessed.
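
Whether the directory exists can be checked from a shell before starting
the application.
This is a small sketch; the function name ``check_cpufreq`` is hypothetical,
and the optional argument only allows another directory to be checked.

.. code-block:: shell

   #!/usr/bin/env bash
   # Report whether the cpufreq sysfs directory of CPU 0 exists.
   check_cpufreq() {
       local dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
       if [ -d "$dir" ]; then
           echo "cpufreq available, driver: $(cat "$dir/scaling_driver" 2>/dev/null)"
       else
           echo "cpufreq not available: $dir does not exist"
           return 1
       fi
   }

   check_cpufreq || true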

For example, on some Intel reference platform BIOS variants, the path to Enhanced Intel SpeedStep\ |reg| Technology is::

   Advanced
     -> Processor Configuration
     -> Enhanced Intel SpeedStep\ |reg| Tech

In addition, C3 and C6 should also be enabled for power management. On the same platform BIOS, the paths to C3 and C6 are::

   Advanced
     -> Processor Configuration
     -> Processor C3

   Advanced
     -> Processor Configuration
     -> Processor C6

Using Linux Core Isolation to Reduce Context Switches
-----------------------------------------------------

While the threads used by a DPDK application are pinned to logical cores on the system,
it is possible for the Linux scheduler to run other tasks on those cores.
To help prevent additional workloads, timers, RCU processing and IRQs
from running on those cores, it is possible to use
the Linux kernel parameters ``isolcpus``, ``nohz_full``, ``irqaffinity``
to isolate them from the general Linux scheduler tasks.

For example, if a given CPU has cores 0-7
and DPDK applications are to run on logical cores 2, 4 and 6,
the following should be added to the kernel parameter list:

.. code-block:: console

   isolcpus=2,4,6 nohz_full=2,4,6 irqaffinity=0,1,3,5,7

.. note::

   More detailed information about the above parameters can be found at
   `NO_HZ <https://www.kernel.org/doc/html/latest/timers/no_hz.html>`_,
   `IRQ <https://www.kernel.org/doc/html/latest/core-api/irq/>`_,
   and `kernel parameters
   <https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html>`_.
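
After rebooting with these parameters, the result can be checked from a shell.
The sketch below assumes a kernel that exposes the ``isolated`` and ``nohz_full``
CPU lists in sysfs; the helper name ``show_cpu_list`` is hypothetical.

.. code-block:: shell

   #!/usr/bin/env bash
   # Print the content of a kernel-provided file, or a placeholder if it is absent.
   show_cpu_list() {
       local name="$1" file="$2"
       if [ -r "$file" ]; then
           echo "${name}: $(cat "$file")"
       else
           echo "${name}: <not available>"
       fi
   }

   # Parameters the running kernel was booted with.
   cat /proc/cmdline
   show_cpu_list isolated  /sys/devices/system/cpu/isolated
   show_cpu_list nohz_full /sys/devices/system/cpu/nohz_full
   # Default IRQ affinity, reported as a hexadecimal CPU mask.
   show_cpu_list irq_default_affinity /proc/irq/default_smp_affinity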

For more fine-grained control over resource management and performance tuning,
one can look into "Linux cgroups",
`cpusets <https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/cpusets.html>`_,
`cpuset man pages <https://man7.org/linux/man-pages/man7/cpuset.7.html>`_, and
`systemd CPU affinity <https://www.freedesktop.org/software/systemd/man/systemd.exec.html>`_.

Also see
`CPU isolation example <https://www.suse.com/c/cpu-isolation-practical-example-part-5/>`_
and `systemd core isolation example <https://www.rcannings.com/systemd-core-isolation/>`_.

.. _High_Precision_Event_Timer:

High Precision Event Timer (HPET) Functionality
-----------------------------------------------

DPDK can support the system HPET as a timer source rather than the system default timers,
such as the core Time-Stamp Counter (TSC) on x86 systems.
To enable HPET support in DPDK:

#. Ensure that HPET is enabled in BIOS settings.
#. Enable ``HPET_MMAP`` support in kernel configuration.
   Note that this may involve doing a kernel rebuild,
   as many common Linux distributions do *not* have this setting
   enabled by default in their kernel builds.
#. Enable DPDK support for HPET by using the build-time meson option ``use_hpet``,
   for example, ``meson configure -Duse_hpet=true``.
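
The kernel-side prerequisites can be checked before rebuilding anything.
The sketch below assumes that the running kernel's configuration is installed
as ``/boot/config-$(uname -r)``, which varies between distributions;
the function name ``hpet_mmap_enabled`` is hypothetical.

.. code-block:: shell

   #!/usr/bin/env bash
   # Succeed if the given kernel config file enables HPET_MMAP.
   # The default path is an assumption and differs between distributions.
   hpet_mmap_enabled() {
       local config="${1:-/boot/config-$(uname -r)}"
       grep -q '^CONFIG_HPET_MMAP=y' "$config" 2>/dev/null
   }

   if hpet_mmap_enabled; then
       echo "CONFIG_HPET_MMAP is enabled"
   else
       echo "CONFIG_HPET_MMAP not found; a kernel rebuild may be required"
   fi

   # The device node must exist and be accessible to the user running DPDK.
   ls -l /dev/hpet 2>/dev/null || echo "/dev/hpet is not present"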

For an application to use the ``rte_get_hpet_cycles()`` and ``rte_get_hpet_hz()`` API calls,
and optionally to make the HPET the default time source for the rte_timer library,
the ``rte_eal_hpet_init()`` API call should be called at application initialization.
This API call will ensure that the HPET is accessible,
returning an error to the application if it is not.

For applications that require timing APIs, but not the HPET timer specifically,
it is recommended that the ``rte_get_timer_cycles()`` and ``rte_get_timer_hz()``
API calls be used instead of the HPET-specific APIs.
These generic APIs can work with either TSC or HPET time sources,
depending on what is requested by an application call to ``rte_eal_hpet_init()``,
if any, and on what is available on the system at runtime.