..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

L3 Forwarding with Power Management Sample Application
======================================================

Introduction
------------

The L3 forwarding with Power Management application is an example
of power-aware packet processing using the DPDK.
The application is based on the existing L3 forwarding sample application,
with the power management algorithms to control the P-states and
C-states of the Intel processor via a power management library.

Overview
--------

The application demonstrates the use of the Power libraries in the DPDK to implement packet forwarding.
The initialization and run-time paths are very similar to those of the :doc:`l3_forward`.
The main difference from the L3 Forwarding sample application is that this application introduces power-aware optimization algorithms
by leveraging the Power library to control the P-state and C-state of the processor based on packet load.

The DPDK includes poll-mode drivers (PMDs) to configure Intel NIC devices and their receive (Rx) and transmit (Tx) queues.
The design principle of a PMD is to access the Rx and Tx descriptors directly without any interrupts to quickly receive,
process and deliver packets in the user space.

In general, the DPDK executes an endless packet processing loop on dedicated IA cores that includes the following steps:

*   Retrieve input packets by polling the Rx queue through the PMD

*   Process each received packet or provide received packets to other processing cores through software queues

*   Send pending output packets to the Tx queue through the PMD

In this way, the PMD achieves better performance than a traditional interrupt-mode driver,
at the cost of keeping cores active and running at the highest frequency,
hence consuming the maximum power all the time.
However, during periods of light network traffic,
which happen regularly in communication infrastructure systems due to the well-known "tidal effect",
the PMD is still busy waiting for network packets, which wastes a lot of power.

Processor performance states (P-states) are the capability of an Intel processor
to switch between different supported operating frequencies and voltages.
If configured correctly, according to system workload, this feature provides power savings.
CPUFreq is the infrastructure provided by the Linux* kernel to control the processor performance state capability.
CPUFreq supports a user space governor that enables setting the frequency by manipulating a virtual file device from a user space application.
The Power library in the DPDK provides a set of APIs for manipulating this virtual file device, allowing a user space application
to set the CPUFreq governor and set the frequency of specific cores.

This application includes a P-state power management algorithm to generate a frequency hint to be sent to CPUFreq.
The algorithm uses the number of received and available Rx packets on recent polls to make a heuristic decision to scale frequency up/down.
Specifically, thresholds are checked to see whether a core running a DPDK polling thread needs to increase its frequency
by one step, based on how close to full the polled Rx queues are trending.
Conversely, the frequency is decreased by one step if the number of packets processed per loop is far below the expected threshold
or the thread's sleeping time exceeds a threshold.

C-states are also known as sleep states.
They allow software to put an Intel core into a low power idle state from which it is possible to exit via an event, such as an interrupt.
However, there is a tradeoff between the power consumed in the idle state and the time required to wake up from the idle state (exit latency).
Therefore, as you go into deeper C-states, the power consumed is lower but the exit latency is increased. Each C-state has a target residency.
It is essential that when entering a C-state, the core remains in this C-state for at least as long as the target residency in order
to fully realize the benefits of entering the C-state.
CPUIdle is the infrastructure provided by the Linux kernel to control the processor C-state capability.
Unlike CPUFreq, CPUIdle does not provide a mechanism that allows the application to change the C-state.
It has its own heuristic algorithms in kernel space to select the target C-state to enter by executing privileged instructions like HLT and MWAIT,
based on the speculative sleep duration of the core.
In this application, we introduce a heuristic algorithm that allows packet processing cores to sleep for a short period
if there is no Rx packet received on recent polls.
In this way, CPUIdle automatically forces the corresponding cores to enter deeper C-states
instead of always running in the C0 state waiting for packets.
However, the user can set the CPU resume latency to control C-state selection.
Setting the CPU resume latency to 0
restricts the CPU to the C0 state to improve performance,
which may increase the power consumption of the platform.

.. note::

    To fully demonstrate the power saving capability of using C-states,
    it is recommended to enable deeper C3 and C6 states in the BIOS during system boot up.

Compiling the Application
-------------------------

To compile the sample application, see :doc:`compiling`.

The application is located in the ``l3fwd-power`` sub-directory.

Running the Application
-----------------------

The application has a number of command line options:

.. code-block:: console

    ./<build_dir>/examples/dpdk-l3fwd-power [EAL options] -- -p PORTMASK [-P] --config (port,queue,lcore)[,(port,queue,lcore)] [--max-pkt-len PKTLEN] [--no-numa]

where,

*   -p PORTMASK: Hexadecimal bitmask of ports to configure

*   -P: Sets all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address.
    Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted.

*   -u: optional, sets both the uncore minimum and maximum frequency to the minimum value.

*   -U: optional, sets both the uncore minimum and maximum frequency to the maximum value.

*   -i (frequency index): optional, sets the uncore frequency to the frequency index value, by setting the min and max values to be the same.

*   --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues from which ports are mapped to which cores.

*   --cpu-resume-latency LATENCY: sets the CPU resume latency to control C-state selection. A value of 0 allows the CPU to enter only the C0 state.

*   --max-pkt-len: optional, maximum packet length in decimal (64-9600)

*   --no-numa: optional, disables NUMA awareness

*   --telemetry: Telemetry mode.

*   --pmd-mgmt: PMD power management mode.

*   --max-empty-polls: Number of empty polls to wait before entering sleep state. Applies to --pmd-mgmt mode only.

*   --pause-duration: Set the duration of the pause callback (microseconds). Applies to --pmd-mgmt mode only.

*   --scale-freq-min: Set minimum frequency for scaling. Applies to --pmd-mgmt mode only.

*   --scale-freq-max: Set maximum frequency for scaling. Applies to --pmd-mgmt mode only.

The L3fwd-power example reuses the L3fwd command line options.
See :doc:`l3_forward` for details.

Explanation
-----------

The following sections provide an explanation of the sample application code.
As mentioned in the overview section,
the initialization and run-time paths are very similar to those of the L3 forwarding application,
so only the aspects that are specific to the L3 Forwarding with Power Management sample application are described.

Power Library Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Power library is initialized in the main routine.
It changes the P-state governor to userspace for specific cores that are under control.
The Timer library is also initialized and several timers are created later on.
These timers are responsible for checking, at run time, whether the frequency needs to be scaled down, based on CPU utilization statistics.

.. note::

    Only the power management related initialization is shown.

.. literalinclude:: ../../../examples/l3fwd-power/main.c
    :language: c
    :start-after: Power library initialized in the main routine. 8<
    :end-before: >8 End of power library initialization.

Monitoring Loads of Rx Queues
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In general, the polling nature of the DPDK prevents the OS power management subsystem from knowing
if the network load is actually heavy or light.
In this sample, the network load is sampled by monitoring received and
available descriptors on NIC Rx queues in recent polls.
Based on the number of returned and available Rx descriptors,
this example implements algorithms to generate frequency scaling hints and speculative sleep durations,
and uses them to control the P-state and C-state of processors via the power management library.
Frequency (P-state) control and sleep state (C-state) control work individually for each logical core,
and their combination contributes to a power efficient packet processing solution when serving light network loads.

The rte_eth_rx_burst() function and the newly-added rte_eth_rx_queue_count() function are used in the endless packet processing loop
to return the number of received and available Rx descriptors.
These numbers, for a specific queue, are passed to the P-state and C-state heuristic algorithms
to generate hints based on recent network load trends.

.. note::

    Only power control related code is shown.

.. literalinclude:: ../../../examples/l3fwd-power/main.c
    :language: c
    :start-after: Main processing loop. 8<
    :end-before: >8 End of main processing loop.

P-State Heuristic Algorithm
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The power_freq_scaleup_heuristic() function is responsible for generating a frequency hint for the specified logical core
according to the number of available descriptors returned from rte_eth_rx_queue_count().
On every poll for new packets, the number of available descriptors on an Rx queue is evaluated,
and the algorithm used for frequency hinting is as follows:

*   If the number of available descriptors exceeds 96, the maximum frequency is hinted.

*   If the number of available descriptors exceeds 64, a trend counter is incremented by 100.

*   If the number of available descriptors exceeds 32, the trend counter is incremented by 1.

*   When the trend counter reaches 10000, the frequency hint is changed to the next higher frequency.

.. note::

    The thresholds specified above assume an Rx queue size of 128.
    They must be adjusted according to the actual hardware Rx queue size,
    which is configured via the rte_eth_rx_queue_setup() function.
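The scale-up rules listed above can be sketched in plain C as follows.
This is a minimal illustration rather than the code from ``main.c``:
the names ``scaleup_hint``, ``freq_hint`` and the ``HINT_*``/``GEAR*``/``TREND_*`` constants are hypothetical,
the available descriptor count is passed in as a parameter instead of being read from rte_eth_rx_queue_count(),
and resetting the trend counter once a hint has been issued is an assumption.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical hint values, ordered from weakest to strongest. */
    enum freq_hint { HINT_CURRENT = 0, HINT_HIGHER = 1, HINT_HIGHEST = 2 };

    /* Thresholds from the list above; they assume an Rx queue size of 128. */
    #define GEAR1_THRESHOLD  32
    #define GEAR2_THRESHOLD  64
    #define GEAR3_THRESHOLD  96
    #define TREND_SMALL_STEP 1
    #define TREND_LARGE_STEP 100
    #define TREND_LIMIT      10000

    /*
     * Return a frequency hint for one Rx queue.
     * avail_desc: number of available Rx descriptors on this poll.
     * trend:      per-core trend counter, updated in place.
     */
    static enum freq_hint
    scaleup_hint(uint32_t avail_desc, uint32_t *trend)
    {
        if (avail_desc > GEAR3_THRESHOLD) {
            *trend = 0; /* assumption: start over after a hint */
            return HINT_HIGHEST;
        }

        if (avail_desc > GEAR2_THRESHOLD)
            *trend += TREND_LARGE_STEP;
        else if (avail_desc > GEAR1_THRESHOLD)
            *trend += TREND_SMALL_STEP;

        if (*trend >= TREND_LIMIT) {
            *trend = 0; /* assumption: start over after a hint */
            return HINT_HIGHER;
        }

        return HINT_CURRENT;
    }

With these values, a queue that stays just above 64 available descriptors needs 100 consecutive polls
to accumulate a trend of 10000 and trigger a one-step increase,
while a queue just above 32 needs 10000 polls.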

In general, a thread needs to poll packets from multiple Rx queues.
Most likely, different queues have different loads, so they return different frequency hints.
The algorithm evaluates all the hints and then scales up frequency in an aggressive manner
by scaling up to the highest frequency as long as one Rx queue requires it.
In this way, we can minimize any negative performance impact.
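One way to picture this aggressive policy is to keep the strongest hint seen across all queues polled by a thread.
The sketch below assumes hypothetical names (``combine_hints``, ``freq_hint``)
and an ordering in which a larger enum value means a stronger request; it is not taken from the application source.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical hint values, ordered from weakest to strongest. */
    enum freq_hint { HINT_CURRENT = 0, HINT_HIGHER = 1, HINT_HIGHEST = 2 };

    /* Return the strongest hint among the n per-queue hints. */
    static enum freq_hint
    combine_hints(const enum freq_hint *hints, size_t n)
    {
        enum freq_hint result = HINT_CURRENT;

        for (size_t i = 0; i < n; i++) {
            if (hints[i] > result)
                result = hints[i];
        }

        return result;
    }

A single busy queue is therefore enough to raise the frequency of the core, even if every other queue is idle.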

On the other hand, scaling the frequency down is controlled in the timer callback function.
Specifically, if the sleep times of a logical core indicate that it is sleeping more than 25% of the sampling period,
or if the average number of packets per iteration is less than expected, the frequency is decreased by one step.
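The scale-down condition can be expressed as a small predicate.
The following is a sketch under stated assumptions: the function name ``should_scale_down`` and all of its parameters are hypothetical,
the statistics are assumed to have been accumulated over one sampling period,
and the comparisons use integer cross-multiplication instead of whatever arithmetic the timer callback actually uses.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Decide whether a core should drop one frequency step.
     * sleep_us:               time the core spent sleeping in this period
     * period_us:              length of the sampling period
     * rx_packets, iterations: packets processed and loop iterations in this period
     * expected_pkts_per_iter: expected average packets per iteration
     */
    static bool
    should_scale_down(uint64_t sleep_us, uint64_t period_us,
                      uint64_t rx_packets, uint64_t iterations,
                      uint64_t expected_pkts_per_iter)
    {
        /* Sleeping for more than 25% of the period: sleep / period > 1 / 4. */
        if (sleep_us * 4 > period_us)
            return true;

        /* Average packets per iteration below the expectation. */
        if (iterations > 0 && rx_packets < expected_pkts_per_iter * iterations)
            return true;

        /* No iterations recorded is treated as "no evidence" here (an assumption). */
        return false;
    }

For example, with a 1000 μs period, 300 μs of sleep triggers a step down regardless of the packet rate,
whereas 100 μs of sleep only does so if the average number of packets per iteration is below the expected value.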

C-State Heuristic Algorithm
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Whenever recent rte_eth_rx_burst() polls return zero packets 5 consecutive times,
an idle counter begins incrementing for each successive zero poll.
At the same time, the function power_idle_heuristic() is called to generate a speculative sleep duration
in order to force the logical core to enter a deeper sleeping C-state.
There is no way to control the C-state directly, and the CPUIdle subsystem in the OS is intelligent enough
to select the C-state to enter based on the actual sleep period of the given logical core.
The algorithm has the following sleeping behavior depending on the idle counter:

*   If the idle count is less than 100, the counter value is used as a microsecond sleep value through rte_delay_us(),
    which executes pause instructions to avoid a costly context switch while still saving power.

*   If the idle count is between 100 and 999, a fixed sleep interval of 100 μs is used.
    A 100 μs sleep interval allows the core to enter the C1 state while keeping a fast response time in case new traffic arrives.

*   If the idle count is 1000 or greater, a fixed sleep value of 1 ms is used until the next timer expiration.
    This allows the core to enter the C3/C6 states.

.. note::

    The thresholds specified above need to be adjusted for different Intel processors and traffic profiles.
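The three tiers described above map an idle count to a sleep duration.
The sketch below follows the rules as listed in this section and is not a copy of power_idle_heuristic();
the function name ``idle_sleep_us`` is hypothetical,
and a count of exactly 1000 is assumed to fall into the last tier.

.. code-block:: c

    #include <stdint.h>

    /* Map the idle counter to a speculative sleep duration in microseconds. */
    static uint32_t
    idle_sleep_us(uint32_t idle_count)
    {
        if (idle_count < 100)
            return idle_count; /* short pause-based delay, no context switch */

        if (idle_count < 1000)
            return 100;        /* long enough for the C1 state */

        return 1000;           /* 1 ms, long enough for the C3/C6 states */
    }

The sleep time grows with the length of the idle streak,
so a brief gap in traffic costs only a few microseconds of latency, while a long lull lets the core reach deeper C-states.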

If a thread polls multiple Rx queues and different queues return different sleep duration values,
the algorithm controls the sleep time in a conservative manner by sleeping for the least possible time
in order to avoid a potential performance impact.
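This conservative choice amounts to taking the minimum of the per-queue sleep hints.
A minimal sketch, assuming a hypothetical helper ``min_sleep_us`` that receives the hints in an array:

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Return the shortest sleep duration among the n per-queue hints (0 if n is 0). */
    static uint32_t
    min_sleep_us(const uint32_t *sleep_us, size_t n)
    {
        uint32_t result = UINT32_MAX;

        for (size_t i = 0; i < n; i++) {
            if (sleep_us[i] < result)
                result = sleep_us[i];
        }

        return n > 0 ? result : 0;
    }

If one queue has only just gone idle, its short hint limits the sleep for the whole thread,
so packets arriving on that queue are not delayed by a longer hint from a quieter queue.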

Telemetry Mode
--------------

The telemetry mode support for ``l3fwd-power`` is a standalone mode. In this mode,
``l3fwd-power`` does simple l3fwding along with calculating empty polls, full polls,
and busy percentage for each forwarding core. The aggregation of these
values of all cores is reported as application level telemetry to the metrics
library every 500 ms from the main core.

The busy percentage is calculated by recording the poll_count.
When the count reaches a defined value, the total
cycles it took are measured and compared with minimum and maximum
reference cycles, and the busy rate is accordingly set to either 0%,
50% or 100%.
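The three-level classification can be pictured as a comparison against the two reference values.
The sketch below is an illustration only: the function name ``busy_percent`` and its parameters are hypothetical,
and the exact boundary handling (inclusive or exclusive) is an assumption, not taken from the application source.

.. code-block:: c

    #include <stdint.h>

    /*
     * Classify how busy a core was while completing a fixed number of polls.
     * cycles:  cycles spent on that batch of polls
     * min_ref: reference cycles for an idle core
     * max_ref: reference cycles for a fully loaded core
     */
    static unsigned int
    busy_percent(uint64_t cycles, uint64_t min_ref, uint64_t max_ref)
    {
        if (cycles >= max_ref)
            return 100;

        if (cycles > min_ref)
            return 50;

        return 0;
    }

The idea is that polls which return packets take longer than empty polls,
so a batch that consumes more cycles indicates a busier core.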

.. code-block:: console

    ./<build_dir>/examples/dpdk-l3fwd-power --telemetry -l 1-3 -- -p 0x0f --config="(0,0,2),(0,1,3)" --telemetry

The new stats ``empty_poll``, ``full_poll`` and ``busy_percent`` can be viewed by running the script
``/usertools/dpdk-telemetry-client.py`` and selecting the menu option ``Send for global Metrics``.

PMD Power Management Mode
-------------------------

The PMD power management mode support for ``l3fwd-power`` is a standalone mode.
In this mode, ``l3fwd-power`` does simple l3fwding
along with enabling the power saving scheme on specific port/queue/lcore.
The main purpose of this mode is to demonstrate
how to use the PMD power management API.

.. code-block:: console

    ./<build_dir>/examples/dpdk-l3fwd-power -l 1-3 -- --pmd-mgmt -p 0x0f --config="(0,0,2),(0,1,3)"

PMD Power Management Mode
-------------------------

There is also a traffic-aware operating mode that,
instead of using explicit power management,
will use automatic PMD power management.
This mode is limited to one queue per core,
and has the following available power management schemes:

``baseline``
  This mode will not enable any power saving features.

``monitor``
  This will use the ``rte_power_monitor()`` function to enter
  a power-optimized state (subject to platform support).

``pause``
  This will use ``rte_power_pause()`` or ``rte_pause()``
  to avoid busy looping when there is no traffic.

``scale``
  This will use frequency scaling routines
  available in the ``librte_power`` library.
  The reaction time of the scale mode is longer
  than that of the pause and monitor modes.

See the :doc:`Power Management<../prog_guide/power_man>` chapter
in the DPDK Programmer's Guide for more details on PMD power management.

.. code-block:: console

    ./<build_dir>/examples/dpdk-l3fwd-power -l 1-3 -- -p 0x0f --config="(0,0,2),(0,1,3)" --pmd-mgmt=scale

Setting Uncore Values
---------------------

Uncore frequency can be adjusted by manipulating the related sysfs entries
that control the minimum and maximum uncore values.
This will be set for each package and die on the SKU.
The driver for enabling this is available from kernel version 5.6 and above.
Three options are available for setting uncore frequency:

``-u``
  This will set the uncore minimum and maximum frequencies to the minimum possible value.

``-U``
  This will set the uncore minimum and maximum frequencies to the maximum possible value.

``-i``
  This will allow you to set the specific uncore frequency index that you want,
  by setting the uncore frequency to the frequency pointed to by the index.
  Frequency indexes are spaced 100 MHz apart from maximum to minimum.
  Frequency index values are in descending order,
  i.e., index 0 is the maximum frequency index.

.. code-block:: console

   ./<build_dir>/examples/dpdk-l3fwd-power -l 1-3 -- -p 0x0f --config="(0,0,2),(0,1,3)" -i 1
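The relationship between a frequency index and the resulting uncore frequency is simple arithmetic.
The sketch below only illustrates that arithmetic and does not touch sysfs:
the function name ``uncore_freq_mhz`` is hypothetical,
and the minimum and maximum values used in the example are made up rather than read from a real platform.

.. code-block:: c

    #include <stdint.h>

    /*
     * Translate a frequency index into a frequency in MHz.
     * Index 0 is the maximum frequency and each further index is 100 MHz lower.
     * Returns 0 on success, or -1 if the index falls below the minimum frequency.
     */
    static int
    uncore_freq_mhz(uint32_t max_mhz, uint32_t min_mhz, uint32_t index,
                    uint32_t *freq_mhz)
    {
        uint64_t offset = (uint64_t)index * 100;

        if (max_mhz < min_mhz || offset > (uint64_t)(max_mhz - min_mhz))
            return -1;

        *freq_mhz = max_mhz - (uint32_t)offset;
        return 0;
    }

Assuming a platform whose uncore range is 1200 MHz to 2400 MHz,
``-i 1`` as used in the command above would correspond to 2300 MHz, and index 12 would be the lowest valid index.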