..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

Distributor Sample Application
==============================

The distributor sample application is a simple example of packet distribution
to cores using the Data Plane Development Kit (DPDK). It also makes use of
Intel Speed Select Technology - Base Frequency (Intel SST-BF) to pin the
distributor to the higher frequency core if available.

Overview
--------

The distributor application distributes packets that are received on an RX
port across different cores. After a core has processed a packet, the packet
is sent out on the port that is adjacent, in the enabled port mask, to the
port on which it was received. For example, if the first four ports are
enabled (port mask 0xf), ports 0 and 1 RX/TX into each other, and ports 2 and
3 RX/TX into each other.

This application can be used to benchmark performance using a traffic
generator, as shown in the figure below.

.. _figure_dist_perf:

.. figure:: img/dist_perf.*

   Performance Benchmarking Setup (Basic Environment)

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``distributor`` sub-directory.

Running the Application
-----------------------

#. The application has a number of command line options:

   ..  code-block:: console

       ./<build-dir>/examples/dpdk-distributor [EAL options] -- -p PORTMASK [-c]

   where,

   *   -p PORTMASK: Hexadecimal bitmask of ports to configure
   *   -c: Combines the RX core with the distribution core

#. To run the application in a Linux environment with 10 lcores and 4 ports,
   issue the command:

   ..  code-block:: console

       $ ./<build-dir>/examples/dpdk-distributor -l 1-9,22 -n 4 -- -p f

#. Refer to the DPDK Getting Started Guide for general information on running
   applications and the Environment Abstraction Layer (EAL) options.
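
As a rough illustration of the ``-p PORTMASK`` option, the following sketch
parses a hexadecimal bitmask and tests which ports it enables. It is not taken
from the application source: ``sketch_parse_portmask()`` and
``sketch_port_enabled()`` are hypothetical names, and the 32-bit mask width is
an assumption made for the example.

.. code-block:: c

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>

    /*
     * Hypothetical helper: parse a hexadecimal port mask such as "f".
     * Returns 0 and stores the mask on success, or -1 on invalid input.
     */
    static int
    sketch_parse_portmask(const char *arg, uint32_t *mask)
    {
        char *end = NULL;
        unsigned long value;

        if (arg == NULL || *arg == '\0')
            return -1;

        errno = 0;
        value = strtoul(arg, &end, 16);
        /* Reject trailing garbage, an empty mask and out-of-range values. */
        if (errno != 0 || *end != '\0' || value == 0 || value > UINT32_MAX)
            return -1;

        *mask = (uint32_t)value;
        return 0;
    }

    /* Hypothetical helper: non-zero if bit 'port' is set in the mask. */
    static int
    sketch_port_enabled(uint32_t mask, unsigned int port)
    {
        return port < 32 && ((mask >> port) & 1u);
    }

With the mask ``f`` from the example command, this sketch reports ports 0 to 3
as enabled and port 4 as disabled.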

Explanation
-----------

The distributor application consists of four types of threads: a receive
thread (``lcore_rx()``), a distributor thread (``lcore_dist()``), a set of
worker threads (``lcore_worker()``), and a transmit thread (``lcore_tx()``).
How these threads work together is shown in :numref:`figure_dist_app` below.
The ``main()`` function launches threads of these four types. Each thread
runs a processing loop that terminates only upon SIGINT (Ctrl+C).

The receive thread receives the packets using ``rte_eth_rx_burst()`` and
enqueues them to an ``rte_ring``. The distributor thread dequeues the packets
from the ring and assigns them to workers using the
``rte_distributor_process()`` API.
This assignment is based on the tag (or flow ID) of the packet, indicated by
the hash field in the mbuf. For IP traffic, this field is automatically filled
by the NIC with the "usr" hash value for the packet, which works as a per-flow
tag. The distributor thread communicates with the worker threads using a
cache-line swapping mechanism, passing up to 8 mbuf pointers at a time
(one cache line) to each worker.
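
The idea of tag-based assignment can be pictured with the toy model below.
It is not the ``rte_distributor`` implementation: it is a single-threaded
sketch that assumes non-zero tags and at most one in-flight tag per worker,
and every ``sketch_`` name is hypothetical. A flow that a worker is already
handling stays with that worker, and a new flow goes to the next idle worker.

.. code-block:: c

    #include <stdint.h>

    #define SKETCH_NUM_WORKERS 4
    #define SKETCH_NO_TAG 0 /* assumption: tag 0 marks an idle worker */

    /* Toy distributor state: the tag each worker is currently handling. */
    struct sketch_dist {
        uint32_t in_flight[SKETCH_NUM_WORKERS];
        unsigned int next; /* round-robin cursor used for new flows */
    };

    /*
     * Pick a worker for a packet with the given non-zero tag.
     * Returns the worker index, or -1 if every worker is busy with
     * a different flow.
     */
    static int
    sketch_pick_worker(struct sketch_dist *d, uint32_t tag)
    {
        unsigned int i;

        /* A worker already handling this flow keeps it. */
        for (i = 0; i < SKETCH_NUM_WORKERS; i++)
            if (d->in_flight[i] == tag)
                return (int)i;

        /* Otherwise hand the new flow to the next idle worker. */
        for (i = 0; i < SKETCH_NUM_WORKERS; i++) {
            unsigned int w = (d->next + i) % SKETCH_NUM_WORKERS;

            if (d->in_flight[w] == SKETCH_NO_TAG) {
                d->in_flight[w] = tag;
                d->next = (w + 1) % SKETCH_NUM_WORKERS;
                return (int)w;
            }
        }
        return -1;
    }

Keeping a flow on one worker while it is in flight is what allows a per-flow
tag to preserve packet order within that flow in this model.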

More than one worker thread can exist as part of the application. Each worker
thread does simple packet processing: it requests packets from the
distributor, performs a simple XOR operation on the input port mbuf field
(to indicate the output port which will be used later for packet transmission),
and finally returns the packets to the distributor thread.
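
The XOR step can be sketched as follows, assuming the port number is toggled
with the value 1 so that ports pair up as 0/1 and 2/3, as described in the
Overview. The function name is hypothetical, and the sketch works on a plain
integer rather than on an mbuf.

.. code-block:: c

    #include <stdint.h>

    /*
     * Hypothetical helper: map an input port to its adjacent output port
     * by flipping the lowest bit (0 <-> 1, 2 <-> 3, ...).
     */
    static uint16_t
    sketch_output_port(uint16_t input_port)
    {
        return (uint16_t)(input_port ^ 1u);
    }

Applying the function twice returns the original port, which is why traffic
between two paired ports flows in both directions.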

The distributor thread then calls the distributor API
``rte_distributor_returned_pkts()`` to get the processed packets, and enqueues
them to another ``rte_ring`` for transfer to the TX thread for transmission on
the output port. The transmit thread dequeues the packets from the ring and
transmits them on the output port specified in the packet mbuf.
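
The hand-off between two stages through a ring can be pictured with the toy
ring buffer below. It is only an illustration of the enqueue/dequeue pattern:
it is single-threaded, has no memory ordering, and is not the ``rte_ring``
API. All ``sketch_`` names and the ring size are assumptions.

.. code-block:: c

    #include <stddef.h>

    #define SKETCH_RING_SIZE 8 /* power of two, so index masking works */

    struct sketch_ring {
        void *slots[SKETCH_RING_SIZE];
        unsigned int head; /* count of objects ever enqueued */
        unsigned int tail; /* count of objects ever dequeued */
    };

    /* Producer side: returns 0 on success, -1 if the ring is full. */
    static int
    sketch_ring_enqueue(struct sketch_ring *r, void *obj)
    {
        if (r->head - r->tail == SKETCH_RING_SIZE)
            return -1;
        r->slots[r->head & (SKETCH_RING_SIZE - 1)] = obj;
        r->head++;
        return 0;
    }

    /* Consumer side: returns the oldest object, or NULL if empty. */
    static void *
    sketch_ring_dequeue(struct sketch_ring *r)
    {
        void *obj;

        if (r->head == r->tail)
            return NULL;
        obj = r->slots[r->tail & (SKETCH_RING_SIZE - 1)];
        r->tail++;
        return obj;
    }

In this picture, one stage plays the producer and the next stage plays the
consumer, and objects leave the ring in the order in which they entered it.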

To terminate the application, press Ctrl+C (or send SIGINT to the
application). Upon this signal, a signal handler provided in the application
terminates all running threads gracefully and prints final statistics to the
user.
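
A common way to implement this kind of shutdown in C is a flag that the
signal handler sets and that each loop polls. The sketch below shows that
pattern with standard C only. The names are hypothetical, and the iteration
limit exists only so that the example terminates on its own.

.. code-block:: c

    #include <signal.h>

    /* Set by the handler, polled by the processing loops. */
    static volatile sig_atomic_t sketch_quit;

    static void
    sketch_int_handler(int sig)
    {
        (void)sig;
        sketch_quit = 1;
    }

    /* Returns 0 if the SIGINT handler was installed, -1 otherwise. */
    static int
    sketch_install_handler(void)
    {
        return signal(SIGINT, sketch_int_handler) == SIG_ERR ? -1 : 0;
    }

    /*
     * Stand-in for a thread's main loop: runs until the quit flag is set
     * or max_iterations is reached, and returns the iterations done.
     */
    static unsigned long
    sketch_thread_loop(unsigned long max_iterations)
    {
        unsigned long n = 0;

        while (!sketch_quit && n < max_iterations)
            n++; /* packet processing would happen here */
        return n;
    }

Because every loop checks the same flag, each one can finish its current
iteration and exit cleanly before final statistics are printed.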

.. _figure_dist_app:

.. figure:: img/dist_app.*

   Distributor Sample Application Layout


Intel SST-BF Support
--------------------

In DPDK 19.05, support was added to the power management library for
Intel SST-BF, a technology that allows some cores to run at a higher
frequency than others. An application note for Intel SST-BF is available,
entitled
`Intel Speed Select Technology – Base Frequency - Enhancing Performance <https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf>`_.

The distributor application was also enhanced to be aware of these higher
frequency SST-BF cores. When the application starts, if high frequency
SST-BF cores are present in the core mask, the application identifies these
cores and pins the workloads appropriately. The distributor core is usually
the bottleneck, so it is given first choice of the high frequency SST-BF
cores, followed by the RX core and the TX core.
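
The priority order described above can be sketched as a small role-assignment
routine. This is an illustrative model only, not the application's core
selection code: the ``sketch_`` names are hypothetical, and the high-frequency
flags are passed in as plain booleans instead of being queried from the power
management library.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    enum sketch_role {
        SKETCH_ROLE_NONE,
        SKETCH_ROLE_DIST,
        SKETCH_ROLE_RX,
        SKETCH_ROLE_TX,
        SKETCH_ROLE_WORKER
    };

    /*
     * Assign one role per core. High-frequency cores are offered the
     * distributor, RX and TX roles in that order; leftover pinned roles
     * go to any free core, and the remaining cores become workers.
     */
    static void
    sketch_assign_roles(const bool *high_freq, enum sketch_role *roles,
                        size_t n)
    {
        static const enum sketch_role wanted[] = {
            SKETCH_ROLE_DIST, SKETCH_ROLE_RX, SKETCH_ROLE_TX
        };
        const size_t n_wanted = sizeof(wanted) / sizeof(wanted[0]);
        size_t next = 0;
        size_t i;

        for (i = 0; i < n; i++)
            roles[i] = SKETCH_ROLE_NONE;

        /* Pass 1: high-frequency cores take roles in priority order. */
        for (i = 0; i < n && next < n_wanted; i++)
            if (high_freq[i])
                roles[i] = wanted[next++];

        /* Pass 2: roles still unfilled go to any unassigned core. */
        for (i = 0; i < n && next < n_wanted; i++)
            if (roles[i] == SKETCH_ROLE_NONE)
                roles[i] = wanted[next++];

        /* Pass 3: every remaining core is a worker. */
        for (i = 0; i < n; i++)
            if (roles[i] == SKETCH_ROLE_NONE)
                roles[i] = SKETCH_ROLE_WORKER;
    }

For example, with five cores of which only the second and fourth are high
frequency, this model gives the distributor role to the second core, RX to the
fourth, TX to the first, and makes the other two workers.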

Debug Logging Support
---------------------

Debug logging is provided as part of the application. To enable debug logs,
uncomment the line ``#define DEBUG`` near the start of the application in
``main.c``.
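
The general pattern of switching debug output on with a compile-time define
can be sketched as follows. The macro and function names here are
hypothetical, and the sketch uses ``fprintf()`` instead of the DPDK logging
macros, so it should not be read as the contents of ``main.c``.

.. code-block:: c

    #include <stdio.h>

    /* Uncomment the next line to enable debug output in this sketch. */
    /* #define SKETCH_DEBUG */

    #ifdef SKETCH_DEBUG
    #define SKETCH_LOG_DEBUG(...) fprintf(stderr, __VA_ARGS__)
    #else
    #define SKETCH_LOG_DEBUG(...) do { } while (0)
    #endif

    /* Stand-in for a processing step that logs only in debug builds. */
    static int
    sketch_process(int value)
    {
        SKETCH_LOG_DEBUG("processing value %d\n", value);
        return value + 1;
    }

When the define is left commented out, the logging macro expands to an empty
statement, so the debug calls cost nothing at run time.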

Statistics
----------

The main function prints statistics on the console every second. These
statistics include the number of packets enqueued and dequeued at each stage
in the application, and also key statistics per worker, including how many
packets of each burst size (1-8) were sent to each worker thread.
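
A per-worker burst-size histogram of the kind described above could be kept
with a structure like the one below. This is a minimal sketch with
hypothetical names, not the application's statistics code. It assumes bursts
of 1 to 8 packets, matching the range given in the text.

.. code-block:: c

    #define SKETCH_MAX_BURST 8

    /* Counters for one worker. */
    struct sketch_worker_stats {
        unsigned long bursts[SKETCH_MAX_BURST]; /* [i] counts bursts of i+1 */
        unsigned long packets;                  /* total packets received */
    };

    /* Record one burst of n packets; sizes outside 1..8 are ignored. */
    static void
    sketch_record_burst(struct sketch_worker_stats *s, unsigned int n)
    {
        if (n == 0 || n > SKETCH_MAX_BURST)
            return;
        s->bursts[n - 1]++;
        s->packets += n;
    }

A histogram skewed toward the larger burst sizes would suggest that workers
are receiving mostly full cache lines of mbuf pointers.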

Application Initialization
--------------------------

Command line parsing is done in the same way as it is done in the L2 Forwarding Sample
Application. See :ref:`l2_fwd_app_cmd_arguments`.

Mbuf pool initialization is done in the same way as it is done in the L2 Forwarding
Sample Application. See :ref:`l2_fwd_app_mbuf_init`.

Driver initialization is done in the same way as it is done in the L2 Forwarding Sample
Application. See :ref:`l2_fwd_app_dvr_init`.

RX queue initialization is done in the same way as it is done in the L2 Forwarding
Sample Application. See :ref:`l2_fwd_app_rx_init`.

TX queue initialization is done in the same way as it is done in the L2 Forwarding
Sample Application. See :ref:`l2_fwd_app_tx_init`.