..  BSD LICENSE
    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Packet Distributor Library
==========================

The DPDK Packet Distributor library is designed to be used for dynamic load balancing of traffic
while supporting single packet at a time operation.
When using this library, the logical cores in use are to be considered in two roles: firstly a distributor lcore,
which is responsible for load balancing or distributing packets,
and secondly a set of worker lcores, which are responsible for receiving the packets from the distributor and operating on them.
The model of operation is shown in the diagram below.

.. figure:: img/packet_distributor1.*

   Packet Distributor mode of operation

There are two modes of operation of the API in the distributor library:
one which sends one packet at a time to workers using 32 bits for flow_id,
and an optimized mode which sends bursts of up to 8 packets at a time to workers, using 15 bits of flow_id.
The mode is selected by the type argument of the ``rte_distributor_create()`` function.

Distributor Core Operation
--------------------------

The distributor core does the majority of the processing for ensuring that packets are fairly shared among workers.
The operation of the distributor is as follows:

#. Packets are passed to the distributor component by having the distributor lcore thread call the ``rte_distributor_process()`` API.

#. Each worker lcore shares a single cache line with the distributor core in order to pass messages and packets to and from the worker.
   The process API call will poll all the worker cache lines to see which workers are requesting packets.

#. As workers request packets, the distributor takes packets from the set of packets passed in and distributes them to the workers.
   As it does so, it examines the "tag" -- stored in the RSS hash field in the mbuf -- for each packet
   and records which tags are being processed by each worker.

#. If the next packet in the input set has a tag which is already being processed by a worker,
   then that packet will be queued up for processing by that worker
   and given to it in preference to other packets when that worker next makes a request for work.
   This ensures that no two packets with the same tag are processed in parallel,
   and that all packets with the same tag are processed in input order.

#. Once all input packets passed to the process API have either been distributed to workers
   or been queued up for a worker which is processing a given tag,
   then the process API returns to the caller.

Other functions which are available to the distributor lcore are:

* rte_distributor_returned_pkts()

* rte_distributor_flush()

* rte_distributor_clear_returns()

Of these the most important API call is ``rte_distributor_returned_pkts()``,
which should only be called on the lcore which also calls the process API.
It returns to the caller all packets which have finished being processed by the worker cores.
Within this set of returned packets, all packets sharing the same tag will be returned in their original order.

**NOTE:**
If worker lcores buffer up packets internally for transmission in bulk afterwards,
the packets sharing a tag will likely get out of order.
Once a worker lcore requests a new packet, the distributor assumes that it has completely finished with the previous packet and
therefore that additional packets with the same tag can safely be distributed to other workers --
who may then flush their buffered packets sooner and cause packets to get out of order.

**NOTE:**
No packet ordering guarantees are made about packets which do not share a common packet tag.

Using the process and returned_pkts API, the following application workflow can be used,
while allowing packet order within a packet flow -- identified by a tag -- to be maintained.

.. figure:: img/packet_distributor2.*

   Application workflow

The flush and clear_returns API calls, mentioned previously,
are likely of less use than the process and returned_pkts APIs, and are principally provided to aid in unit testing of the library.
Descriptions of these functions and their use can be found in the DPDK API Reference document.

Worker Operation
----------------

Worker cores are the cores which do the actual manipulation of the packets distributed by the packet distributor.
Each worker calls the ``rte_distributor_get_pkt()`` API to request a new packet when it has finished processing the previous one.
[The previous packet should be returned to the distributor component by passing it as the final parameter to this API call.]

Since it may be desirable to vary the number of worker cores depending on the traffic load,
for example to save power at times of lighter load,
it is possible to have a worker stop processing packets by calling ``rte_distributor_return_pkt()`` to indicate that
it has finished the current packet and does not want a new one.
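
The tag-affinity rule that ties the distributor and worker roles together can be illustrated with a small stand-alone model.
The sketch below is not the library implementation and does not call any DPDK API:
the names ``toy_distributor``, ``toy_init()``, ``toy_assign()`` and ``toy_complete()``, the fixed worker count and the use of tag 0 to mean "idle" are all assumptions made purely for illustration.
It only models the decision described in this chapter: a packet whose tag is already in flight is queued for the worker that owns that tag, and any other packet goes to an idle worker.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_WORKERS 4   /* assumption: small fixed worker count */
#define TOY_NO_TAG      0u  /* assumption: tag 0 is reserved to mean "idle" */

/* Hypothetical toy state, not the rte_distributor structure. */
struct toy_distributor {
    uint32_t tag[TOY_NUM_WORKERS];     /* tag owned by each worker */
    unsigned pending[TOY_NUM_WORKERS]; /* packets in flight or queued per worker */
};

/* Mark every worker as idle. */
void toy_init(struct toy_distributor *d)
{
    memset(d, 0, sizeof(*d));
}

/*
 * Choose a worker for a packet carrying the given tag.
 * Returns the worker index, or -1 if every worker is busy with another tag.
 */
int toy_assign(struct toy_distributor *d, uint32_t tag)
{
    int w;

    assert(tag != TOY_NO_TAG);

    /* A tag already being processed stays with the worker that owns it. */
    for (w = 0; w < TOY_NUM_WORKERS; w++) {
        if (d->pending[w] > 0 && d->tag[w] == tag) {
            d->pending[w]++;
            return w;
        }
    }

    /* Otherwise hand the packet to the first idle worker. */
    for (w = 0; w < TOY_NUM_WORKERS; w++) {
        if (d->pending[w] == 0) {
            d->tag[w] = tag;
            d->pending[w] = 1;
            return w;
        }
    }

    return -1;
}

/* Model a worker finishing one packet and asking for the next one. */
void toy_complete(struct toy_distributor *d, int w)
{
    assert(w >= 0 && w < TOY_NUM_WORKERS && d->pending[w] > 0);

    /* The worker gives up its tag only once its backlog is empty. */
    if (--d->pending[w] == 0)
        d->tag[w] = TOY_NO_TAG;
}
```

In this model, assigning tags 7, 9 and 7 in that order places both tag-7 packets on the same worker while tag 9 goes to a different one,
so packets of one flow are never handled by two workers at once.
A real application would instead rely on the process, returned_pkts and get_pkt APIs described above.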