..  BSD LICENSE
    Copyright(c) 2017 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in
      the documentation and/or other materials provided with the
      distribution.
    * Neither the name of Intel Corporation nor the names of its
      contributors may be used to endorse or promote products derived
      from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Generic Receive Offload Library
===============================

Generic Receive Offload (GRO) is a widely used software-based offloading
technique that reduces per-packet processing overhead. It gains performance
by reassembling small packets into large ones. To give applications more
flexibility, DPDK implements GRO as a standalone library. Applications
explicitly use the GRO library to merge small packets into large ones.

The GRO library assumes all input packets have correct checksums. In
addition, the GRO library doesn't re-calculate checksums for merged
packets. If input packets are IP fragmented, the GRO library assumes
they are complete packets (i.e. with L4 headers).

Currently, the GRO library implements TCP/IPv4 packet reassembly.

Reassembly Modes
----------------

The GRO library provides two reassembly modes: lightweight and
heavyweight mode.
If applications want to merge packets in a simple way,
they can use the lightweight mode API. If applications want
finer-grained control, they can choose the heavyweight mode API.

Lightweight Mode
~~~~~~~~~~~~~~~~

The ``rte_gro_reassemble_burst()`` function is used for reassembly in
lightweight mode. It tries to merge N input packets at a time, where
N should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.

In each invocation, ``rte_gro_reassemble_burst()`` allocates temporary
reassembly tables for the desired GRO types. A reassembly table is the
table structure used to reassemble packets, and different GRO types
(e.g. TCP/IPv4 GRO and TCP/IPv6 GRO) have different reassembly table
structures. The ``rte_gro_reassemble_burst()`` function uses the reassembly
tables to merge the N input packets.

For applications, performing GRO in lightweight mode is simple. They
just need to invoke ``rte_gro_reassemble_burst()``. Applications can get
GROed packets as soon as ``rte_gro_reassemble_burst()`` returns.

Heavyweight Mode
~~~~~~~~~~~~~~~~

The ``rte_gro_reassemble()`` function is used for reassembly in heavyweight
mode. Compared with lightweight mode, performing GRO in heavyweight mode
is relatively complicated.

Before performing GRO, applications need to create a GRO context object
by calling ``rte_gro_ctx_create()``.
A GRO context object holds the
reassembly tables of the desired GRO types. Note that update and lookup
operations on the context object are not thread safe, so if different
processes or threads want to access the same context object simultaneously,
an external synchronization mechanism must be used.

Once the GRO context is created, applications can then use the
``rte_gro_reassemble()`` function to merge packets. In each invocation,
``rte_gro_reassemble()`` tries to merge input packets with the packets
in the reassembly tables. If an input packet is of an unsupported GRO type,
or cannot be processed for another reason (e.g. the SYN bit is set),
``rte_gro_reassemble()`` returns the packet to the application. Otherwise,
the input packet is either merged or inserted into a reassembly table.

When applications want to get GRO processed packets, they need to use
``rte_gro_timeout_flush()`` to flush them from the tables manually.

TCP/IPv4 GRO
------------

TCP/IPv4 GRO supports merging small TCP/IPv4 packets into large ones,
using a table structure called the TCP/IPv4 reassembly table.

TCP/IPv4 Reassembly Table
~~~~~~~~~~~~~~~~~~~~~~~~~

A TCP/IPv4 reassembly table includes a "key" array and an "item" array.
The key array keeps the criteria to merge packets and the item array
keeps the packet information.

Each key in the key array points to an item group, which consists of
packets that have the same criteria values but can't be merged. A key
in the key array includes two parts:

* ``criteria``: the criteria to merge packets. If two packets can be
  merged, they must have the same criteria values.

* ``start_index``: the item array index of the first packet in the item
  group.

Each element in the item array keeps the information of a packet. An item
in the item array mainly includes three parts:

* ``firstseg``: the mbuf address of the first segment of the packet.

* ``lastseg``: the mbuf address of the last segment of the packet.

* ``next_pkt_index``: the item array index of the next packet in the same
  item group. TCP/IPv4 GRO uses ``next_pkt_index`` to chain the packets
  that have the same criteria values but can't be merged together.

Procedure to Reassemble a Packet
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reassembling an incoming packet takes three steps:

#. Check if the packet should be processed. Packets with one of the
   following properties aren't processed and are returned immediately:

   * FIN, SYN, RST, URG, PSH, ECE or CWR bit is set.

   * L4 payload length is 0.

#. Traverse the key array to find a key which has the same criteria
   values as the incoming packet. If found, go to the next step.
   Otherwise, insert a new key and a new item for the packet.

#. Locate the first packet in the item group via ``start_index``. Then
   traverse all packets in the item group via ``next_pkt_index``. If a
   packet is found which can be merged with the incoming one, merge them
   together. If one isn't found, insert the packet into this item group.
   Note that merging two packets means linking them together via the
   mbuf ``next`` field.

When packets are flushed from the reassembly table, TCP/IPv4 GRO updates
packet header fields for the merged packets. Note that before reassembling
a packet, TCP/IPv4 GRO doesn't check whether its checksums are correct.
Also, TCP/IPv4 GRO doesn't re-calculate checksums for merged packets.
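
The key array, the item array and the three-step procedure above can be
modelled in a few lines of plain C. The following is only a minimal sketch
under stated assumptions, not the library's implementation: every name in it
(``sketch_key``, ``sketch_item``, ``sketch_tbl``, ``sketch_reassemble()``) is
hypothetical, the criteria are reduced to a single integer, the ``firstseg``
and ``lastseg`` mbuf pointers are replaced by a sequence number and a payload
length, and two packets are assumed to be mergeable only when the incoming
one starts exactly where a stored one ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_KEYS  4
#define SKETCH_MAX_ITEMS 8
#define SKETCH_INVALID   UINT32_MAX

/* Hypothetical, simplified key: the criteria are reduced to one integer. */
struct sketch_key {
    uint32_t criteria;     /* stand-in for the real merge criteria */
    uint32_t start_index;  /* item array index of the first packet in the group */
};

/* Hypothetical, simplified item: seq/len replace the mbuf pointers. */
struct sketch_item {
    uint32_t seq;            /* sequence number of the first payload byte */
    uint32_t len;            /* payload length accumulated so far */
    uint32_t next_pkt_index; /* next item in the same group, or SKETCH_INVALID */
};

struct sketch_tbl {
    struct sketch_key keys[SKETCH_MAX_KEYS];
    struct sketch_item items[SKETCH_MAX_ITEMS];
    uint32_t key_num;
    uint32_t item_num;
};

/*
 * Returns 1 if the packet was merged into a stored packet, 0 if it was
 * stored as a new item, and -1 if it is not processed or the table is full.
 */
static int
sketch_reassemble(struct sketch_tbl *tbl, uint32_t criteria,
                  uint32_t seq, uint32_t len, int has_ctrl_flags)
{
    uint32_t k, cur, last, n;

    assert(tbl != NULL);

    /* Step 1: skip packets with control flags set or with no payload. */
    if (has_ctrl_flags || len == 0)
        return -1;

    /* Step 2: traverse the key array for a key with the same criteria. */
    for (k = 0; k < tbl->key_num; k++)
        if (tbl->keys[k].criteria == criteria)
            break;

    if (k == tbl->key_num) {
        /* No matching key: insert a new key and a new item. */
        if (tbl->key_num == SKETCH_MAX_KEYS ||
            tbl->item_num == SKETCH_MAX_ITEMS)
            return -1;
        n = tbl->item_num++;
        tbl->items[n] = (struct sketch_item){ seq, len, SKETCH_INVALID };
        tbl->keys[tbl->key_num++] = (struct sketch_key){ criteria, n };
        return 0;
    }

    /* Step 3: walk the item group, starting at start_index. */
    cur = tbl->keys[k].start_index;
    last = cur;
    while (cur != SKETCH_INVALID) {
        struct sketch_item *it = &tbl->items[cur];

        /* Assumed merge rule: the new payload directly follows this one. */
        if (it->seq + it->len == seq) {
            it->len += len;
            return 1;
        }
        last = cur;
        cur = it->next_pkt_index;
    }

    /* Nothing mergeable: chain a new item at the end of the group. */
    if (tbl->item_num == SKETCH_MAX_ITEMS)
        return -1;
    n = tbl->item_num++;
    tbl->items[n] = (struct sketch_item){ seq, len, SKETCH_INVALID };
    tbl->items[last].next_pkt_index = n;
    return 0;
}
```

In this model, feeding payloads at sequence numbers 1000 and 1100 for the
same criteria value yields one item covering both, while a later payload at
5000 for the same criteria value becomes a second item chained through
``next_pkt_index``. Merging here only extends a length counter; with real
mbufs it would instead link the segments through the ``next`` field as
described in step three.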