..  BSD LICENSE
    Copyright(c) 2017 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Generic Receive Offload Library
===============================

Generic Receive Offload (GRO) is a widely used software-based offloading
technique that reduces per-packet processing overhead by reassembling
small packets into larger ones. To give applications more flexibility,
DPDK implements GRO as a standalone library, and applications call the
library explicitly to merge packets.

The GRO library assumes that all input packets have correct checksums,
and it doesn't re-calculate checksums for merged packets. If input
packets are IP fragmented, the GRO library assumes they are complete
packets (i.e. with L4 headers).

Currently, the GRO library implements TCP/IPv4 packet reassembly.

Reassembly Modes
----------------

The GRO library provides two reassembly modes: lightweight mode and
heavyweight mode. Applications that want to merge packets in a simple way
can use the lightweight mode API. Applications that need more
fine-grained control can use the heavyweight mode API.

Lightweight Mode
~~~~~~~~~~~~~~~~

The ``rte_gro_reassemble_burst()`` function is used for reassembly in
lightweight mode. It tries to merge N input packets at a time, where
N should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.

In each invocation, ``rte_gro_reassemble_burst()`` allocates temporary
reassembly tables for the desired GRO types. A reassembly table is the
table structure used to reassemble packets, and different GRO types
(e.g. TCP/IPv4 GRO and TCP/IPv6 GRO) have different reassembly table
structures. The ``rte_gro_reassemble_burst()`` function uses the
reassembly tables to merge the N input packets.

For applications, performing GRO in lightweight mode is simple: they
only need to invoke ``rte_gro_reassemble_burst()``. The GROed packets
are available as soon as ``rte_gro_reassemble_burst()`` returns.

Heavyweight Mode
~~~~~~~~~~~~~~~~

The ``rte_gro_reassemble()`` function is used for reassembly in heavyweight
mode. Compared with the lightweight mode, performing GRO in heavyweight
mode takes more steps.

Before performing GRO, applications need to create a GRO context object
by calling ``rte_gro_ctx_create()``. A GRO context object holds the
reassembly tables of the desired GRO types. Note that update/lookup
operations on the context object are not thread safe. If different
processes or threads access the same context object simultaneously,
an external synchronization mechanism must be used.

Once the GRO context is created, applications can use the
``rte_gro_reassemble()`` function to merge packets. In each invocation,
``rte_gro_reassemble()`` tries to merge input packets with the packets
in the reassembly tables. If an input packet is of an unsupported GRO
type, or cannot be processed for another reason (e.g. the SYN bit is
set), ``rte_gro_reassemble()`` returns the packet to the application.
Otherwise, the input packet is either merged or inserted into a
reassembly table.

Packets stay in the reassembly tables until they are flushed. To get
GRO processed packets, applications need to call
``rte_gro_timeout_flush()`` to flush them from the tables manually.

TCP/IPv4 GRO
------------

TCP/IPv4 GRO supports merging small TCP/IPv4 packets into large ones,
using a table structure called the TCP/IPv4 reassembly table.

TCP/IPv4 Reassembly Table
~~~~~~~~~~~~~~~~~~~~~~~~~

A TCP/IPv4 reassembly table includes a "key" array and an "item" array.
The key array keeps the criteria to merge packets and the item array
keeps the packet information.

Each key in the key array points to an item group, which consists of
packets that have the same criteria values but can't be merged with
each other. A key in the key array includes two parts:

* ``criteria``: the criteria to merge packets. If two packets can be
  merged, they must have the same criteria values.

* ``start_index``: the item array index of the first packet in the item
  group.

Each element in the item array keeps the information of a packet. An item
in the item array mainly includes three parts:

* ``firstseg``: the mbuf address of the first segment of the packet.

* ``lastseg``: the mbuf address of the last segment of the packet.

* ``next_pkt_index``: the item array index of the next packet in the same
  item group. TCP/IPv4 GRO uses ``next_pkt_index`` to chain the packets
  that have the same criteria values but can't be merged together.

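The layout described above can be sketched in plain C. This is an
illustrative model only, not the library's actual definitions: the
``criteria`` fields, the ``in_use`` flag, the ``INVALID_INDEX`` marker,
the array sizes, the ``seg`` stand-in for an mbuf, and the helper names
are all assumptions made for the example. Only ``criteria``,
``start_index``, ``firstseg``, ``lastseg`` and ``next_pkt_index`` come
from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_INDEX UINT32_MAX /* hypothetical end-of-chain marker */
#define MAX_KEYS  4              /* hypothetical table sizes */
#define MAX_ITEMS 8

/* Stand-in for a packet segment; not the real struct rte_mbuf. */
struct seg {
    struct seg *next;
};

/* Hypothetical merge criteria: a simplified flow identifier. */
struct criteria {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

struct key {
    struct criteria criteria; /* values shared by every packet in the group */
    uint32_t start_index;     /* item array index of the group's first packet */
    int in_use;               /* hypothetical: marks an occupied key slot */
};

struct item {
    struct seg *firstseg;     /* first segment of the packet */
    struct seg *lastseg;      /* last segment of the packet */
    uint32_t next_pkt_index;  /* next packet in the same item group */
};

struct table {
    struct key keys[MAX_KEYS];
    struct item items[MAX_ITEMS];
};

/* Count the packets chained in the item group that a key points to. */
static uint32_t group_size(const struct table *tbl, const struct key *k)
{
    uint32_t n = 0;
    uint32_t i;

    assert(k->in_use);
    for (i = k->start_index; i != INVALID_INDEX;
         i = tbl->items[i].next_pkt_index)
        n++;
    return n;
}

/* Build a table with one key whose group chains items 0 -> 3 -> 5. */
static uint32_t demo_group_size(void)
{
    static struct seg segs[3];
    static const uint32_t chain[3] = { 0, 3, 5 };
    struct table tbl = { 0 };
    int i;

    tbl.keys[0].criteria =
        (struct criteria){ 0x0a000001, 0x0a000002, 1234, 80 };
    tbl.keys[0].start_index = chain[0];
    tbl.keys[0].in_use = 1;
    for (i = 0; i < 3; i++) {
        struct item *it = &tbl.items[chain[i]];

        it->firstseg = &segs[i];
        it->lastseg = &segs[i]; /* single-segment packet */
        it->next_pkt_index = (i < 2) ? chain[i + 1] : INVALID_INDEX;
    }
    return group_size(&tbl, &tbl.keys[0]);
}
```

Items of one group do not have to be adjacent in the item array; the
chain through ``next_pkt_index`` is what defines the group.
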
Procedure to Reassemble a Packet
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reassembling an incoming packet takes three steps:

#. Check if the packet should be processed. Packets with one of the
   following properties aren't processed and are returned immediately:

   * FIN, SYN, RST, URG, PSH, ECE or CWR bit is set.

   * L4 payload length is 0.

#. Traverse the key array to find a key which has the same criteria
   values as the incoming packet. If found, go to the next step.
   Otherwise, insert a new key and a new item for the packet.

#. Locate the first packet in the item group via ``start_index``. Then
   traverse all packets in the item group via ``next_pkt_index``. If a
   packet is found which can be merged with the incoming one, merge them
   together. If one isn't found, insert the packet into this item group.
   Note that merging two packets means linking them together via the
   mbuf's ``next`` field.

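The three steps above can be sketched as a small self-contained C model.
It is a simplified illustration under stated assumptions, not the
library's code: a single ``flow`` value stands in for the real criteria,
two packets are treated as mergeable when their TCP sequence ranges are
adjacent, a merge is recorded by growing the item's length instead of
linking mbufs via ``next``, and a packet that finds no free slot is
handed back unprocessed. All type, macro and function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define INVALID_INDEX UINT32_MAX /* hypothetical end-of-chain marker */
#define MAX_KEYS  4              /* hypothetical table sizes */
#define MAX_ITEMS 8

/* TCP flag bits as laid out in the TCP header. */
#define TCP_FIN 0x01
#define TCP_SYN 0x02
#define TCP_RST 0x04
#define TCP_PSH 0x08
#define TCP_ACK 0x10
#define TCP_URG 0x20
#define TCP_ECE 0x40
#define TCP_CWR 0x80

/* Hypothetical pre-parsed view of an incoming TCP/IPv4 packet. */
struct pkt {
    uint32_t flow;        /* stand-in for the real criteria values */
    uint32_t seq;         /* TCP sequence number */
    uint16_t payload_len; /* L4 payload length */
    uint8_t tcp_flags;
};

struct key { uint32_t flow; uint32_t start_index; int in_use; };
struct item { uint32_t seq; uint32_t len; uint32_t next_pkt_index; int in_use; };
struct table { struct key keys[MAX_KEYS]; struct item items[MAX_ITEMS]; };

enum result { NOT_PROCESSED, MERGED, INSERTED };

/* Store the packet in a free item slot; INVALID_INDEX if the array is full. */
static uint32_t alloc_item(struct table *t, const struct pkt *p)
{
    uint32_t i;

    for (i = 0; i < MAX_ITEMS; i++) {
        if (!t->items[i].in_use) {
            t->items[i] =
                (struct item){ p->seq, p->payload_len, INVALID_INDEX, 1 };
            return i;
        }
    }
    return INVALID_INDEX;
}

static enum result reassemble(struct table *t, const struct pkt *p)
{
    const uint8_t skip = TCP_FIN | TCP_SYN | TCP_RST | TCP_URG |
                         TCP_PSH | TCP_ECE | TCP_CWR;
    struct key *k = NULL, *free_key = NULL;
    uint32_t i, idx, last = INVALID_INDEX;

    /* Step 1: hand back packets that must not be processed. */
    if ((p->tcp_flags & skip) || p->payload_len == 0)
        return NOT_PROCESSED;

    /* Step 2: look for a key with the same criteria value. */
    for (i = 0; i < MAX_KEYS; i++) {
        if (t->keys[i].in_use && t->keys[i].flow == p->flow) {
            k = &t->keys[i];
            break;
        }
        if (!t->keys[i].in_use && free_key == NULL)
            free_key = &t->keys[i];
    }
    if (k == NULL) { /* no match: insert a new key and a new item */
        if (free_key == NULL || (idx = alloc_item(t, p)) == INVALID_INDEX)
            return NOT_PROCESSED; /* assumption: table full */
        *free_key = (struct key){ p->flow, idx, 1 };
        return INSERTED;
    }

    /* Step 3: walk the item group looking for an adjacent packet. */
    for (i = k->start_index; i != INVALID_INDEX;
         i = t->items[i].next_pkt_index) {
        struct item *it = &t->items[i];

        if (p->seq == it->seq + it->len) { /* follows the stored packet */
            it->len += p->payload_len;
            return MERGED;
        }
        if (p->seq + p->payload_len == it->seq) { /* precedes it */
            it->seq = p->seq;
            it->len += p->payload_len;
            return MERGED;
        }
        last = i;
    }
    if ((idx = alloc_item(t, p)) == INVALID_INDEX)
        return NOT_PROCESSED; /* assumption: table full */
    t->items[last].next_pkt_index = idx; /* append to this item group */
    return INSERTED;
}
```

In this model, feeding two back-to-back segments of one flow inserts the
first and merges the second, while a segment of the same flow with a
non-adjacent sequence number is chained into the same item group as a
separate item.
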
When packets are flushed from the reassembly table, TCP/IPv4 GRO updates
the packet header fields of the merged packets. Note that TCP/IPv4 GRO
doesn't check whether the checksums of packets are correct before
reassembling them, and it doesn't re-calculate checksums for merged
packets.