..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2017 Intel Corporation.

Generic Segmentation Offload (GSO) Library
==========================================

Overview
--------
Generic Segmentation Offload (GSO) is a widely used software implementation of
TCP Segmentation Offload (TSO). Much like TSO, GSO gains performance by
enabling upper layer applications to process a smaller number of large packets
(e.g. MTU size of 64KB), instead of a larger number of small packets (e.g. MTU
size of 1500B), thus reducing per-packet processing overhead.

For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
that far exceed the kernel interface's MTU; this eliminates the need to segment
packets within the guest, and improves the data-to-overhead ratio of both the
guest-host link and the PCI bus. The expectation of the guest network stack in
this scenario is that segmentation of egress frames will take place either in
the NIC hardware or, where that hardware capability is unavailable, in the host
application or network stack.

Bearing that in mind, the GSO library enables DPDK applications to segment
packets in software. Note, however, that GSO is implemented as a standalone
library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
in the underlying hardware); that is, applications must explicitly invoke the
GSO library to segment packets. They must also call ``rte_pktmbuf_free()``
to free the mbufs to which the GSO segments are attached after calling
``rte_gso_segment()``.
The size of GSO segments (``segsz``) is configurable by the application.

Limitations
-----------

#. The GSO library doesn't check if input packets have correct checksums.

#. In addition, the GSO library doesn't re-calculate checksums for segmented
   packets (that task is left to the application).

#. IP fragments are unsupported by the GSO library.

#. The egress interface's driver must support multi-segment packets.

#. Currently, the GSO library supports the following IPv4 packet types:

 - TCP
 - UDP
 - VXLAN
 - GRE TCP

  See `Supported GSO Packet Types`_ for further details.

Packet Segmentation
-------------------

The ``rte_gso_segment()`` function is the GSO library's primary
segmentation API.

Before performing segmentation, an application must create a GSO context object
(``struct rte_gso_ctx``), which provides the library with some of the
information required to understand how the packet should be segmented. Refer to
`How to Segment a Packet`_ for additional details. Once the GSO context
has been created and populated, the application can then use the
``rte_gso_segment()`` function to segment packets.

The GSO library typically stores each segment that it creates in two parts: the
first part contains a copy of the original packet's headers, while the second
part contains a pointer to an offset within the original packet. This mechanism
is explained in more detail in `GSO Output Segment Format`_.

The GSO library supports both single- and multi-segment input mbufs.

GSO Output Segment Format
~~~~~~~~~~~~~~~~~~~~~~~~~
To reduce the number of expensive memcpy operations required when segmenting a
packet, the GSO library typically stores each segment that it creates as a
two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
the elements produced by the API are also called 'segments', for clarity the
term 'part' is used here instead).

The first part of each output segment is a direct mbuf. It contains a copy of
the original packet's headers, which must be prepended to each output segment.

The second part of each output segment represents a section of data from the
original packet, i.e. a data segment. Rather than copy the data directly from
the original packet into the output segment (which would impact performance
considerably), the second part of each output segment is an indirect mbuf,
which contains no actual data, but simply points to an offset within the
original packet.

The combination of the 'header' segment and the 'data' segment constitutes a
single logical output GSO segment of the original packet. This is illustrated
in :numref:`figure_gso-output-segment-format`.

.. _figure_gso-output-segment-format:

.. figure:: img/gso-output-segment-format.*
   :align: center

   Two-part GSO output segment

In one situation, the output segment may contain additional 'data' segments.
This only occurs when both of the following conditions hold:

- the input packet on which GSO is to be performed is represented by a
  multi-segment mbuf.

- the output segment is required to contain data that spans the boundaries
  between segments of the input multi-segment mbuf.

The GSO library traverses each segment of the input packet, and produces
numerous output segments; for optimal performance, the number of output
segments is kept to a minimum. Consequently, the GSO library maximizes the
amount of data contained within each output segment; i.e. each output segment
contains ``segsz`` bytes of data. The only exception to this is the very
final output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the final
segment is smaller than the rest.

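The arithmetic behind this rule can be sketched in plain C. This is an
illustrative model only, not library code: the helper names
``gso_model_nb_segs()`` and ``gso_model_last_seg_len()`` are hypothetical, and
``unit`` stands for the number of data bytes carried per output segment. How
``unit`` maps onto ``segsz`` (with or without the header length) is left as an
assumption here.

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical helper: number of output segments needed to carry
    * data_len bytes when each segment holds at most unit bytes.
    * Returns 0 for an invalid (zero) unit size.
    */
   static inline uint32_t
   gso_model_nb_segs(uint32_t data_len, uint32_t unit)
   {
       if (unit == 0)
           return 0;
       /* Ceiling division, written to avoid 32-bit overflow. */
       return data_len / unit + (data_len % unit != 0);
   }

   /* Hypothetical helper: number of bytes carried by the final output
    * segment. It is smaller than unit only when data_len is not a
    * multiple of unit.
    */
   static inline uint32_t
   gso_model_last_seg_len(uint32_t data_len, uint32_t unit)
   {
       uint32_t rem;

       if (unit == 0 || data_len == 0)
           return 0;
       rem = data_len % unit;
       return rem != 0 ? rem : unit;
   }

For instance, 64000 bytes of data with a unit of 1460 bytes gives 44 output
segments, the last of which carries 1220 bytes.
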
In order for an output segment to meet its MSS, it may need to include data from
multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
can point to only one direct mbuf), the solution here is to add another indirect
mbuf to the output segment; this additional segment then points to the next
input segment. If necessary, this chaining process is repeated, until the sum of
all of the data 'contained' in the output segment reaches ``segsz``. This
ensures that the amount of data contained within each output segment is uniform,
with the possible exception of the last segment, as previously described.

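The chaining rule can be modelled without any mbufs at all. The sketch below
is a simplified, hypothetical model (``gso_model_indirect_parts()`` is not a
library function): it takes the data length of each input segment and reports
how many indirect parts each output segment would need, assuming every output
segment is filled up to ``unit`` bytes and that one indirect part can
reference only one input segment.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical model: for each output segment, count the indirect
    * parts required to gather unit bytes from the input segments.
    * Writes one count per output segment into parts_out and returns
    * the number of output segments described (at most max_out).
    */
   static size_t
   gso_model_indirect_parts(const uint32_t *in_len, size_t nb_in,
                            uint32_t unit, uint32_t *parts_out,
                            size_t max_out)
   {
       size_t in_idx = 0, nb_out = 0;
       uint32_t in_rem = (nb_in > 0) ? in_len[0] : 0;

       if (unit == 0)
           return 0;

       while (nb_out < max_out) {
           uint32_t need = unit, parts = 0;

           while (need > 0) {
               /* Move on to the next input segment once this one is used up. */
               while (in_idx < nb_in && in_rem == 0) {
                   in_idx++;
                   if (in_idx < nb_in)
                       in_rem = in_len[in_idx];
               }
               if (in_idx >= nb_in)
                   break; /* no input data left */

               uint32_t take = in_rem < need ? in_rem : need;

               parts++; /* one indirect part per input segment touched */
               need -= take;
               in_rem -= take;
           }
           if (parts == 0)
               break; /* nothing left to place in a new output segment */
           parts_out[nb_out++] = parts;
       }
       return nb_out;
   }

With two input segments of 2000 bytes each and a unit of 1500 bytes, the model
reports three output segments needing 1, 2 and 1 indirect parts respectively;
the middle one, together with its header part, corresponds to a three-part
output segment.
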
:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
output segment. In this example, the output segment needs to include data from
the end of one input segment, and the beginning of another. To achieve this,
an additional indirect mbuf is chained to the second part of the output segment,
and is attached to the next input segment (i.e. it points to the data in the
next input segment).

.. _figure_gso-three-seg-mbuf:

.. figure:: img/gso-three-seg-mbuf.*
   :align: center

   Three-part GSO output segment

Supported GSO Packet Types
--------------------------

TCP/IPv4 GSO
~~~~~~~~~~~~
TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
may also contain an optional VLAN tag.

UDP/IPv4 GSO
~~~~~~~~~~~~
UDP/IPv4 GSO supports segmentation of suitably large UDP/IPv4 packets, which
may also contain an optional VLAN tag. UDP GSO is the same as IP fragmentation.
Specifically, UDP GSO treats the UDP header as a part of the payload and
does not modify it during segmentation. Therefore, after UDP GSO, only the
first output packet has the original UDP header; the others have only L2
and L3 headers.

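Since UDP GSO behaves like IP fragmentation, the layout of the output packets
can be sketched with the standard IPv4 fragmentation rules (fragment offsets
counted in 8-byte units, and the More Fragments flag set on every fragment but
the last). The structure and function below are hypothetical illustrations,
not the library's implementation; ``l3_payload_len`` is assumed to cover the
UDP header plus the UDP data, and ``unit`` is assumed to be a multiple of 8.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical description of one output packet of UDP GSO. */
   struct udp_gso_frag_model {
       uint16_t offset_8b;  /* IPv4 fragment offset, in 8-byte units */
       uint32_t len;        /* bytes of L3 payload in this fragment */
       bool more_frags;     /* More Fragments flag */
       bool has_udp_hdr;    /* only the first fragment carries the UDP header */
   };

   /* Hypothetical helper: describe fragment number idx (0-based).
    * Returns 0 on success, or -1 if unit is invalid or idx is past
    * the last fragment.
    */
   static int
   udp_gso_frag_model(uint32_t l3_payload_len, uint32_t unit, uint32_t idx,
                      struct udp_gso_frag_model *out)
   {
       uint32_t start;

       if (unit == 0 || (unit % 8) != 0)
           return -1;
       if ((uint64_t)idx * unit >= l3_payload_len)
           return -1;

       start = idx * unit;
       out->offset_8b = (uint16_t)(start / 8);
       out->len = (l3_payload_len - start < unit) ?
                  l3_payload_len - start : unit;
       out->more_frags = (start + out->len) < l3_payload_len;
       out->has_udp_hdr = (idx == 0);
       return 0;
   }

For a 4008-byte L3 payload (8-byte UDP header plus 4000 bytes of data) and a
unit of 1480 bytes, the model yields three fragments at offsets 0, 185 and 370
(in 8-byte units); only the first contains the UDP header, and only the last
has the More Fragments flag cleared.
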
VXLAN IPv4 GSO
~~~~~~~~~~~~~~
VXLAN IPv4 GSO supports segmentation of suitably large VXLAN packets,
which contain an outer IPv4 header, inner TCP/IPv4 or UDP/IPv4 headers, and
optional inner and/or outer VLAN tag(s).

GRE TCP/IPv4 GSO
~~~~~~~~~~~~~~~~
GRE GSO supports segmentation of suitably large GRE packets, which contain
an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.

How to Segment a Packet
-----------------------

To segment an outgoing packet, an application must:

#. First create a GSO context (``struct rte_gso_ctx``); this contains:

   - a pointer to the mbuf pool for allocating the direct buffers, which are
     used to store the GSO segments' packet headers.

   - a pointer to the mbuf pool for allocating indirect buffers, which are
     used to locate GSO segments' packet payloads.

     .. note::

       An application may use the same pool for both direct and indirect
       buffers. However, since indirect mbufs simply store a pointer, the
       application may reduce its memory consumption by creating a separate memory
       pool, containing smaller elements, for the indirect pool.


   - the size of each output segment, including packet headers and payload,
     measured in bytes.

   - the bit mask of required GSO types. The GSO library uses the same macros as
     those that describe a physical device's TX offloading capabilities (i.e.
     ``RTE_ETH_TX_OFFLOAD_*_TSO``) for ``gso_types``. For example, if an
     application wants to segment TCP/IPv4 packets, it should set ``gso_types``
     to ``RTE_ETH_TX_OFFLOAD_TCP_TSO``. Other values currently supported for
     ``gso_types`` include ``RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO`` and
     ``RTE_ETH_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
     allowed.

   - a flag that indicates whether the IPv4 headers of output segments should
     contain fixed or incremental ID values.

#. Set the appropriate ``ol_flags`` in the mbuf.

   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute to
     determine how a packet should be segmented. It is the application's
     responsibility to ensure that these flags are set.

   - For example, in order to segment TCP/IPv4 packets, the application should
     add the ``RTE_MBUF_F_TX_IPV4`` and ``RTE_MBUF_F_TX_TCP_SEG`` flags to the
     mbuf's ``ol_flags``.

   - If checksum calculation in hardware is required, the application should
     also add the ``RTE_MBUF_F_TX_TCP_CKSUM`` and ``RTE_MBUF_F_TX_IP_CKSUM`` flags.

#. Check if the packet should be processed. Packets with one of the
   following properties are not processed and are returned immediately:

   - Packet length is less than ``segsz`` (i.e. GSO is not required).

   - Packet type is not supported by the GSO library (see
     `Supported GSO Packet Types`_).

   - Application has not enabled GSO support for the packet type.

   - Packet's ``ol_flags`` have been incorrectly set.

#. Allocate space in which to store the output GSO segments. If the amount of
   space allocated by the application is insufficient, segmentation will fail.

#. Invoke the GSO segmentation API, ``rte_gso_segment()``.

#. Call ``rte_pktmbuf_free()`` to free the mbufs to which the segments
   produced by ``rte_gso_segment()`` are attached.

#. If required, update the L3 and L4 checksums of the newly-created segments.
   For tunneled packets, the outer IPv4 headers' checksums should also be
   updated. Alternatively, the application may offload checksum calculation
   to HW.

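As an illustration of the final step, the IPv4 header checksum of each output
segment can be recomputed in software with the standard Internet checksum
(RFC 1071). The function below is a minimal, self-contained sketch with a
hypothetical name; it assumes the checksum field of the header has been zeroed
beforehand, and it is not part of the GSO library.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical helper: one's-complement Internet checksum over an
    * IPv4 header given as raw bytes in network order. The checksum
    * field (bytes 10 and 11) must be zero when computing a new value.
    * The result is the 16-bit value whose high byte goes into byte 10.
    */
   static uint16_t
   ipv4_hdr_cksum_model(const uint8_t *hdr, size_t hdr_len)
   {
       uint32_t sum = 0;
       size_t i;

       /* Sum the header as big-endian 16-bit words. */
       for (i = 0; i + 1 < hdr_len; i += 2)
           sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
       if (i < hdr_len) /* odd trailing byte, padded with zero */
           sum += (uint32_t)hdr[i] << 8;

       /* Fold the carries back into the low 16 bits. */
       while (sum >> 16)
           sum = (sum & 0xffff) + (sum >> 16);

       return (uint16_t)~sum;
   }

The result would be written back as ``hdr[10] = cksum >> 8`` and
``hdr[11] = cksum & 0xff``. Running the same function over a header that
already holds a valid checksum returns 0, which gives a simple self-check. In
a DPDK application this job would more typically be delegated to the helpers
in ``rte_ip.h``, such as ``rte_ipv4_cksum()``, or to hardware offload.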